Data science resources, from ebooks and blogs to raw datasets and analysis. Learn about data science resources, analysis, communities, and data management, as well as the openly available datasets and the dataset purchase program.

- Stack Overflow: Online community for programmers to learn, share their knowledge, and solve coding problems.
- GitHub: Online community for developers to learn, share, and collaborate on building software.
- Hashnode: Conversational community for software developers.
- Coderwall: Space for developers to connect, share, and build code.

- Mathematical and Statistical Methods for Data Science and Machine Learning. Call Number: Carlson Library Q325.5 .K76 2020; ISBN: 9781138492530; Publication Date: 2019-11-26
- Leadership in Statistics and Data Science. Call Number: E-Book; ISBN: 3030600599; Publication Date: 2021-03-23
- Machine Learning and Data Science. Call Number: E-Book; ISBN: 9781119776482; Publication Date: 2022-07-15
- Digital Political Participation, Social Networks and Big Data. Call Number: E-Book; ISBN: 9783030277567; Publication Date: 2019-10-01
- Data Science Ethics. Call Number: Carlson Library QA76.9.B45 M36 2022; ISBN: 9780192847263; Publication Date: 2022-06-24
- Ace the Data Science Interview. Call Number: Carlson Library HF5549.5.I6 H86 2022; ISBN: 9780578973838; Publication Date: 2021-01-01

- Learning Spark: Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates. Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You'll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
  - Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
  - Leverage Spark's powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
  - Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
  - Learn how to deploy interactive, batch, and streaming applications
  - Connect to data sources including HDFS, Hive, JSON, and S3
  - Master advanced topics like data partitioning and shared variables

  Call Number: E-Book; ISBN: 9781449358624; Publication Date: 2015-02-27
- Statistics for Data Science: Get your statistics basics right before diving into the world of data science.

  About This Book:
  - No need to take a degree in statistics; read this book and get a strong statistics base for data science and real-world programs
  - Implement statistics in data science tasks such as data cleaning, mining, and analysis
  - Learn all about probability, statistics, numerical computations, and more with the help of R programs

  Who This Book Is For: This book is intended for developers who want to enter the field of data science and are looking for concise information on statistics, with the help of insightful programs and simple explanations. Some basic hands-on experience with R will be useful.

  What You Will Learn:
  - Analyze the transition from a data developer to a data scientist mindset
  - Get acquainted with the R programs and the logic used for statistical computations
  - Understand mathematical concepts such as variance, standard deviation, probability, matrix calculations, and more
  - Learn to implement statistics in data science tasks such as data cleaning, mining, and analysis
  - Learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks
  - Get comfortable with performing various statistical computations for data science programmatically

  In Detail: Data science is an ever-evolving field, which is growing in popularity at an exponential rate. Data science includes techniques and theories extracted from the fields of statistics, computer science, and, most importantly, machine learning, databases, data visualization, and so on. This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable in using various statistical methods for data science tasks. It starts off with simple statistics and then moves on to statistical methods that are used in data science algorithms. The R programs for statistical computation are clearly explained along with their logic. You will come across various mathematical concepts, such as variance, standard deviation, probability, matrix calculations, and more. You will learn only what is required to implement statistics in data science tasks such as data cleaning, mining, and analysis. You will learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks. By the end of the book, you will be comfortable with performing various statistical computations for data science programmatically.

  Style and approach: Step-by-step, comprehensive guide with real-world examples.

  Call Number: E-Book; ISBN: 1788290674; Publication Date: 2017-11-17
- Data Science with Java. Call Number: E-Book; ISBN: 1491934069; Publication Date: 2017-11-17

- IEEE Xplore: Full text of IEEE and IET journals, magazines, transactions, and conference proceedings, as well as active IEEE standards. Also includes access to the IEEE eLearning Library.

- Data Science Journal: Published by the Committee on Data for Science and Technology (CODATA) of the International Council for Science (ICSU), this is a peer-reviewed journal covering a range of topics related to data and application systems.

- Linguistic Data Consortium: The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations, and government research laboratories. LDC was formed in 1992 to address the critical data shortage then facing language technology research and development. It maintains a collection of datasets, often large, for research in natural language processing, speech technology, and machine translation. The University of Rochester provides researchers with access to the LDC collection of datasets. To access the collection, users need to sign up for an account on the LDC page.

- Coursera.org: A collection of massive open online courses for self-paced learning from top names in education. Topics vary widely, and several focus on data science specializations, such as bioinformatics.
- Udacity: A website hosting hundreds of massive open online courses on a range of topics. As with Coursera, some courses are paid, but free courses are also available.
- EdX.org: EdX.org provides massive open online courses (MOOCs) on hundreds of topics and has a history of high-quality offerings. Founded by Harvard, MIT, and other high-capacity institutional partners, EdX.org continues to be run by academic institutions.
- DataCamp Podcast (DataFramed): A weekly podcast about what's new in a still-new field, typically featuring a lively conversation between the host, Hugo Bowne-Anderson, and a professional data scientist.
- LinkedIn Learning: Multiple self-paced data science courses, covering Tableau, Python, and general data science topics.