Preface
1. Introduction: What Is Data Science?
    Big Data and Data Science Hype
    Getting Past the Hype
    Why Now?
    Datafication
    The Current Landscape (with a Little History)
    Data Science Jobs
    A Data Science Profile
    Thought Experiment: Meta-Definition
    OK, So What Is a Data Scientist, Really?
    In Academia
    In Industry
2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
    Statistical Thinking in the Age of Big Data
    Statistical Inference
    Populations and Samples
    Populations and Samples of Big Data
    Big Data Can Mean Big Assumptions
    Modeling
    Exploratory Data Analysis
    Philosophy of Exploratory Data Analysis
    Exercise: EDA
    The Data Science Process
    A Data Scientist's Role in This Process
    Thought Experiment: How Would You Simulate Chaos?
    Case Study: RealDirect
    How Does RealDirect Make Money?
    Exercise: RealDirect Data Strategy
3. Algorithms
    Machine Learning Algorithms
    Three Basic Algorithms
    Linear Regression
    k-Nearest Neighbors (k-NN)
    k-means
    Exercise: Basic Machine Learning Algorithms
    Solutions
    Summing It All Up
    Thought Experiment: Automated Statistician
4. Spam Filters, Naive Bayes, and Wrangling
    Thought Experiment: Learning by Example
    Why Won't Linear Regression Work for Filtering Spam?
    How About k-nearest Neighbors?
    Naive Bayes
    Bayes Law
    A Spam Filter for Individual Words
    A Spam Filter That Combines Words: Naive Bayes
    Fancy It Up: Laplace Smoothing
    Comparing Naive Bayes to k-NN
    Sample Code in bash
    Scraping the Web: APIs and Other Tools
    Jake's Exercise: Naive Bayes for Article Classification
    Sample R Code for Dealing with the NYT API
5. Logistic Regression
    Thought Experiments
    Classifiers
    Runtime
    You
    Interpretability
    Scalability
    M6D Logistic Regression Case Study
    Click Models
    The Underlying Math
6. Time Stamps and Financial Modeling
7. Extracting Meaning from Data
8. Recommendation Engines: Building a User-Facing Data Product at Scale
9. Data Visualization and Fraud Detection
10. Social Networks and Data Journalism
11. Causality
12. Epidemiology
13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
14. Data Engineering: MapReduce, Pregel, and Hadoop
15. The Students Speak
16. Next-Generation Data Scientists, Hubris, and Ethics
Index