Contents

Biography
Preface

PART 1 INTRODUCTION

CHAPTER 1 Statistical Machine Learning
1.1 Types of Learning
1.2 Examples of Machine Learning Tasks
    1.2.1 Supervised Learning
    1.2.2 Unsupervised Learning
    1.2.3 Further Topics
1.3 Structure of This Textbook

PART 2 STATISTICS AND PROBABILITY

CHAPTER 2 Random Variables and Probability Distributions
2.1 Mathematical Preliminaries
2.2 Probability
2.3 Random Variable and Probability Distribution
2.4 Properties of Probability Distributions
    2.4.1 Expectation, Median, and Mode
    2.4.2 Variance and Standard Deviation
    2.4.3 Skewness, Kurtosis, and Moments
2.5 Transformation of Random Variables

CHAPTER 3 Examples of Discrete Probability Distributions
3.1 Discrete Uniform Distribution
3.2 Binomial Distribution
3.3 Hypergeometric Distribution
3.4 Poisson Distribution
3.5 Negative Binomial Distribution
3.6 Geometric Distribution

CHAPTER 4 Examples of Continuous Probability Distributions
4.1 Continuous Uniform Distribution
4.2 Normal Distribution
4.3 Gamma Distribution, Exponential Distribution, and Chi-Squared Distribution
4.4 Beta Distribution
4.5 Cauchy Distribution and Laplace Distribution
4.6 t-Distribution and F-Distribution

CHAPTER 5 Multidimensional Probability Distributions
5.1 Joint Probability Distribution
5.2 Conditional Probability Distribution
5.3 Contingency Table
5.4 Bayes’ Theorem
5.5 Covariance and Correlation
5.6 Independence

CHAPTER 6 Examples of Multidimensional Probability Distributions
6.1 Multinomial Distribution
6.2 Multivariate Normal Distribution
6.3 Dirichlet Distribution
6.4 Wishart Distribution

CHAPTER 7 Sum of Independent Random Variables
7.1 Convolution
7.2 Reproductive Property
7.3 Law of Large Numbers
7.4 Central Limit Theorem

CHAPTER 8 Probability Inequalities
8.1 Union Bound
8.2 Inequalities for Probabilities
    8.2.1 Markov’s Inequality and Chernoff’s Inequality
    8.2.2 Cantelli’s Inequality and Chebyshev’s Inequality
8.3 Inequalities for Expectation
    8.3.1 Jensen’s Inequality
    8.3.2 Hölder’s Inequality and Schwarz’s Inequality
    8.3.3 Minkowski’s Inequality
    8.3.4 Kantorovich’s Inequality
8.4 Inequalities for the Sum of Independent Random Variables
    8.4.1 Chebyshev’s Inequality and Chernoff’s Inequality
    8.4.2 Hoeffding’s Inequality and Bernstein’s Inequality
    8.4.3 Bennett’s Inequality

CHAPTER 9 Statistical Estimation
9.1 Fundamentals of Statistical Estimation
9.2 Point Estimation
    9.2.1 Parametric Density Estimation
    9.2.2 Nonparametric Density Estimation
    9.2.3 Regression and Classification
    9.2.4 Model Selection
9.3 Interval Estimation
    9.3.1 Interval Estimation for Expectation of Normal Samples
    9.3.2 Bootstrap Confidence Interval
    9.3.3 Bayesian Credible Interval

CHAPTER 10 Hypothesis Testing
10.1 Fundamentals of Hypothesis Testing
10.2 Test for Expectation of Normal Samples
10.3 Neyman-Pearson Lemma
10.4 Test for Contingency Tables
10.5 Test for Difference in Expectations of Normal Samples
    10.5.1 Two Samples without Correspondence
    10.5.2 Two Samples with Correspondence
10.6 Nonparametric Test for Ranks
    10.6.1 Two Samples without Correspondence
    10.6.2 Two Samples with Correspondence
10.7 Monte Carlo Test

PART 3 GENERATIVE APPROACH TO STATISTICAL PATTERN RECOGNITION

CHAPTER 11 Pattern Recognition via Generative Model Estimation
11.1 Formulation of Pattern Recognition
11.2 Statistical Pattern Recognition
11.3 Criteria for Classifier Training
    11.3.1 MAP Rule
    11.3.2 Minimum Misclassification Rate Rule
    11.3.3 Bayes Decision Rule
    11.3.4 Discussion
11.4 Generative and Discriminative Approaches

CHAPTER 12 Maximum Likelihood Estimation
12.1 Definition
12.2 Gaussian Model
12.3 Computing the Class-Posterior Probability
12.4 Fisher’s Linear Discriminant Analysis (FDA