Probability and Statistics in Computing
Basic Information
Time and venue: MW 1130-1300, A-201.
Please email me if you would like to be added to the mailing list for this class.
Some material from an earlier iteration of this course is available here.
Homeworks
Lectures
2023-05-03 (#30). Class will be in A-238.
Complete proof of the main transportation cost ingredient from the previous class.
2023-05-01 (#29). Class rescheduled to 0915 due to department seminar at 1100.
Proof of the main transportation cost ingredient for the extended bounded difference inequality (contd.).
See [BLM, Chapters 4 and 8] and also [S00].
See also Himanshu Tyagi’s class notes on concentration inequalities at IISc.
2023-04-26. No class due to STCS PhD interviews. Make-up class on 2023-05-03.
2023-04-24 (#28). Proof of the convex distance inequality from the extended bounded difference inequality. Proof of the main transportation cost ingredient for the extended bounded difference inequality (contd.).
See [BLM, Chapters 4 and 8] and also [S00].
2023-04-19 (#27). Marton’s method for reducing concentration inequalities to transportation cost inequalities. Marton’s proof of the bounded differences inequality. Statements of Marton’s extended bounded difference inequality and Talagrand’s convex distance inequality.
2023-04-17 (#26). Variational representation of the KL-divergence. Pinsker’s inequality.
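A minimal numerical sanity check of Pinsker’s inequality, \(\mathrm{TV}(P,Q) \le \sqrt{\mathrm{KL}(P\|Q)/2}\), on a grid of Bernoulli pairs (the grid values are illustrative):

```python
import math

def kl_bernoulli(p, q):
    """KL divergence KL(Ber(p) || Ber(q)) in nats; assumes 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def tv_bernoulli(p, q):
    """Total variation distance between Ber(p) and Ber(q)."""
    return abs(p - q)

# Pinsker: TV(P, Q) <= sqrt(KL(P || Q) / 2), checked on a small grid.
for p in [0.1, 0.3, 0.5, 0.7]:
    for q in [0.2, 0.4, 0.6, 0.8]:
        assert tv_bernoulli(p, q) <= math.sqrt(kl_bernoulli(p, q) / 2) + 1e-12
print("Pinsker holds on the grid")
```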
2023-04-12 (#25). Applications of information theory. Combinatorial: Loomis-Whitney inequality. Statistical inference: Fano’s inequality and average error in hypothesis testing. Fisher information and the Cramér-Rao bound for unbiased estimators.
See also the section on Fano’s inequality in [Tsybakov, Chapter 2].
2023-04-10 (#24). A crash course in information theory: Sanov’s theorem, mutual information, the chain rule, the data processing inequality. Combinatorial applications: Shearer’s lemma.
2023-04-05 (#23). An example of a multiple hypothesis testing lower bound: the stochastic block model.
See Section V of [ABH16] for a tighter argument for the stochastic block model lower bound.
2023-04-03 (#22). Recap of the two-hypothesis case. Multiple hypothesis testing lower bounds.
2023-03-29 (#21). Various distance measures: Bhattacharyya coefficient, Hellinger and KL divergences. Two-hypothesis testing lower bounds. Lower bounds for estimation. Example: distinguishing two Bernoullis.
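As a quick numerical companion to the two-Bernoulli example, the sketch below computes the Bhattacharyya coefficient and the squared Hellinger distance for a pair of nearby Bernoulli distributions (the parameter values are illustrative, not from the lecture):

```python
import math

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of Ber(p) and Ber(q)."""
    return math.sqrt(p * q) + math.sqrt((1 - p) * (1 - q))

def hellinger_sq(p, q):
    """Squared Hellinger distance: H^2 = 1 - Bhattacharyya coefficient."""
    return 1.0 - bhattacharyya(p, q)

p, q = 0.5, 0.6
print(bhattacharyya(p, q))  # close to 1: the two Bernoullis are hard to distinguish
print(hellinger_sq(p, q))   # small; roughly (p - q)^2 / 2 for nearby p, q
```

The small squared Hellinger distance reflects the heuristic that distinguishing Ber(p) from Ber(q) reliably needs on the order of \(1/H^2\) samples.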
2023-03-27 (#20). Brief recap. Introduction to minimax lower bounds. Bayes error for two-hypothesis testing.
2023-03-22. No class (Gudhi Padwa holiday).
2023-03-20 (#19). \(\epsilon\)-nets for function classes with small VC-dimension bound, and consequent Rademacher complexity upper bounds.
2023-03-15 (#18). The VC-lemma on effective size of function classes with small VC-dimension. Rademacher complexity.
2023-03-13 (#17). VC-dimension: basic examples. Bounding the difference between empirical vs distributional optima for function classes with small VC-dimension. The symmetrization trick.
2023-03-08 (#16). Proof of Dudley’s inequality. Examples of other ways of constructing \(\epsilon\)-nets: bounded polytopes with a small number of vertices (done on the board: not in transcript).
2023-03-06 (#15). Concentration for empirical processes. The need for multi-scale arguments. Informal discussion of the technique of chaining. Dudley’s inequality.
2023-03-01 (#14). Gaussian comparison inequalities (Gordon, without proof). Application to expected minimum singular values of random matrices with Gaussian entries. A brief sketch of the use of Gaussian interpolation and Stein’s operator for the normal distribution (“Gaussian integration by parts”) in the proof of Slepian-like comparison inequalities (some final heuristic calculations were done on the board and are not in the transcripts below).
2023-02-27 (#13). Gaussian comparison inequalities (Slepian/Sudakov-Fernique, without proofs). Application to expected maximum singular values of random matrices with Gaussian entries.
2023-02-22 (#12). Review of \(\epsilon\)-nets. Gaussian processes as an alternative approach for Gaussian-based random design matrices: preliminaries. Gaussian isoperimetry (without proof) and concentration.
2023-02-20 (#11). \(\epsilon\)-nets for understanding properties (including the RI property) of random design matrices. General discussion of \(\epsilon\)-nets.
2023-02-15 (#10). Sparse recovery in the noisy setting: \(\ell_1\)-minimization and “LASSO” formulations and their analysis.
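As a self-contained illustration of the LASSO objective from this lecture, the sketch below minimises \(\tfrac{1}{2}\|Ax-y\|^2 + \lambda\|x\|_1\) by proximal gradient descent (ISTA); the problem sizes, regularisation level, and solver are my own illustrative choices, not the analysis done in class:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (entrywise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, iters=500):
    """Minimise 0.5 * ||Ax - y||^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x

rng = np.random.default_rng(0)
n, d, k = 60, 200, 5                        # measurements, dimension, sparsity
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[:k] = 1.0
y = A @ x_true + 0.01 * rng.standard_normal(n)
x_hat = ista(A, y, lam=0.02)
print(np.linalg.norm(x_hat - x_true))       # should be small if recovery succeeds
```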
2023-02-13 (#9). Recap of restricted subspace property. Derivation of the restricted subspace (RS) property from the restricted isometry (RI) property. Setup for sparse recovery in the noisy setting.
2023-02-08 (#8). Putting the pieces together for clustering. Discussion on how to get around the problem of the data itself being used to create the projections (not in transcript). Recovery of sparse signals in the noiseless setting. The restricted subspace property (not in transcript).
2023-02-06 (#7). Using \(\epsilon\)-nets for union bounds over large sets.
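A minimal sketch of the greedy \(\epsilon\)-net construction (a greedy maximal packing, which is automatically a net), run here on random points of the unit circle; the point count and \(\epsilon\) are illustrative:

```python
import numpy as np

def greedy_eps_net(points, eps):
    """Greedily pick centres so that every point is within eps of some centre."""
    net = []
    for p in points:
        if all(np.linalg.norm(p - c) > eps for c in net):
            net.append(p)
    return net

rng = np.random.default_rng(3)
# Random points on the unit circle, i.e. an approximate eps-net of S^1.
pts = rng.standard_normal((2000, 2))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
net = greedy_eps_net(pts, eps=0.25)
# The standard volume bound gives |net| <= (1 + 2/eps)^d = 81 for d = 2.
print(len(net))
```

Since the chosen centres are pairwise more than \(\epsilon\) apart, the volume argument from class bounds the net’s size.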
2023-02-01 (#6). Concentration results for the problem of finding the best-fit space (contd.). \(\epsilon\)-nets.
2023-01-30 (#5). Concentration results for the problem of finding the best-fit space.
- See also Vempala and Wang, JCSS 80(4), 2004.
2023-01-25 (#4). Singular value decomposition (SVD).
- See also Vempala and Wang, JCSS 80(4), 2004.
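A small numerical sketch of the SVD-based dimension reduction from this lecture: the top right-singular vectors of the (centred) data matrix span the best-fit subspace, and projecting onto it separates a two-component spherical Gaussian mixture. The mixture parameters below are illustrative, not from class:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 50, 400
# Two well-separated spherical Gaussian clusters in d dimensions.
centers = np.stack([np.zeros(d), np.ones(d)])
labels = rng.integers(0, 2, size=n)
X = centers[labels] + rng.standard_normal((n, d))

# SVD of the centred data; the top right-singular vectors span the
# best-fit subspace minimising total squared distance from the rows.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
proj = Xc @ Vt[:1].T        # project onto the best-fit line (k = 1)

# The 1-D projection should largely separate the two clusters
# (sign of the singular vector is arbitrary, hence the absolute value).
print(abs(np.corrcoef(proj.ravel(), labels)[0, 1]))
```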
2023-01-23 (#3). Distance based clustering for identifying mixtures of (well-separated) spherical Gaussians. Dimension reduction to the best fit space.
- See also Vempala and Wang, JCSS 80(4), 2004.
2023-01-18 (#2). The normal distribution in high dimensions. Distance based clustering for identifying mixtures of (well-separated) spherical Gaussians.
2023-01-16 (#1). Preliminaries 1: Some basic statistical terminology. Hypothesis testing and the Neyman-Pearson lemma. Discussion of the geometry of the sphere in high dimension.
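The high-dimensional geometry discussed in lecture #1 can be seen in a short simulation: the norm of a standard Gaussian vector in \(d\) dimensions concentrates tightly around \(\sqrt{d}\), with spread that stays bounded as \(d\) grows (sample sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
for d in [10, 100, 1000]:
    X = rng.standard_normal((10000, d))
    norms = np.linalg.norm(X, axis=1)
    # Norms concentrate near sqrt(d); the *relative* spread shrinks with d.
    print(d, norms.mean() / np.sqrt(d), norms.std())
```

The standard deviation of the norm stays around a constant (about \(1/\sqrt{2}\)) while the mean grows like \(\sqrt{d}\), so in high dimension the Gaussian mass lives in a thin spherical shell.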
References
[BHK] Foundations of Data Science. A. Blum, J. E. Hopcroft, R. Kannan. Hindustan Book Agency.
[BLM] Concentration Inequalities. S. Boucheron, G. Lugosi, P. Massart. Oxford University Press.
[S00] Concentration of Measure Inequalities for Markov Chains and \(\Phi\)-Mixing Processes. Paul-Marie Samson. Annals of Probability, 28(1):416–461, January 2000.
[Tsybakov] Introduction to Nonparametric Estimation. Alexandre B. Tsybakov. Springer Series in Statistics, 2009.
[V] High Dimensional Probability. Roman Vershynin. Cambridge University Press.
[W] High Dimensional Statistics. Martin J. Wainwright. Cambridge University Press.