Probability and Statistics in Computing
Basic Information
Time and venue: MW 1130-1300, A-201.
Please email me if you would like to be added to the mailing list for this class.
Some material from an earlier iteration of this course is available here.
Homeworks
Lectures
2023-05-03 (#30). Class will be in A-238.
Complete proof of the main transportation cost ingredient from the previous class.
2023-05-01 (#29). Class rescheduled to 0915 due to department seminar at 1100.
Proof of the main transportation cost ingredient for the extended bounded difference inequality (contd.).
See [BLM, Chapters 4 and 8] and also [S00].
See also Himanshu Tyagi’s class notes on concentration inequalities at IISc.
2023-04-26. No class due to STCS PhD interviews. Make-up class on 2023-05-03.
2023-04-24 (#28). Proof of the convex distance inequality from the extended bounded difference inequality. Proof of the main transportation cost ingredient for the extended bounded difference inequality (contd.).
See [BLM, Chapters 4 and 8] and also [S00].
2023-04-19 (#27). Marton’s method for reducing concentration inequalities to transportation cost inequalities. Marton’s proof of the bounded differences inequality. Statements of Marton’s extended bounded difference inequality and Talagrand’s convex distance inequality.
2023-04-17 (#26). Variational representation of the KL-divergence. Pinsker’s inequality.
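A minimal numerical sanity check of Pinsker’s inequality, \(\mathrm{TV}(P,Q) \le \sqrt{\mathrm{KL}(P\|Q)/2}\), on a grid of Bernoulli pairs (the grid values are illustrative):

```python
import math

def kl_bernoulli(p, q):
    """KL divergence KL(Ber(p) || Ber(q)) in nats; assumes 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def tv_bernoulli(p, q):
    """Total variation distance between Ber(p) and Ber(q)."""
    return abs(p - q)

# Pinsker: TV(P, Q) <= sqrt(KL(P || Q) / 2), checked on a small grid.
for p in [0.1, 0.3, 0.5, 0.7]:
    for q in [0.2, 0.4, 0.6, 0.8]:
        assert tv_bernoulli(p, q) <= math.sqrt(kl_bernoulli(p, q) / 2) + 1e-12
print("Pinsker holds on the grid")
```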
2023-04-12 (#25). Applications of information theory. Combinatorial: Loomis-Whitney inequality. Statistical inference: Fano’s inequality and average error in hypothesis testing. Fisher information and the Cramér-Rao bound for unbiased estimators.
See also the section on Fano’s inequality in [Tsybakov, Chapter 2].
2023-04-10 (#24). A crash course in information theory: Sanov’s theorem, mutual information, the chain rule, the data processing inequality. Combinatorial applications: Shearer’s lemma.
2023-04-05 (#23). An example of a multiple hypothesis testing lower bound: the stochastic block model.
See Section V of [ABH16] for a tighter argument for the stochastic block model lower bound.
2023-04-03 (#22). Recap of the two-hypothesis case. Multiple hypothesis testing lower bounds.
2023-03-29 (#21). Various distance measures: Bhattacharyya coefficient, Hellinger and KL divergences. Two-hypothesis testing lower bounds. Lower bounds for estimation. Example: distinguishing two Bernoullis.
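As a quick numerical companion to the two-Bernoulli example, the sketch below computes the Bhattacharyya coefficient and the squared Hellinger distance for a pair of nearby Bernoulli distributions (the parameter values are illustrative, not from the lecture):

```python
import math

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of Ber(p) and Ber(q)."""
    return math.sqrt(p * q) + math.sqrt((1 - p) * (1 - q))

def hellinger_sq(p, q):
    """Squared Hellinger distance: H^2 = 1 - Bhattacharyya coefficient."""
    return 1.0 - bhattacharyya(p, q)

p, q = 0.5, 0.6
print(bhattacharyya(p, q))  # close to 1: the two Bernoullis are hard to distinguish
print(hellinger_sq(p, q))   # small; roughly (p - q)^2 / 2 for nearby p, q
```

The small squared Hellinger distance reflects the heuristic that distinguishing Ber(p) from Ber(q) reliably needs on the order of \(1/H^2\) samples.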
2023-03-27 (#20). Brief recap. Introduction to minimax lower bounds. Bayes error for two-hypothesis testing.
2023-03-22. No class (Gudhi Padwa holiday).
2023-03-20 (#19). \(\epsilon\)-nets for function classes with small VC-dimension bound, and consequent Rademacher complexity upper bounds.
2023-03-15 (#18). The VC-lemma on effective size of function classes with small VC-dimension. Rademacher complexity.
2023-03-13 (#17). VC-dimension: basic examples. Bounding the difference between empirical vs distributional optima for function classes with small VC-dimension. The symmetrization trick.
2023-03-08 (#16). Proof of Dudley’s inequality. Examples of other ways of constructing \(\epsilon\)-nets: bounded polytopes with a small number of vertices (done on the board: not in transcript).
2023-03-06 (#15). Concentration for empirical processes. The need for multi-scale arguments. Informal discussion of the technique of chaining. Dudley’s inequality.
2023-03-01 (#14). Gaussian comparison inequalities (Gordon, without proof). Application to expected minimum singular values of random matrices with Gaussian entries. A brief sketch of the use of Gaussian interpolation and Stein’s operator for the normal distribution (“Gaussian integration by parts”) in the proof of Slepian-like comparison inequalities (some final heuristic calculations were done on the board and are not in the transcripts below).
2023-02-27 (#13). Gaussian comparison inequalities (Slepian/Sudakov-Fernique, without proofs). Application to expected maximum singular values of random matrices with Gaussian entries.
2023-02-22 (#12). Review of \(\epsilon\)-nets. Gaussian processes as an alternative approach for Gaussian-based random design matrices: preliminaries. Gaussian isoperimetry (without proof) and concentration.
2023-02-20 (#11). \(\epsilon\)-nets for understanding properties (including the RI property) of random design matrices. General discussion of \(\epsilon\)-nets.
2023-02-15 (#10). Sparse recovery in the noisy setting: \(\ell_1\)-minimization and “LASSO” formulations and their analysis.
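As a self-contained illustration of the LASSO objective from this lecture, the sketch below minimises \(\tfrac{1}{2}\|Ax-y\|^2 + \lambda\|x\|_1\) by proximal gradient descent (ISTA); the problem sizes, regularisation level, and solver are my own illustrative choices, not the analysis done in class:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (entrywise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, iters=500):
    """Minimise 0.5 * ||Ax - y||^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - (A.T @ (A @ x - y)) / L, lam / L)
    return x

rng = np.random.default_rng(0)
n, d, k = 60, 200, 5                        # measurements, dimension, sparsity
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[:k] = 1.0
y = A @ x_true + 0.01 * rng.standard_normal(n)
x_hat = ista(A, y, lam=0.02)
print(np.linalg.norm(x_hat - x_true))       # should be small if recovery succeeds
```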
2023-02-13 (#9). Recap of restricted subspace property. Derivation of the restricted subspace (RS) property from the restricted isometry (RI) property. Setup for sparse recovery in the noisy setting.
2023-02-08 (#8). Putting the pieces together for clustering. Discussion on how to get around the problem of the data itself being used to create the projections (not in transcript). Recovery of sparse signals in the noiseless setting. The restricted subspace property (not in transcript).
2023-02-06 (#7). Using \(\epsilon\)-nets for union bounds over large sets.
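A minimal sketch of the greedy \(\epsilon\)-net construction (a greedy maximal packing, which is automatically a net), run here on random points of the unit circle; the point count and \(\epsilon\) are illustrative:

```python
import numpy as np

def greedy_eps_net(points, eps):
    """Greedily pick centres so that every point is within eps of some centre."""
    net = []
    for p in points:
        if all(np.linalg.norm(p - c) > eps for c in net):
            net.append(p)
    return net

rng = np.random.default_rng(3)
# Random points on the unit circle, i.e. an approximate eps-net of S^1.
pts = rng.standard_normal((2000, 2))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
net = greedy_eps_net(pts, eps=0.25)
# The standard volume bound gives |net| <= (1 + 2/eps)^d = 81 for d = 2.
print(len(net))
```

Since the chosen centres are pairwise more than \(\epsilon\) apart, the volume argument from class bounds the net’s size.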
2023-02-01 (#6). Concentration results for the problem of finding the best-fit space (contd.). \(\epsilon\)-nets.
2023-01-30 (#5). Concentration results for the problem of finding the best-fit space.
- See also Vempala and Wang, JCSS 80(4), 2004.
2023-01-25 (#4). Singular value decomposition (SVD).
- See also Vempala and Wang, JCSS 80(4), 2004.
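A small numerical sketch of the SVD-based dimension reduction from this lecture: the top right-singular vectors of the (centred) data matrix span the best-fit subspace, and projecting onto it separates a two-component spherical Gaussian mixture. The mixture parameters below are illustrative, not from class:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 50, 400
# Two well-separated spherical Gaussian clusters in d dimensions.
centers = np.stack([np.zeros(d), np.ones(d)])
labels = rng.integers(0, 2, size=n)
X = centers[labels] + rng.standard_normal((n, d))

# SVD of the centred data; the top right-singular vectors span the
# best-fit subspace minimising total squared distance from the rows.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
proj = Xc @ Vt[:1].T        # project onto the best-fit line (k = 1)

# The 1-D projection should largely separate the two clusters
# (sign of the singular vector is arbitrary, hence the absolute value).
print(abs(np.corrcoef(proj.ravel(), labels)[0, 1]))
```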
2023-01-23 (#3). Distance based clustering for identifying mixtures of (well-separated) spherical Gaussians. Dimension reduction to the best fit space.
- See also Vempala and Wang, JCSS 80(4), 2004.
2023-01-18 (#2). The normal distribution in high dimensions. Distance based clustering for identifying mixtures of (well-separated) spherical Gaussians.
2023-01-16 (#1). Preliminaries 1: Some basic statistical terminology. Hypothesis testing and the Neyman-Pearson lemma. Discussion of the geometry of the sphere in high dimension.
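The high-dimensional geometry discussed in lecture #1 can be seen in a short simulation: the norm of a standard Gaussian vector in \(d\) dimensions concentrates tightly around \(\sqrt{d}\), with spread that stays bounded as \(d\) grows (sample sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
for d in [10, 100, 1000]:
    X = rng.standard_normal((10000, d))
    norms = np.linalg.norm(X, axis=1)
    # Norms concentrate near sqrt(d); the *relative* spread shrinks with d.
    print(d, norms.mean() / np.sqrt(d), norms.std())
```

The standard deviation of the norm stays around a constant (about \(1/\sqrt{2}\)) while the mean grows like \(\sqrt{d}\), so in high dimension the Gaussian mass lives in a thin spherical shell.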
References
[BHK] Foundations of Data Science. A. Blum, J. E. Hopcroft, R. Kannan. Hindustan Book Agency.
[BLM] Concentration Inequalities. S. Boucheron, G. Lugosi, P. Massart. Oxford University Press.
[S00] Concentration of Measure Inequalities for Markov Chains and \(\Phi\)-Mixing Processes. Paul-Marie Samson. Annals of Probability, 28(1):416–461, January 2000.
[Tsybakov] Introduction to Nonparametric Estimation. Alexandre B. Tsybakov. Springer Series in Statistics, 2009.
[V] High Dimensional Probability. Roman Vershynin. Cambridge University Press.
[W] High Dimensional Statistics. Martin J. Wainwright. Cambridge University Press.