## Essentials

• Time & Place. WF 11:00-12:15, Siebel 1214 (used to be 4405).

• Instructor. Matus Telgarsky (Office hours: Siebel 3212 (used to be 3336), M 5:00-7:00).

• Homework. 50% of your grade. Can work alone or in a group of size 2. Homework must be $$\LaTeX$$-compiled, and submitted through gradescope (self-enrollment code 9J8G59). Full details appear below.

• Project. 50% of your grade. Must contain some theoretical component. Full details appear below.

• Academic integrity. Cheating in this class wastes everyone’s time; just take something else. Please see the full information below.

• Discussion. Piazza; here’s the signup link.

• Feedback. This course is experimental; not just new material and a new instructor, but new notes written in a new way (markdown+latex). Please provide feedback!

## Schedule

Notes are posted following lectures; they will lack some of the intuition, pictures, and discussion found in class, but may have some more rigor. The schedule for future lectures is tentative (and perhaps too quick).

| Date | Topics | Notes | Coursework |
|---|---|---|---|
| 8/24 | Syllabus, philosophy, and a quick proof. | html, pdf. | hw$$0$$ out: html, pdf, md. |
| **Representation** | | | |
| 8/26 | Linear, linear with rich bases, lemma for next lecture. | html, pdf. | |
| 8/31 | Trees, boosted trees, branching programs, neural nets intro. | html, pdf. | hw$$0$$ due. |
| 9/2 | 3 layer networks, 2 layer networks. | html, pdf. | |
| 9/7 | Benefits of depth, part 1. | html, pdf. | |
| 9/9 | Benefits of depth, part 2. | html, pdf. | hw$$1$$ out: tex, pdf, bib. |
| **Optimization** | | | |
| 9/14 | Convexity bootcamp part 1: basic objects. | html, pdf. | pm$$0$$ due. |
| 9/16 | Convexity bootcamp part 2: duality. | html, pdf. | |
| 9/21 | SVM basics, representer theorem. | html, pdf. | |
| 9/23 | SVM recap, convex opt overview, Frank-Wolfe. | html, pdf. | |
| 9/28 | No class! Go to Allerton! | | |
| 9/30 | Smoothness and steepest descent. | html, pdf. | |
| 10/5 | Smoothness and steepest descent, part 2; AdaBoost intro. | html, pdf. | hw$$1$$ due. |
| 10/7 | AdaBoost (from the steepest descent perspective). | html, pdf. | |
| 10/12 | Consistency of convex risk minimization, part 1. | html, pdf. | |
| 10/14 | Consistency of convex risk minimization, part 2. | html, pdf. | |
| 10/19 | Clustering bootcamp. | html, pdf. | |
| **Generalization** | | | |
| 10/21 | Measure concentration bootcamp. | html, pdf. | |
| 10/26 | Finite classes, primitive covering numbers. | html, pdf. | |
| 10/28 | Symmetrization and Rademacher complexity. | html, pdf. | |
| 11/2 | Rademacher complexity properties: Lipschitz losses and finite classes. | html, pdf. | hw$$2$$ out: tex, pdf. |
| 11/4 | Rademacher complexity properties: VC dimension and margins. | html, pdf. | |
| 11/9 | Covering and Rademacher bounds for neural networks. | html, pdf. | |
| 11/11 | VC dimension of linear threshold networks. | html, pdf. | |
| 11/16 | VC dimension of ReLU networks. | html, pdf. | |
| **Miscellaneous** | | | |
| 11/18 | Fast rates. | | hw$$2$$ due. hw$$3$$ out: tex, pdf. |
| 11/30 | Non-convex gradient descent guarantees. | | |
| 12/2 | The Kolmogorov-Arnol’d Theorem. | | |
| 12/7 | Class cancelled. | | |
| **Final presentations and homework** | | | |
| 12/7 | | | Due in gradescope by $$11$$am: project writeup and slides. |
| 12/8 | Final presentations, 12-4pm, Siebel 3403. | | |
| 12/14 | | | hw$$3$$ due. |
| **Omitted material** | Learning despite heavy tails. Consistency of boosting. Sparse recovery and the LASSO. Active learning. Spectral methods. Online mirror descent. SVMs and representation: universal kernels. | | |

## Homework policies

• Homework 0 only counts for 1 point, awarded for any reasonable attempt. This homework is to calibrate both you and me.
• Other homeworks have 1 or 2 problems graded for ~10 points; other problems receive a checkmark for reasonable attempts.
• Lowest full homework grade is dropped.
• There will be 3-4 full homeworks. (Most likely 3.)
• Groups.
• Homework 0 must be completed individually.
• For homeworks 1-4 you may work alone, or in a pair.
• Please cite any other resource/discussion you use.
• Submission.
• Homeworks must pass through a $$\LaTeX$$ compiler. If you wish, use a markdown+latex compiler (e.g., pandoc).
• I recommend lshort as a $$\LaTeX$$ tutorial and rudimentary reference.
• Electronic submission only through gradescope (self-enrollment code 9J8G59).
• No late homework. In exchange, I will grade promptly (before the next class).
• Homework is due before class (11am) on the day it is due.
• There is absolutely no reason to cheat in an optional grad class; please do not waste your time or my time and just drop instead.
• Cite any resource you use.
• For further guidelines, please see Jeff Erickson’s comments (which don’t apply verbatim, but you get the idea).

## Project policies

• Content. The project must contain a theoretical component; whether you include something else as well is up to you, but will not fundamentally affect the grade. Also, please focus on quality; if you can make something cleaner and shorter without removing information, that is preferred.

• Project themes. Here are some possible projects (concrete ideas will be sprinkled throughout the course):
• A genuinely new, non-trivial theoretical result.
• A clean-up of some complicated, confusing result; e.g., replacing a 20 page analysis with a 2 page analysis (in a way that isn’t obvious given other recent papers).
• A survey which aims to unify and clarify relationships between the works it considers.
• Submission. You will turn in a written report and give a presentation on December 8.
• The report must be at least 2 pages. This is vastly shorter than the homework; therefore, your submission should be high quality. A 2-page submission which consists of filler is not good.
• Presentations consist of exactly 2 slides and take no more than 5 minutes; the first summarizes the topic, the second says something interesting you encountered.
• Projects are handed in on gradescope just like everything else.
• The idea behind the submission is anti-busywork; it’s short, but should be good!
• Milestones. There will be a few deadlines for the project.
• Project milestone 0, due September 14.
• You must hand in a list of three project suggestions. For each suggestion, write 1-5 sentences, including the “theme” of the project, and include at least one reference. Here is example tex and example pdf.