## Essentials

• Time & Place. MW 3:30-4:45, Siebel 1214.

• Instructor. Matus Telgarsky (Office hours: Siebel 3212, W 4:45-6:00).

• TA. Yucheng Chen (Office hours: Siebel 2107, M 5-6, extended to M 5-7 when homework is due).

• Evaluation. Homework is 80% of your grade and the project is 20%. All handed-in work must be $$\LaTeX$$-compiled, on time, and submitted through gradescope (self-enrollment code 9GEERY). Homework details and project details appear below.

• Academic integrity. Cheating in this class wastes everyone’s time; just take something else instead. Please see the full information below.

• Discussion and announcements. piazza, here’s the signup link.

## Schedule

• Notes are typed when possible, otherwise handwritten notes are scanned.

• Schedule for future lectures is imprecise.

| Date | Topics | Notes | Coursework |
|---|---|---|---|
| 8/28 | Administrivia; perceptron. | pdf | hw0 out: tex, pdf |
| 8/30 | Perceptron; decomposition of learning problems. | pdf | |
| | **Representation.** | | |
| 9/6 | Failure of linear; box apx (linear over boxes, decision trees). | pdf | hw0 due! |
| 9/11 | End of box apx: boosted decision trees, branching programs, 3-layer ReLU nets. Start of poly-fit: Stone-Weierstrass! | pdf | |
| 9/13 | Polynomial fit via Stone-Weierstrass: sums of exponentials, RBF kernels, 2-layer networks. | pdf | |
| 9/18 | RKHS interlude. | pdf | |
| 9/20 | Succinct deep networks; multiplication with networks; networks and polynomials, smooth functions; Wasserstein distance, probability modeling, and GANs. | | |
| | **Optimization.** Convexity bootcamp; gradient descent in the smooth case; subgradient descent in the bounded+Lipschitz case; mirror descent and geometry; GLORIOUS MAUREY SPARSIFICATION; consistency of convex risk minimization; something non-convex; clustering. | | |
| | **Generalization.** Concentration bootcamp; finite classes and primitive covering; symmetrization and Rademacher complexity; Lipschitz losses, margin losses, finite classes; full covering, Dudley, and Sudakov; linear predictors via covers and Rademacher; VC dimension and VC for neural networks; Rademacher and covering bounds for neural networks. | | |
| | **Miscellaneous.** Depends on how much time remains! I hope: heavy tails; online learning; reinforcement learning; spectral methods. | | |

## Homework policies

• Homework 0 is 5% of your grade. It should be easy.
• Groups.
  • Homework 0 must be completed individually.
  • For homeworks 1-3, everyone must submit an individual, unique handin on gradescope. You may discuss with up to 3 people; state their NetIDs on page 1 of the handin.
• Submission.
  • Homeworks must pass through a $$\LaTeX$$ compiler.
  • I recommend lshort as a $$\LaTeX$$ tutorial and rudimentary reference.
  • Electronic submission only through gradescope (self-enrollment code 9GEERY).
  • No late homework. In exchange, homework is graded promptly (within 1 week).
  • Homework is due at 3:30pm on the day it is due.
• There is absolutely no reason to cheat in an optional grad class; please do not waste your time or my time and just drop instead.
• I prefer that you do not use outside resources. If you do, you must cite them, and you must still state everything in your own words.
• If we find possible cheating cases, we will immediately submit them to the department review board without fretting over it.

## Project policies

• Groups. You may work individually, or in pairs.

• Content. The project must contain a theoretical component; whether you include something else as well is up to you, but will not fundamentally affect the grade. Also, please focus on quality; if you can make something cleaner and shorter without removing information, that is preferred.

• Project themes. Here are some possible projects (concrete ideas will be sprinkled throughout the course):
  • A genuinely new, non-trivial theoretical result.
  • A clean-up of some complicated, confusing result; e.g., replacing a 20 page analysis with a 2 page analysis (in a way that isn’t obvious given other recent papers).
  • A survey which aims to unify and clarify relationships between the works it considers.

• Submission. You will both turn in a written report and give a presentation at the end of the semester.
  • The report must be at least 2 pages. This is vastly shorter than the homework; therefore, your submission should be high quality. A 2-page submission which consists of filler is not good.
  • Presentations consist of exactly 2 slides and take no more than 5 minutes; the first slide summarizes the topic, and the second says something interesting you encountered.
  • Projects are handed in on gradescope just like everything else.
  • The idea behind the submission is anti-busywork; it’s short, but should be good!

• Milestones. We’ll schedule meetings some time in October so that I can sanity check all projects.

## Resources

Other learning theory-ish classes. All of these courses are different, and all have good material, and there are many I neglected to include!

• Lieven Vandenberghe @ UCLA. This is not a learning theory course; it’s part 3 of a long optimization course, covering material not in the standard Boyd-Vandenberghe book. The lecture links are to slides; the proofs there are incredibly clean, and indeed this is my favorite resource for many of these methods.

Textbooks and surveys. Again, there are many others, but here are a few key ones.