Approximation/representation power of deep networks. I proved that there exist deep networks which shallower networks can approximate only if they have exponentially more nodes (arXiv, video, lecture notes one and two), and I continue to work on related questions (e.g., rational functions, and generative networks (with Bolton Bailey)).
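To give a feel for where the exponential gap comes from, here is a minimal numerical sketch of my own (not necessarily the exact construction in the papers or notes, and all names and grid choices below are illustrative): composing a two-ReLU "triangle" map with itself k times is a depth-k, width-2 ReLU network whose output has 2^k linear pieces, whereas a shallow ReLU network with few nodes is piecewise linear with far fewer pieces and so cannot track all the oscillations.

```python
import numpy as np

# Hedged sketch: a "triangle" map built from two ReLUs; composing it with
# itself k times (a depth-k, width-2 ReLU network) produces a sawtooth with
# 2**k linear pieces.  A shallow ReLU network with few nodes has few linear
# pieces, hence cannot follow all these oscillations.

def relu(x):
    return np.maximum(x, 0.0)

def triangle(x):
    # equals 2x on [0, 1/2] and 2 - 2x on [1/2, 1]
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def iterated_triangle(x, k):
    for _ in range(k):
        x = triangle(x)
    return x

# dyadic grid, so every breakpoint of the sawtooth lands exactly on a grid point
xs = np.linspace(0.0, 1.0, 2**12 + 1)
for k in (1, 3, 6):
    ys = iterated_triangle(xs, k)
    # count slope sign changes as a proxy for the number of linear pieces
    slopes = np.sign(np.diff(ys))
    pieces = 1 + int(np.count_nonzero(np.diff(slopes)))
    print(f"depth k={k}: about {pieces} linear pieces (2**k = {2**k})")
```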
Optimization and implicit regularization of deep networks. In grad school I studied AdaBoost, and found that taking the step size to 0 leads to margin maximization (arXiv). Ziwei Ji and I have been studying margin maximization for deep networks, first proving it for logistic regression (arXiv), and then for deep linear networks (arXiv).
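As a small numerical illustration of this implicit bias (a sketch of mine, not code from any of the papers; the data, step size, and iteration counts are arbitrary choices): running plain gradient descent on the unregularized logistic loss over linearly separable data makes the norm of the iterate diverge while the normalized margin min_i y_i⟨w, x_i⟩/||w|| keeps increasing, i.e., the direction of w drifts toward the maximum-margin separator.

```python
import numpy as np

# Hedged illustration: gradient descent on the logistic loss over linearly
# separable data.  The loss has no finite minimizer, ||w|| grows without bound,
# and the normalized margin min_i y_i <w, x_i> / ||w|| creeps upward, so the
# direction of w is implicitly biased toward the maximum-margin separator.

rng = np.random.default_rng(0)
n, d = 200, 2
X = rng.normal(size=(n, d))
y = np.sign(X @ np.array([1.0, 2.0]))   # labels from a fixed linear rule => separable

def stable_sigmoid(z):
    # numerically stable sigmoid, avoiding overflow for large |z|
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def grad_logistic(w):
    # gradient of (1/n) * sum_i log(1 + exp(-y_i <w, x_i>))
    margins = y * (X @ w)
    return -(X.T @ (y * stable_sigmoid(-margins))) / n

w = np.zeros(d)
step = 1.0
for t in range(1, 100001):
    w -= step * grad_logistic(w)
    if t in (10, 100, 1000, 10000, 100000):
        norm = np.linalg.norm(w)
        print(f"iter {t:6d}: ||w|| = {norm:9.3f}, "
              f"normalized margin = {np.min(y * (X @ w)) / norm:.4f}")
```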
My research is funded by an NSF CAREER award and an NVIDIA GPU grant.
During Summer 2019 I am co-organizing a Simons Institute summer program on deep learning; I was also at the Simons Institute during Spring 2017.