Greedy bi-criteria approximations for \(k\)-medians and \(k\)-means.
Daniel Hsu, Matus Telgarsky.
[arXiv]
- A natural greedy algorithm for \(k\)-medians and \(k\)-means using
\(\mathcal{O}(k\ln(1/\varepsilon))\)
centers and achieving approximation ratio
\(2+\varepsilon\) (or even better when the data is nice) efficiently,
or \(1+\varepsilon\) in general with randomly restarted descent methods
(\(n^{1/\varepsilon}\) restarts for \(k\)-means,
\(n^{\mathcal{O}(1/\varepsilon^2)}\) restarts for \(k\)-medians).
Benefits of Depth in Neural Networks.
Matus Telgarsky.
[arXiv]
[10min video]
- Conference on Learning Theory (COLT), 2016.
- For every positive integer \(k\), there exist deep neural networks with \(\Theta(k^3)\) layers and nodes
which cannot be approximated by any shallow network with \(\leq k\) layers and \(\leq 2^k\) nodes.
- See also this simplified version.
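The heart of the construction can be demonstrated numerically: composing the two-ReLU triangle ("tent") map with itself \(k\) times produces a function crossing \(1/2\) exponentially many times in \(k\), and a shallow network with few linear pieces cannot track all those oscillations. A small sketch (the simplified version uses essentially this map; the exponent bookkeeping here is illustrative):

```python
def tent(x):
    # triangle map; computable with two ReLU units: 2*relu(x) - 4*relu(x - 1/2)
    return 2 * min(x, 1 - x)

def iterate_tent(x, k):
    # k-fold composition: a deep network with O(k) nodes total
    for _ in range(k):
        x = tent(x)
    return x

def count_crossings(k, grid=4096):
    # sample at dyadic midpoints (avoids landing exactly on 1/2 for small k)
    # and count sign changes of iterate_tent(x, k) - 1/2: this grows as 2^k
    xs = [(2 * i + 1) / (2 * grid) for i in range(grid)]
    vals = [iterate_tent(x, k) for x in xs]
    return sum((vals[i] - 0.5) * (vals[i + 1] - 0.5) < 0
               for i in range(grid - 1))
```

Since a width-\(w\), depth-\(\ell\) ReLU network is piecewise linear with roughly \(w^\ell\) pieces, matching \(2^k\) crossings with \(\leq k\) layers forces exponential width.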
Convex Risk Minimization and Conditional Probability Estimation.
Matus Telgarsky, Miroslav Dudík, Robert Schapire.
[arXiv]
[short video]
[poster]
- Conference on Learning Theory (COLT), 2015.
- Even when the parameter space is ill-behaved (infinite-dimensional, minimizers
may fail to exist, no boundedness or regularization), risk minimization of certain standard
losses still converges to a unique object;
in the finite-dimensional case, uniform convergence (generalization) holds for empirical risk minimization.
Moment-based Uniform Deviation Bounds for \(k\)-means and Friends.
Matus Telgarsky, Sanjoy Dasgupta.
[pdf]
[arXiv]
[poster]
- Advances in Neural Information Processing Systems (NIPS), 2013.
- Generalization bounds for \(k\)-means cost and
Gaussian mixture log-likelihood for unbounded parameter sets
when the data has a few bounded moments
(no boundedness or further modeling assumptions needed).
Margins, Shrinkage, and Boosting.
Matus Telgarsky.
[arXiv]
[video]
- International Conference on Machine Learning (ICML), 2013.
- AdaBoost, with a variety of losses, attains optimal margins
by simply multiplying the step size by a small constant.
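A toy version of the shrinkage idea: run AdaBoost over decision stumps, but scale each step \(\alpha_t = \tfrac12\ln\frac{1-\varepsilon_t}{\varepsilon_t}\) down by a constant \(\eta\), then inspect the normalized margins \(y_i F(x_i)/\sum_t \alpha_t\). This is an illustrative implementation, not the paper's analysis:

```python
import math

def stump(t, s):
    # threshold classifier: predicts s if x > t, else -s
    return lambda x: s if x > t else -s

def adaboost_shrunk(xs, ys, rounds=200, eta=0.5):
    """AdaBoost over 1-d stumps with step size shrunk by eta; returns the
    normalized margins y_i * F(x_i) / sum_t alpha_t on the training set."""
    n = len(xs)
    w = [1.0 / n] * n
    cands = [stump(t - 0.5, s) for t in sorted(set(xs)) for s in (1, -1)]
    cands += [stump(max(xs) + 0.5, s) for s in (1, -1)]  # constant predictors
    ensemble = []
    for _ in range(rounds):
        # weak learner: stump with smallest weighted error
        h = min(cands, key=lambda h: sum(wi for wi, x, y in zip(w, xs, ys)
                                         if h(x) != y))
        eps = sum(wi for wi, x, y in zip(w, xs, ys) if h(x) != y)
        eps = min(max(eps, 1e-12), 0.5 - 1e-12)  # guard degenerate rounds
        alpha = eta * 0.5 * math.log((1 - eps) / eps)  # shrunken step
        ensemble.append((alpha, h))
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    total = sum(a for a, _ in ensemble)
    return [y * sum(a * h(x) for a, h in ensemble) / total
            for x, y in zip(xs, ys)]
```

On a small dataset that a single stump cannot separate, the shrunken variant still drives all margins positive; the paper's point is that small steps moreover push the minimum margin toward its optimal value.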
Agglomerative Bregman Clustering.
Matus Telgarsky, Sanjoy Dasgupta.
[pdf]
[short video]
- International Conference on Machine Learning (ICML), 2012.
- Provides the natural algorithm,
with attention to: handling degenerate clusters via smoothing,
Bregman divergences for nondifferentiable convex functions,
exponential families without minimality assumptions.
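The algorithmic core can be sketched with a standard identity: for a Bregman divergence generated by a convex \(\varphi\), the cost increase from merging clusters \(A\) and \(B\) is \(n_A\varphi(\mu_A) + n_B\varphi(\mu_B) - n_{AB}\varphi(\mu_{AB})\), so each cluster only needs a count and a coordinate sum. A minimal sketch with the negative-entropy generator (whose divergence is KL), using an additive smoothing constant as a stand-in for the paper's treatment of degenerate clusters:

```python
import math

def phi(p):
    # negative Shannon entropy; its Bregman divergence is KL divergence.
    # the tiny additive constant smooths zero coordinates (degenerate clusters)
    return sum(pi * math.log(pi + 1e-12) for pi in p)

def merge_cost(a, b):
    # a, b are clusters stored as (count, coordinate-sum); the merge cost is
    # n_a*phi(mu_a) + n_b*phi(mu_b) - (n_a+n_b)*phi(mu_ab), always >= 0
    (na, sa), (nb, sb) = a, b
    mean = lambda n, s: [si / n for si in s]
    sab = [x + y for x, y in zip(sa, sb)]
    return (na * phi(mean(na, sa)) + nb * phi(mean(nb, sb))
            - (na + nb) * phi(mean(na + nb, sab)))

def agglomerate(points, k):
    # repeatedly merge the pair of clusters with the cheapest merge cost
    clusters = [(1, list(p)) for p in points]
    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        (ni, si), (nj, sj) = clusters[i], clusters[j]
        merged = (ni + nj, [x + y for x, y in zip(si, sj)])
        clusters = [c for t, c in enumerate(clusters)
                    if t not in (i, j)] + [merged]
    return clusters
```

Swapping in \(\varphi(p) = \|p\|^2\) recovers Ward-style agglomeration; the paper develops the general case, including nondifferentiable \(\varphi\).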
Full list of papers.