Matus Telgarsky
Courant Institute of Mathematical Sciences, New York University
Verified email at nyu.edu
Title
Cited by
Year
Tensor decompositions for learning latent variable models
A Anandkumar, R Ge, DJ Hsu, SM Kakade, M Telgarsky
J. Mach. Learn. Res. 15 (1), 2773-2832, 2014
Cited by: 1314; Year: 2014
Spectrally-normalized margin bounds for neural networks
PL Bartlett, DJ Foster, MJ Telgarsky
Advances in neural information processing systems 30, 2017
Cited by: 1228; Year: 2017
Benefits of depth in neural networks
M Telgarsky
Conference on learning theory, 1517-1539, 2016
Cited by: 650; Year: 2016
Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis
M Raginsky, A Rakhlin, M Telgarsky
Conference on Learning Theory, 1674-1703, 2017
Cited by: 548; Year: 2017
The implicit bias of gradient descent on nonseparable data
Z Ji, M Telgarsky
Conference on learning theory, 1772-1798, 2019
Cited by: 299*; Year: 2019
Representation benefits of deep feedforward networks
M Telgarsky
arXiv preprint arXiv:1509.08101, 2015
Cited by: 262; Year: 2015
Gradient descent aligns the layers of deep linear networks
Z Ji, M Telgarsky
arXiv preprint arXiv:1810.02032, 2018
Cited by: 220; Year: 2018
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Z Ji, M Telgarsky
arXiv preprint arXiv:1909.12292, 2019
Cited by: 180; Year: 2019
Directional convergence and alignment in deep learning
Z Ji, M Telgarsky
Advances in Neural Information Processing Systems 33, 17176-17186, 2020
Cited by: 143; Year: 2020
Hartigan’s method: k-means clustering without Voronoi
M Telgarsky, A Vattani
Proceedings of the thirteenth international conference on artificial …, 2010
Cited by: 118; Year: 2010
Neural networks and rational functions
M Telgarsky
International Conference on Machine Learning, 3387-3393, 2017
Cited by: 90; Year: 2017
Margins, shrinkage, and boosting
M Telgarsky
International Conference on Machine Learning, 307-315, 2013
Cited by: 82; Year: 2013
Gradient descent follows the regularization path for general losses
Z Ji, M Dudík, RE Schapire, M Telgarsky
Conference on Learning Theory, 2109-2136, 2020
Cited by: 56; Year: 2020
Characterizing the implicit bias via a primal-dual analysis
Z Ji, M Telgarsky
Algorithmic Learning Theory, 772-804, 2021
Cited by: 49; Year: 2021
Neural tangent kernels, transportation mappings, and universal approximation
Z Ji, M Telgarsky, R Xian
arXiv preprint arXiv:1910.06956, 2019
Cited by: 48; Year: 2019
Agglomerative Bregman clustering
M Telgarsky, S Dasgupta
arXiv preprint arXiv:1206.6446, 2012
Cited by: 41; Year: 2012
Early-stopped neural networks are consistent
Z Ji, J Li, M Telgarsky
Advances in Neural Information Processing Systems 34, 1805-1817, 2021
Cited by: 33; Year: 2021
A Primal-Dual Convergence Analysis of Boosting
M Telgarsky, Y Singer
Journal of Machine Learning Research 13 (3), 2012
Cited by: 33; Year: 2012
Representational strengths and limitations of transformers
C Sanford, DJ Hsu, M Telgarsky
Advances in Neural Information Processing Systems 36, 2024
Cited by: 31; Year: 2024
Generalization bounds via distillation
D Hsu, Z Ji, M Telgarsky, L Wang
arXiv preprint arXiv:2104.05641, 2021
Cited by: 29; Year: 2021