Libin Zhu
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
C Liu, L Zhu, M Belkin
Applied and Computational Harmonic Analysis 59, 85-116, 2022
Cited by 208
On the linearity of large non-linear models: when and why the tangent kernel is constant
C Liu, L Zhu, M Belkin
Advances in Neural Information Processing Systems 33, 15954-15964, 2020
Cited by 151
Toward a theory of optimization for over-parameterized systems of non-linear equations: the lessons of deep learning
C Liu, L Zhu, M Belkin
arXiv preprint arXiv:2003.00307, 2020
Cited by 89
Quadratic models for understanding neural network dynamics
L Zhu, C Liu, A Radhakrishnan, M Belkin
arXiv preprint arXiv:2205.11787, 2022
Cited by 15
Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
L Zhu, C Liu, A Radhakrishnan, M Belkin
arXiv preprint arXiv:2306.04815, 2023
Cited by 7
Restricted strong convexity of deep learning models with smooth activations
A Banerjee, P Cisneros-Velarde, L Zhu, M Belkin
arXiv preprint arXiv:2209.15106, 2022
Cited by 7
Transition to linearity of general neural networks with directed acyclic graph architecture
L Zhu, C Liu, M Belkin
Advances in Neural Information Processing Systems 35, 5363-5375, 2022
Cited by 5
Neural tangent kernel at initialization: linear width suffices
A Banerjee, P Cisneros-Velarde, L Zhu, M Belkin
Uncertainty in Artificial Intelligence, 110-118, 2023
Cited by 4
Transition to linearity of wide neural networks is an emerging property of assembling weak models
C Liu, L Zhu, M Belkin
arXiv preprint arXiv:2203.05104, 2022
Cited by 3
A note on Linear Bottleneck networks and their Transition to Multilinearity
L Zhu, P Pandit, M Belkin
arXiv preprint arXiv:2206.15058, 2022
Articles 1–10