Ashish Vaswani
Startup
Verified email at fastmail.com
Title · Cited by · Year
Attention is all you need
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
Advances in neural information processing systems 30, 2017
Cited by 117865 · 2017
Relational inductive biases, deep learning, and graph networks
PW Battaglia, JB Hamrick, V Bapst, A Sanchez-Gonzalez, V Zambaldi, ...
arXiv preprint arXiv:1806.01261, 2018
Cited by 3406 · 2018
Self-attention with relative position representations
P Shaw, J Uszkoreit, A Vaswani
arXiv preprint arXiv:1803.02155, 2018
Cited by 2278 · 2018
Image transformer
N Parmar, A Vaswani, J Uszkoreit, L Kaiser, N Shazeer, A Ku, D Tran
International conference on machine learning, 4055-4064, 2018
Cited by 1795 · 2018
Attention augmented convolutional networks
I Bello, B Zoph, A Vaswani, J Shlens, QV Le
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by 1206 · 2019
Stand-alone self-attention in vision models
P Ramachandran, N Parmar, A Vaswani, I Bello, A Levskaya, J Shlens
Advances in neural information processing systems 32, 2019
Cited by 1184 · 2019
Attention is all you need. arXiv 2017
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
arXiv preprint arXiv:1706.03762, 2023
Cited by 1019 · 2023
Bottleneck transformers for visual recognition
A Srinivas, TY Lin, N Parmar, J Shlens, P Abbeel, A Vaswani
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by 1017 · 2021
Tensor2tensor for neural machine translation
A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ...
arXiv preprint arXiv:1803.07416, 2018
Cited by 606 · 2018
Attention is all you need (2017)
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
arXiv preprint arXiv:1706.03762, 2019
Cited by 503 · 2019
Efficient content-based sparse attention with routing transformers
A Roy, M Saffar, A Vaswani, D Grangier
Transactions of the Association for Computational Linguistics 9, 53-68, 2021
Cited by 476 · 2021
One model to learn them all
L Kaiser, AN Gomez, N Shazeer, A Vaswani, N Parmar, L Jones, ...
arXiv preprint arXiv:1706.05137, 2017
Cited by 380 · 2017
Scaling local self-attention for parameter efficient visual backbones
A Vaswani, P Ramachandran, A Srinivas, N Parmar, B Hechtman, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021
Cited by 374 · 2021
Learning whom to trust with MACE
D Hovy, T Berg-Kirkpatrick, A Vaswani, E Hovy
Proceedings of the 2013 Conference of the North American Chapter of the …, 2013
Cited by 374 · 2013
Mesh-tensorflow: Deep learning for supercomputers
N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ...
Advances in neural information processing systems 31, 2018
Cited by 361 · 2018
Decoding with large-scale neural language models improves translation
A Vaswani, Y Zhao, V Fossum, D Chiang
Proceedings of the 2013 conference on empirical methods in natural language …, 2013
Cited by 297 · 2013
Attention is all you need. CoRR abs/1706.03762 (2017)
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
Cited by 264 · 2017
Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, et al., eds
A Vaswani, N Shazeer, N Parmar
Advances in Neural Information Processing Systems 30, 5998-6008
Cited by 252
Relational inductive biases, deep learning, and graph networks. arXiv 2018
PW Battaglia, JB Hamrick, V Bapst, A Sanchez-Gonzalez, V Zambaldi, ...
arXiv preprint arXiv:1806.01261, 2018
Cited by 171 · 2018
Proceedings of the 31st international conference on neural information processing systems
A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, ...
Curran Associates Inc., Red Hook, NY, USA, 2017
Cited by 165 · 2017
Articles 1–20