QuartzNet: Deep automatic speech recognition with 1D time-channel separable convolutions S Kriman, S Beliaev, B Ginsburg, J Huang, O Kuchaiev, V Lavrukhin, ... ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 342 | 2020 |
Jasper: An end-to-end convolutional neural acoustic model J Li, V Lavrukhin, B Ginsburg, R Leary, O Kuchaiev, JM Cohen, H Nguyen, ... arXiv preprint arXiv:1904.03288, 2019 | 288 | 2019 |
NeMo: a toolkit for building AI applications using neural modules O Kuchaiev, J Li, H Nguyen, O Hrinchuk, R Leary, B Ginsburg, S Kriman, ... arXiv preprint arXiv:1909.09577, 2019 | 268 | 2019 |
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens R Valle, J Li, R Prenger, B Catanzaro ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 166 | 2020 |
Stochastic gradient methods with layer-wise adaptive moments for training of deep networks B Ginsburg, P Castonguay, O Hrinchuk, O Kuchaiev, V Lavrukhin, R Leary, ... arXiv preprint arXiv:1905.11286, 2019 | 103 | 2019 |
Training neural speech recognition systems with synthetic speech augmentation J Li, R Gadde, B Ginsburg, V Lavrukhin arXiv preprint arXiv:1811.00707, 2018 | 70 | 2018 |
Mixed-precision training for NLP and speech recognition with OpenSeq2Seq O Kuchaiev, B Ginsburg, I Gitman, V Lavrukhin, J Li, H Nguyen, C Case, ... arXiv preprint arXiv:1805.10387, 2018 | 51 | 2018 |
SpeakerNet: 1D depth-wise separable convolutional network for text-independent speaker recognition and verification NR Koluguri, J Li, V Lavrukhin, B Ginsburg arXiv preprint arXiv:2010.12653, 2020 | 42 | 2020 |
Cross-language transfer learning, continuous learning, and domain adaptation for end-to-end automatic speech recognition J Huang, O Kuchaiev, P O'Neill, V Lavrukhin, J Li, A Flores, G Kucsko, ... arXiv preprint arXiv:2005.04290, 2020 | 27 | 2020 |
SALM: Speech-augmented language model with in-context learning for speech recognition and translation Z Chen, H Huang, A Andrusenko, O Hrinchuk, KC Puvvada, J Li, S Ghosh, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 24 | 2024 |
ACE-VC: Adaptive and controllable voice conversion using explicitly disentangled self-supervised speech representations S Hussain, P Neekhara, J Huang, J Li, B Ginsburg ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 18 | 2023 |
Adapting TTS models for new speakers using transfer learning P Neekhara, J Li, B Ginsburg arXiv preprint arXiv:2110.05798, 2021 | 10 | 2021 |
Training deep networks with stochastic gradient normalized by layerwise adaptive second moments B Ginsburg, P Castonguay, O Hrinchuk, O Kuchaiev, V Lavrukhin, R Leary, ... | 10 | 2019 |
Improving robustness of LLM-based speech synthesis by learning monotonic alignment P Neekhara, S Hussain, S Ghosh, J Li, R Valle, R Badlani, B Ginsburg arXiv preprint arXiv:2406.17957, 2024 | 2 | 2024 |