Mohammad Gheshlaghi Azar

Cited by

	All	Since 2019
Citations	12878	12368
h-index	27	26
i10-index	34	33

Year	2015	2016	2017	2018	2019	2020	2021	2022	2023	2024
Citations	41	52	76	263	565	841	1811	2970	3693	2473

Public access

3 articles available, 0 articles not available (based on funding mandates)

Co-authors

Rémi Munos	Google DeepMind	Verified email at inria.fr
Bilal Piot	Google Deepmind	Verified email at google.com
Michal Valko	Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind	Verified email at meta.com
Zhaohan Daniel Guo	DeepMind	Verified email at google.com
Florent Altché	Research Engineer, DeepMind	Verified email at google.com
Jean-bastien Grill		Verified email at google.com
Corentin Tallec	DeepMind	Verified email at google.com
Hado van Hasselt	Research Scientist, DeepMind; Honorary Professor, UCL	Verified email at google.com
Florian STRUB	Cohere	Verified email at cohere.com
Pierre Richemond	Google DeepMind	Verified email at deepmind.com
Hilbert Johan Kappen	Radboud University	Verified email at science.ru.nl
Will Dabney	DeepMind	Verified email at google.com
Elena Buchatskaya	Research Engineer, Google DeepMind	Verified email at google.com
Matteo Hessel	Research Engineer, Google DeepMind	Verified email at google.com
Dan Horgan	Google DeepMind	Verified email at google.com
Eva L. Dyer	Georgia Institute of Technology	Verified email at gatech.edu
Mark Rowland	Research Scientist, Google DeepMind	Verified email at google.com
Shantanu Thakoor	Research Engineer at DeepMind	Verified email at google.com
Carl Doersch	Google DeepMind	Verified email at google.com
Tom Schaul	Senior Staff Scientist, DeepMind	Verified email at nyu.edu

Mohammad Gheshlaghi Azar

Cohere

Verified email at google.com

RL for Generative AI, Self-Supervised Learning, Exploration, Optimization


Title	Authors	Publication	Cited by	Year
Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	6108	2020
Rainbow: Combining improvements in deep reinforcement learning	M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ...	Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	2641	2018
Minimax regret bounds for reinforcement learning	MG Azar, I Osband, R Munos	International conference on machine learning, 263-272, 2017	825	2017
koray kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent-a new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	455	2020
Large-scale representation learning on graphs via bootstrapping	S Thakoor, C Tallec, MG Azar, M Azabou, EL Dyer, R Munos, P Veličković, ...	arXiv preprint arXiv:2102.06514, 2021	395*	2021
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model	M Gheshlaghi Azar, R Munos, HJ Kappen	Machine learning 91, 325-349, 2013	296	2013
Speedy Q-Learning	MG Azar, M Ghavamzadeh, HJ Kappen, R Munos	Advances in Neural Information Processing Systems, 2411-2419, 2011	210*	2011
The reactor: A fast and sample-efficient actor-critic agent for reinforcement learning	A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos	arXiv preprint arXiv:1704.04651, 2017	169*	2017
A general theoretical paradigm to understand learning from human preferences	MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello	International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	159	2024
Dynamic Policy Programming	M Gheshlaghi Azar, V Gomez, HJ Kappen	Journal of Machine Learning Research 13, 3207-3245, 2012	151	2012
Bootstrap latent-predictive representations for multitask reinforcement learning	ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar	International Conference on Machine Learning, 3875-3886, 2020	145	2020
Observe and look further: Achieving consistent performance on atari	T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...	arXiv preprint arXiv:1805.11593, 2018	137	2018
Sequential transfer in multi-armed bandit with finite set of models	MG Azar, A Lazaric, E Brunskill	Advances in Neural Information Processing Systems, 2220-2228, 2013	119	2013
On the sample complexity of reinforcement learning with a generative model	MG Azar, R Munos, B Kappen	arXiv preprint arXiv:1206.6461, 2012	116	2012
Hindsight credit assignment	A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, ...	Advances in neural information processing systems 32, 2019	96	2019
Meta-learning of sequential strategies	PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...	arXiv preprint arXiv:1905.03030, 2019	90	2019
Neural predictive belief representations	ZD Guo, MG Azar, B Piot, BA Pires, R Munos	arXiv preprint arXiv:1811.06407, 2018	89	2018
Stochastic optimization of a locally smooth function under correlated bandit feedback	MG Azar, A Lazaric, E Brunskill	31st International Conference on Machine Learning (ICML), 2014	66*	2014
A cryptography-based approach for movement decoding	EL Dyer, M Gheshlaghi Azar, MG Perich, HL Fernandes, S Naufel, ...	Nature biomedical engineering 1 (12), 967-976, 2017	63	2017
Byol-explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in neural information processing systems 35, 31855-31870, 2022	58	2022
