Data-free quantization through weight equalization and bias correction M Nagel, M van Baalen, T Blankevoort, M Welling Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 559 | 2019 |
Up or down? adaptive rounding for post-training quantization M Nagel, RA Amjad, M Van Baalen, C Louizos, T Blankevoort International Conference on Machine Learning, 7197-7206, 2020 | 463 | 2020 |
A white paper on neural network quantization M Nagel, M Fournarakis, RA Amjad, Y Bondarenko, M Van Baalen, ... arXiv preprint arXiv:2106.08295, 2021 | 453 | 2021 |
Bayesian bits: Unifying quantization and pruning M Van Baalen, C Louizos, M Nagel, RA Amjad, Y Wang, T Blankevoort, ... Advances in neural information processing systems 33, 5741-5752, 2020 | 126 | 2020 |
Gradient ℓ1 Regularization for Quantization Robustness M Alizadeh, A Behboodi, M Van Baalen, C Louizos, T Blankevoort, ... arXiv preprint arXiv:2002.07520, 2020 | 58 | 2020 |
FP8 quantization: The power of the exponent A Kuzmin, M Van Baalen, Y Ren, M Nagel, J Peters, T Blankevoort Advances in Neural Information Processing Systems 35, 14651-14662, 2022 | 52 | 2022 |
A white paper on neural network quantization M Nagel, M Fournarakis, RA Amjad, Y Bondarenko, M van Baalen, ... arXiv preprint arXiv:2106.08295, 2021 | 31 | 2021 |
FP8 versus INT8 for efficient deep learning inference M van Baalen, A Kuzmin, SS Nair, Y Ren, E Mahurin, C Patel, ... arXiv preprint arXiv:2303.17951, 2023 | 23 | 2023 |
Deep matrix factorization for recommendation M van Baalen Master's Thesis, Univ. of Amsterdam, Sep 30, 2016 | 21 | 2016 |
Pruning vs quantization: which is better? A Kuzmin, M Nagel, M Van Baalen, A Behboodi, T Blankevoort Advances in neural information processing systems 36, 2024 | 19 | 2024 |
Cyclical pruning for sparse neural networks S Srinivas, A Kuzmin, M Nagel, M van Baalen, A Skliar, T Blankevoort Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 15 | 2022 |
The LLM Surgeon TFA van der Ouderaa, M Nagel, M Van Baalen, YM Asano, T Blankevoort arXiv preprint arXiv:2312.17244, 2023 | 12 | 2023 |
Simulated quantization, real power savings M van Baalen, B Kahne, E Mahurin, A Kuzmin, A Skliar, M Nagel, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 9 | 2022 |
A practical mixed precision algorithm for post-training quantization NP Pandey, M Nagel, M van Baalen, Y Huang, C Patel, T Blankevoort arXiv preprint arXiv:2302.05397, 2023 | 8 | 2023 |
FP8 versus INT8 for efficient deep learning inference M van Baalen, A Kuzmin, SS Nair, Y Ren, E Mahurin, C Patel, S Subramanian, ... arXiv preprint arXiv:2303.17951, 2023 | 6 | 2023 |
GPTVQ: The blessing of dimensionality for LLM quantization M van Baalen, A Kuzmin, M Nagel, P Couperus, C Bastoul, E Mahurin, ... arXiv preprint arXiv:2402.15319, 2024 | 3 | 2024 |
QBitOpt: Fast and accurate bitwidth reallocation during training J Peters, M Fournarakis, M Nagel, M Van Baalen, T Blankevoort Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 3 | 2023 |
Quantized sparse weight decomposition for neural network compression A Kuzmin, M van Baalen, M Nagel, A Behboodi arXiv preprint arXiv:2207.11048, 2022 | 2 | 2022 |
Sparse High Rank Adapters K Bhardwaj, NP Pandey, S Priyadarshi, V Ganapathy, R Esteves, ... arXiv preprint arXiv:2406.13175, 2024 | 1 | 2024 |
A Practical Mixed Precision Algorithm for Post-Training Quantization N Prasad Pandey, M Nagel, M van Baalen, Y Huang, C Patel, ... arXiv e-prints, arXiv:2302.05397, 2023 | 1 | 2023 |