Publications

E. Nichani, J. Lee, A. Bietti. Understanding Factual Recall in Transformers via Associative Memories. preprint, 2024. [arxiv]

L. Chen, J. Bruna, A. Bietti. How Truncating Weights Improves Reasoning in Language Models. preprint, 2024. [arxiv]

S. Golkar, A. Bietti, M. Pettee, M. Eickenberg, Polymathic AI team. Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task. preprint, 2024. [arxiv]

F. Kunstner, R. Yadav, A. Milligan, M. Schmidt, A. Bietti. Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models. In NeurIPS, 2024. Spotlight. [arxiv, code]

M. McCabe, B. Regaldo, L. Parker, R. Ohana, M. Cranmer, Polymathic AI team. Multiple Physics Pretraining for Physical Surrogate Models. In NeurIPS, 2024. [arxiv, code]

L. Parker, F. Lanusse, S. Golkar, L. Sarra, Polymathic AI team. AstroCLIP: A Cross-Modal Foundation Model for Galaxies. Monthly Notices of the Royal Astronomical Society, 531(4):4990-5011, 2024. [arxiv, code]

V. Cabannes, B. Simsek, A. Bietti. Learning Associative Memories with Gradient Descent. In ICML, 2024. [arxiv]

A. Mishkin, A. Bietti, R. Gower. Level Set Teleportation: An Optimization Perspective. preprint, 2024. [arxiv]

V. Cabannes, E. Dohmatob, A. Bietti. Scaling Laws for Associative Memories. In ICLR, 2024. Spotlight. [arxiv]

A. Bietti, J. Bruna, L. Pillaud-Vivien. On Learning Gaussian Multi-index Models with Gradient Flow. preprint, 2023. [arxiv]

S. Golkar, M. Pettee, M. Eickenberg, A. Bietti, Polymathic AI team. xVal: A Continuous Number Encoding for Large Language Models. preprint, 2023. [arxiv, code]

A. Bietti, V. Cabannes, D. Bouchacourt, H. Jegou, L. Bottou. Birth of a Transformer: A Memory Viewpoint. In NeurIPS, 2023. Spotlight. [arxiv, code]

V. Cabannes, B. T. Kiani, R. Balestriero, Y. LeCun, A. Bietti. The SSL Interplay: Augmentations, Inductive Bias, and Generalization. In ICML, 2023. [arxiv]

V. Cabannes, A. Bietti, R. Balestriero. On minimal variations for unsupervised representation learning. In ICASSP, 2023. [arxiv]

A. Bietti, J. Bruna, C. Sanford, M. J. Song. Learning Single-Index Models with Shallow Neural Networks. In NeurIPS, 2022. [arxiv]

D. Brandfonbrener, A. Bietti, J. Buckman, R. Laroche, J. Bruna. When does return-conditioned supervised learning work for offline reinforcement learning? In NeurIPS, 2022. [arxiv]

E. Dohmatob, A. Bietti. On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes. preprint, 2022. [arxiv]

A. Bietti, C.-Y. Wei, M. Dudík, J. Langford, Z. S. Wu. Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning. In ICML, 2022. [arxiv, code, video]

A. Bietti. Approximation and Learning with Deep Convolutional Models: a Kernel Perspective. In ICLR, 2022. [arxiv, code]

H. Zenati, A. Bietti, E. Diemert, J. Mairal, M. Martin, P. Gaillard. Efficient Kernelized UCB for Contextual Bandits. In AISTATS, 2022. [arxiv, code]

C. Domingo-Enrich, A. Bietti, M. Gabrié, J. Bruna, E. Vanden-Eijnden. Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks. preprint, 2021. [arxiv, code]

A. Bietti, L. Venturi, J. Bruna. On the Sample Complexity of Learning under Invariance and Geometric Stability. In NeurIPS, 2021. [arxiv, video]

N. Keriven, A. Bietti, S. Vaiter. On the Universality of Graph Neural Networks on Large Random Graphs. In NeurIPS, 2021. [arxiv]

C. Domingo-Enrich, A. Bietti, E. Vanden-Eijnden, J. Bruna. On Energy-Based Models with Overparametrized Shallow Neural Networks. In ICML, 2021. Long talk. [arxiv, code]

A. Bietti, F. Bach. Deep Equals Shallow for ReLU Networks in Kernel Regimes. In ICLR, 2021. [arxiv, code, video]

A. Bietti, A. Agarwal, J. Langford. A Contextual Bandit Bake-off. In Journal of Machine Learning Research (JMLR), 22(133):1-49, 2021. [arxiv, hal, code]

N. Keriven, A. Bietti, S. Vaiter. Convergence and Stability of Graph Convolutional Networks on Large Random Graphs. In NeurIPS, 2020. Spotlight presentation. [arxiv]

H. Zenati, A. Bietti, M. Martin, E. Diemert, J. Mairal. Counterfactual Learning of Continuous Stochastic Policies. preprint, 2020. [arxiv, code]

A. Bietti. Foundations of Deep Convolutional Models through Kernel Methods. PhD thesis, Université Grenoble-Alpes, 2019. Best PhD prize from Université Grenoble-Alpes. [slides, hal]

A. Bietti, J. Mairal. On the Inductive Bias of Neural Tangent Kernels. In NeurIPS, 2019. [arxiv, hal, poster]

A. Bietti, G. Mialon, D. Chen, J. Mairal. A Kernel Perspective for Regularizing Deep Neural Networks. In ICML, 2019. [arxiv, hal, code]

A. Bietti, J. Mairal. Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations. In Journal of Machine Learning Research (JMLR), 20(25):1-49, 2019. [arxiv, hal]

A. Bietti, J. Mairal. Invariance and Stability of Deep Convolutional Representations. In NIPS, 2017. [hal]

A. Bietti, J. Mairal. Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure. In NIPS, 2017. Spotlight presentation. [arxiv, hal, code]

A. Bietti, F. Bach, A. Cont. An online EM algorithm in hidden (semi-)Markov models for audio segmentation and clustering. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015. Best student paper, Machine Learning for Signal Processing track. [hal, code]

A. Bietti. Online learning for audio clustering and segmentation. Master’s thesis, Ecole Normale Supérieure de Cachan and Mines ParisTech, 2014. [hal, slides, code]