Alberto Bietti

John Sous

CoRR, May, 2026

Multimodal Alignment and Preference Optimization for Zero-Shot Conditional RNA Generation.

Roman Klypa

Sergei Grudinin

CoRR, May, 2026

Geometric Factual Recall in Transformers.

CoRR, May, 2026

MIMIC: A Generative Multimodal Foundation Model for Biomolecules.

CoRR, April, 2026

Sharp Capacity Scaling of Spectral Optimizers in Learning Associative Memory.

CoRR, March, 2026

Understanding Contextual Recall in Transformers: How Finetuning Enables In-Context Reasoning over Pretraining Knowledge.

Christos Thrampoulidis

CoRR, March, 2026

Protein Design with Agent Rosetta: A Case Study for Specialized Scientific Agents.

Jacopo Teneggi

S. M. Bargeen A. Turzo

Tanya Marwah

P. Douglas Renfrew

Vikram Khipple Mulligan

Siavash Golkar

CoRR, March, 2026

Learning to Recall with Transformers Beyond Orthogonal Embeddings.

CoRR, March, 2026

Representation Learning for Spatiotemporal Physical Systems.

CoRR, March, 2026

2025

Understanding the Mechanisms of Fast Hyperparameter Transfer.

Nikhil Ghosh

Denny Wu

CoRR, December, 2025

From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers.

CoRR, December, 2025

Physics Steering: Causal Control of Cross-Domain Concepts in a Physics Foundation Model.

CoRR, November, 2025

Predicting partially observable dynamical systems via diffusion models with a multiscale inference scheme.

CoRR, November, 2025

Walrus: A Cross-Domain Foundation Model for Continuum Dynamics.

CoRR, November, 2025

Universal Spectral Tokenization via Self-Supervised Panchromatic Representation Learning.

CoRR, October, 2025

Emergence of Linear Truth Encodings in Language Models.

CoRR, October, 2025

Aristotle: IMO-level Automated Theorem Proving.

CoRR, October, 2025

GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments.

CoRR, September, 2025

Counterfactual Learning of Stochastic Policies with Continuous Actions.

Trans. Mach. Learn. Res., 2025

AION-1: Omnimodal Foundation Model for Astronomical Sciences.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

In-Context Denoising with One-Layer Transformers: Connections between Attention and Associative Memory Retrieval.

Matthew Smart

Anirvan M. Sengupta

Proceedings of the Forty-second International Conference on Machine Learning, 2025

BAnG: Bidirectional Anchored Generation for Conditional RNA Design.

Roman Klypa

Sergei Grudinin

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Understanding Factual Recall in Transformers via Associative Memories.

Eshaan Nichani

Jason D. Lee

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers.

Lei Chen

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Learning Compositional Functions with Transformers from Easy-to-Hard Data.

Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

Level Set Teleportation: An Optimization Perspective.

Aaron Mishkin

Robert M. Gower

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2024

How Truncating Weights Improves Reasoning in Language Models.

Lei Chen

CoRR, 2024

Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task.

Kyunghyun Cho

Shirley Ho

CoRR, 2024

Multiple Physics Pretraining for Spatiotemporal Surrogate Models.

Michael McCabe

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Learning Associative Memories with Gradient Descent.

Vivien Cabannes

Berfin Simsek

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Scaling Laws for Associative Memories.

Vivien Cabannes

Elvis Dohmatob

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

On Learning Gaussian Multi-index Models with Gradient Flow.

Loucas Pillaud-Vivien

CoRR, 2023

AstroCLIP: Cross-Modal Pre-Training for Astronomical Foundation Models.

Tiberiu Tesileanu

Kyunghyun Cho

Shirley Ho

CoRR, 2023

Multiple Physics Pretraining for Physical Surrogate Models.

Michael McCabe

CoRR, 2023

xVal: A Continuous Number Encoding for Large Language Models.

Tiberiu Tesileanu

Kyunghyun Cho

Shirley Ho

CoRR, 2023

Birth of a Transformer: A Memory Viewpoint.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The SSL Interplay: Augmentations, Inductive Bias, and Generalization.

Proceedings of the International Conference on Machine Learning, 2023

On Minimal Variations for Unsupervised Representation Learning.

Vivien Cabannes

Randall Balestriero

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes.

Elvis Dohmatob

CoRR, 2022

Efficient Kernel UCB for Contextual Bandits.

CoRR, 2022

Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization.

CoRR, 2022

When does return-conditioned supervised learning work for offline reinforcement learning?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning single-index models with shallow neural networks.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning.

Proceedings of the International Conference on Machine Learning, 2022

Approximation and Learning with Deep Convolutional Models: a Kernel Perspective.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Efficient Kernelized UCB for Contextual Bandits.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

A Contextual Bandit Bake-off.

Alekh Agarwal

John Langford

J. Mach. Learn. Res., 2021

Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks.

Carles Domingo-Enrich

CoRR, 2021

On the Sample Complexity of Learning with Geometric Stability.

Luca Venturi

CoRR, 2021

On Approximation in Deep Convolutional Networks: a Kernel Perspective.

CoRR, 2021

On the Universality of Graph Neural Networks on Large Random Graphs.

Nicolas Keriven

Samuel Vaiter

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Sample Complexity of Learning under Geometric Stability.

Luca Venturi

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On Energy-Based Models with Overparametrized Shallow Neural Networks.

Carles Domingo-Enrich

Eric Vanden-Eijnden

Proceedings of the 38th International Conference on Machine Learning, 2021

Deep Equals Shallow for ReLU Networks in Kernel Regimes.

Francis R. Bach

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Optimization Approaches for Counterfactual Risk Minimization with Continuous Actions.

CoRR, 2020

Convergence and Stability of Graph Convolutional Networks on Large Random Graphs.

Nicolas Keriven

Samuel Vaiter

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

Foundations of deep convolutional models through kernel methods. (Méthodes à noyaux pour les réseaux convolutionnels profonds).

PhD thesis, 2019

Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations.

J. Mach. Learn. Res., 2019

On the Inductive Bias of Neural Tangent Kernels.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Kernel Perspective for Regularizing Deep Neural Networks.

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

On Regularization and Robustness of Deep Neural Networks.

Grégoire Mialon

CoRR, 2018

Practical Evaluation and Optimization of Contextual Bandit Algorithms.

Alekh Agarwal

John Langford

CoRR, 2018

2017

Group Invariance and Stability to Deformations of Deep Convolutional Representations.

CoRR, 2017

Invariance and Stability of Deep Convolutional Representations.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2015

An online EM algorithm in hidden (semi-)Markov models for audio segmentation and clustering.
