Simon Lacoste-Julien

CoRR, September, 2025

Bias Analysis in Unconditional Image Generative Models.

CoRR, June, 2025

Position: Adopt Constraints Over Penalties in Deep Learning.

Juan Ramirez

Meraj Hashemizadeh

CoRR, May, 2025

Cooper: A Library for Constrained Optimization in Deep Learning.

CoRR, April, 2025

Understanding Adam Requires Better Rotation Dependent Assumptions.

Tianyue H. Zhang

Lucas Maes

Alan Milligan

Ioannis Mitliagkas

Damien Scieur

Reza Babanezhad Harikandeh

Charles Guille-Escuret

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Tight Lower Bounds and Improved Convergence in Performative Prediction.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Accelerating Training with Neuron Interaction and Nowcasting Networks.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Feasible Learning.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

Performative Prediction on Games and Mechanism Design.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2024

Promoting Exploration in Memory-Augmented Adam using Critical Momenta.

Pranshu Malviya

Gonçalo Mordido

Aristide Baratin

Trans. Mach. Learn. Res., 2024

PopulAtion Parameter Averaging (PAPA).

Trans. Mach. Learn. Res., 2024

Understanding Adam Requires Better Rotation Dependent Assumptions.

Lucas Maes

Tianyue H. Zhang

Ioannis Mitliagkas

Damien Scieur

Charles Guille-Escuret

CoRR, 2024

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies.

CoRR, 2024

On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Balancing Act: Constraining Disparate Impact in Sparse Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

On the Identifiability of Quantized Factors.

Proceedings of the Causal Learning and Reasoning, 2024

Weight-Sharing Regularization.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

A Survey of Self-Supervised and Few-Shot Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection.

CoRR, 2023

Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Unlocking Slot Attention by Changing Optimal Transport Costs.

Proceedings of the International Conference on Machine Learning, 2023

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning.

Proceedings of the International Conference on Machine Learning, 2023

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

Boris Knyazev

Doha Hwang

Proceedings of the International Conference on Machine Learning, 2023

CrossSplit: Mitigating Label Noise Memorization through Data Splitting.

Proceedings of the International Conference on Machine Learning, 2023

2022

SVRG meets AdaGrad: painless variance reduction.

Benjamin Dubois-Taine

Mach. Learn., 2022

Predicting Tactical Solutions to Operational Planning Problems Under Imperfect Information.

INFORMS J. Comput., 2022

Synergies Between Disentanglement and Sparsity: a Multi-Task Learning Perspective.

CoRR, 2022

Partial Disentanglement via Mechanism Sparsity.

Sébastien Lachapelle

CoRR, 2022

Bayesian structure learning with generative flow networks.

Proceedings of the Uncertainty in Artificial Intelligence, 2022

Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution.

Antonio Orvieto

Nicolas Loizou

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Data-Efficient Structured Pruning via Submodular Optimization.

Marwa El Halabi

Suraj Srinivas

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Online Adversarial Attacks.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Disentanglement via Mechanism Sparsity Regularization: A New Principle for Nonlinear ICA.

Proceedings of the 1st Conference on Causal Learning and Reasoning, 2022

On the Convergence of Continuous Constrained Optimization for Structure Learning.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Convergence Rates for the MAP of an Exponential Family and Stochastic Mirror Descent - an Open Problem.

CoRR, 2021

Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA.

CoRR, 2021

Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Structured Convolutional Kernel Networks for Airline Crew Scheduling.

Yassine Yaakoubi

François Soumis

Proceedings of the 38th International Conference on Machine Learning, 2021

Affine Invariant Analysis of Frank-Wolfe on Strongly Convex Sets.

Proceedings of the 38th International Conference on Machine Learning, 2021

Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning.

Proceedings of the 9th International Conference on Learning Representations, 2021

An Analysis of the Adaptation Speed of Causal Models.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Implicit Regularization via Neural Feature Alignment.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Machine learning in airline crew pairing to construct initial clusters for dynamic constraint aggregation.

Yassine Yaakoubi

François Soumis

EURO J. Transp. Logist., 2020

Geometry-Aware Universal Mirror-Prox.

Reza Babanezhad

CoRR, 2020

On the Convergence of Continuous Constrained Optimization for Structure Learning.

CoRR, 2020

Flight-connection Prediction for Airline Crew Scheduling to Construct Initial Clusters for OR Optimizer.

Yassine Yaakoubi

François Soumis

CoRR, 2020

Implicit Regularization in Deep Learning: A View from Function Space.

CoRR, 2020

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search).

CoRR, 2020

To Each Optimizer a Norm, To Each Norm its Generalization.

CoRR, 2020

Differentiable Causal Discovery from Interventional Data.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Adversarial Example Games.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Stochastic Hamiltonian Gradient Methods for Smooth Games.

Nicolas Loizou

Hugo Berard

Pascal Vincent

Ioannis Mitliagkas

Proceedings of the 37th International Conference on Machine Learning, 2020

Gradient-Based Neural DAG Learning.

Proceedings of the 8th International Conference on Learning Representations, 2020

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks.

Proceedings of the 8th International Conference on Learning Representations, 2020

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

GAIT: A Geometric Approach to Information Theory.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Accelerating Smooth Games by Manipulating Spectral Shapes.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Scattering Networks for Hybrid Representation Learning.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

GEAR: Geometry-Aware Rényi Information.

CoRR, 2019

A Tight and Unified Analysis of Extragradient for a Whole Spectrum of Differentiable Games.

CoRR, 2019

Implicit Regularization of Discrete Gradient Dynamics in Deep Linear Neural Networks.

CoRR, 2019

Centroid Networks for Few-Shot Clustering and Unsupervised Few-Shot Classification.

Gabriel Huang

Hugo Larochelle

CoRR, 2019

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Reducing Noise in GAN Training with Variance Reduced Extragradient.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Variational Inequality Perspective on Generative Adversarial Networks.

Proceedings of the 7th International Conference on Learning Representations, 2019

Negative Momentum for Improved Game Dynamics.

Reyhane Askari Hemmat

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018

Learning from Narrated Instruction Videos.

IEEE Trans. Pattern Anal. Mach. Intell., 2018

Improved Asynchronous Parallel Optimization Analysis for Stochastic Incremental Methods.

J. Mach. Learn. Res., 2018

A Modern Take on the Bias-Variance Tradeoff in Neural Networks.

CoRR, 2018

Predicting Solution Summaries to Integer Linear Programs under Imperfect Information with Machine Learning.

CoRR, 2018

A Variational Inequality Perspective on Generative Adversarial Nets.

CoRR, 2018

A3T: Adversarially Augmented Adversarial Training.

CoRR, 2018

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields.

Rémi Le Priol

Alexandre Piché

Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Quantifying Learning Guarantees for Convex but Inconsistent Surrogates.

Kirill Struminsky

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

SEARNN: Training RNNs with global-local losses.

Proceedings of the 6th International Conference on Learning Representations, 2018

Parametric Adversarial Divergences are Good Task Losses for Generative Modeling.

Proceedings of the 6th International Conference on Learning Representations, 2018

Frank-Wolfe Splitting via Augmented Lagrangian Method.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017

Adversarial Divergences are Good Task Losses for Generative Modeling.

CoRR, 2017

Joint Discovery of Object States and Manipulating Actions.

Josef Sivic

Ivan Laptev

CoRR, 2017

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

On Structured Prediction Theory with Calibrated Convex Surrogate Losses.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

A Closer Look at Memorization in Deep Networks.

Devansh Arpit

Stanislaw Jastrzebski

Proceedings of the 34th International Conference on Machine Learning, 2017

Joint Discovery of Object States and Manipulation Actions.

Josef Sivic

Ivan Laptev

Proceedings of the IEEE International Conference on Computer Vision, 2017

ASAGA: Asynchronous Parallel SAGA.

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Frank-Wolfe Algorithms for Saddle Point Problems.

Tony Jebara

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016

Convergence Rate of Frank-Wolfe for Non-Convex Objectives.

CoRR, 2016

PAC-Bayesian Theory Meets Bayesian Inference.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Beyond CCA: Moment Matching for Multi-View Models.

Anastasia Podosinnikova

Proceedings of the 33rd International Conference on Machine Learning, 2016

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs.

Isabella Lukasewitz

Puneet Kumar Dokania

Proceedings of the 33rd International Conference on Machine Learning, 2016

Unsupervised Learning from Narrated Instruction Videos.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Rethinking LDA: Moment Matching for Discrete ICA.

Anastasia Podosinnikova

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

On the Global Linear Convergence of Frank-Wolfe Optimization Variants.

Martin Jaggi

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Barrier Frank-Wolfe for Marginal Inference.

Rahul G. Krishnan

David A. Sontag

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Variance Reduced Stochastic Gradient Descent with Neighbors.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

On pairwise costs for network flow multi-object tracking.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Sequential Kernel Herding: Frank-Wolfe Optimization for Particle Filtering.

Fredrik Lindsten

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

2014

On Pairwise Cost for Multi-Object Network Flow Tracking.

CoRR, 2014

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives.

Aaron Defazio

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013

SIGMa: simple greedy matching for aligning large knowledge bases.

Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013

Block-Coordinate Frank-Wolfe Optimization for Structural SVMs.

Proceedings of the 30th International Conference on Machine Learning, 2013

2012

A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method.

Mark Schmidt

CoRR, 2012

Stochastic Block-Coordinate Frank-Wolfe Optimization for Structural SVMs.

CoRR, 2012

On the Equivalence between Herding and Conditional Gradient Algorithms.

Guillaume Obozinski

Proceedings of the 29th International Conference on Machine Learning, 2012

2011

Approximate inference for the loss-calibrated Bayesian.

Ferenc Huszar

Zoubin Ghahramani

Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011

2008

DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification.

Fei Sha

Michael I. Jordan

Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2006

Structured Prediction, Dual Extragradient and Bregman Projections.

Benjamin Taskar

Michael I. Jordan

J. Mach. Learn. Res., 2006

Word Alignment via Quadratic Assignment.

Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

2005

Structured Prediction via the Extragradient Method.

Benjamin Taskar

Michael I. Jordan

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

A Discriminative Matching Approach to Word Alignment.

Benjamin Taskar