Mark Schmidt

ORCID: 0000-0003-1129-5273

Affiliations:
  • University of British Columbia, Department of Computer Science, Vancouver, Canada
  • École Normale Supérieure, INRIA SIERRA project team, Paris, France
  • University of Alberta, Department of Computing Science, Edmonton, Canada


According to our database, Mark Schmidt authored at least 96 papers between 2005 and 2024.

Bibliography

2024
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models.
CoRR, 2024

2023
Predicting DNA kinetics with a truncated continuous-time Markov chain method.
Comput. Biol. Chem., June, 2023

Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm.
CoRR, 2023

Simplifying Momentum-based Riemannian Submanifold Optimization.
CoRR, 2023

Optimistic Thompson Sampling-based algorithms for episodic reinforcement learning.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Fast Convergence of Random Reshuffling Under Over-Parameterization and the Polyak-Łojasiewicz Condition.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning.
Proceedings of the International Conference on Machine Learning, 2023

Target-based Surrogates for Stochastic Optimization.
Proceedings of the International Conference on Machine Learning, 2023

Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, But Sign Descent Might Be.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
SVRG meets AdaGrad: painless variance reduction.
Mach. Learn., 2022

Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence.
J. Mach. Learn. Res., 2022

Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent (Extended Abstract).
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Improved Policy Optimization for Online Imitation Learning.
Proceedings of the Conference on Lifelong Learning Agents, 2022

2021
Structured second-order methods via natural gradient descent.
CoRR, 2021

AutoRetouch: Automatic Professional Face Retouching.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Robust Asymmetric Learning in POMDPs.
Proceedings of the 38th International Conference on Machine Learning, 2021

Tractable structured natural-gradient descent using local parameterizations.
Proceedings of the 38th International Conference on Machine Learning, 2021

Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Variance-Reduced Methods for Machine Learning.
Proc. IEEE, 2020

Combining Bayesian optimization and Lipschitz optimization.
Mach. Learn., 2020

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search).
CoRR, 2020

Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Handling the Positive-Definite Constraint in the Bayesian Learning Rule.
Proceedings of the 37th International Conference on Machine Learning, 2020

Proposal-Based Instance Segmentation With Point Supervision.
Proceedings of the IEEE International Conference on Image Processing, 2020

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
"Active-set complexity" of proximal gradient: How long does it take to find the sparsity pattern?
Optim. Lett., 2019

Stein's Lemma for the Reparameterization Trick with Exponential Family Mixtures.
CoRR, 2019

Instance Segmentation with Point Supervision.
CoRR, 2019

Efficient Deep Gaussian Process Models for Variable-Sized Input.
CoRR, 2019

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Efficient Deep Gaussian Process Models for Variable-Sized Inputs.
Proceedings of the International Joint Conference on Neural Networks, 2019

Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations.
Proceedings of the 36th International Conference on Machine Learning, 2019

Efficient Parameter Estimation for DNA Kinetics Modeled as Continuous-Time Markov Chains.
Proceedings of the DNA Computing and Molecular Programming - 25th International Conference, 2019

A Less Biased Evaluation of Out-of-distribution Sample Detectors.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Where are the Masks: Instance Segmentation with Image-level Supervision.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Are we there yet? Manifold identification of gradient-related proximal methods.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Distributed Maximization of "Submodular plus Diversity" Functions for Multi-label Feature Selection on Huge Datasets.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Does Your Model Know the Digit 6 Is Not a Cat? A Less Biased Evaluation of "Outlier" Detectors.
CoRR, 2018

New Insights into Bootstrapping for Bandits.
CoRR, 2018

MASAGA: A Linearly-Convergent Stochastic First-Order Method for Optimization on Manifolds.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2018

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Online Learning Rate Adaptation with Hypergradient Descent.
Proceedings of the 6th International Conference on Learning Representations, 2018

Where Are the Blobs: Counting by Localization with Point Supervision.
Proceedings of the Computer Vision - ECCV 2018, 2018

2017
Erratum to: Minimizing finite sums with the stochastic average gradient.
Math. Program., 2017

Minimizing finite sums with the stochastic average gradient.
Math. Program., 2017

Diffusion Independent Semi-Bandit Influence Maximization.
CoRR, 2017

Model-Independent Online Learning for Influence Maximization.
Proceedings of the 34th International Conference on Machine Learning, 2017

Inferring Parameters for an Elementary Step Model of DNA Structure Kinetics with Locally Context-Dependent Arrhenius Rates.
Proceedings of the DNA Computing and Molecular Programming - 23rd International Conference, 2017

Horde of Bandits using Gaussian Markov Random Fields.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Convergence Rates for Greedy Kaczmarz Algorithms, and Faster Randomized Kaczmarz Rules Using the Orthogonality Graph.
CoRR, 2016

Fast Patch-based Style Transfer of Arbitrary Style.
CoRR, 2016

Convergence Rates for Greedy Kaczmarz Algorithms, and Randomized Kaczmarz Rules Using the Orthogonality Graph.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Faster Stochastic Variational Inference using Proximal-Gradient Methods with General Divergence Functions.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

Play and Learn: Using Video Games to Train Computer Vision Models.
Proceedings of the British Machine Vision Conference 2016, 2016

2015
Hierarchical Maximum-Margin Clustering.
CoRR, 2015

Convergence of Proximal-Gradient Stochastic Variational Inference under Non-Decreasing Step-Size Sequence.
CoRR, 2015

Stop Wasting My Gradients: Practical SVRG.
CoRR, 2015

Stop Wasting My Gradients: Practical SVRG.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

2014
Convex Optimization for Big Data: Scalable, randomized, and parallel algorithms for big data analytics.
IEEE Signal Process. Mag., 2014

Convex Optimization for Big Data.
CoRR, 2014

2013
Erratum: Hybrid Deterministic-Stochastic Methods for Data Fitting.
SIAM J. Sci. Comput., 2013

Block-Coordinate Frank-Wolfe Optimization for Structural SVMs.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Hybrid Deterministic-Stochastic Methods for Data Fitting.
SIAM J. Sci. Comput., 2012

On Sparse, Spectral and Other Parameterizations of Binary Probabilistic Models.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method.
CoRR, 2012

Stochastic Block-Coordinate Frank-Wolfe Optimization for Structural SVMs.
CoRR, 2012

A Stochastic Gradient Method with an Exponential Convergence Rate for Strongly-Convex Optimization with Finite Training Sets.
CoRR, 2012

A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, 2012

2011
Generalized Fast Approximate Energy Minimization via Graph Cuts: Alpha-Expansion Beta-Shrink Moves.
CoRR, 2011

Generalized Fast Approximate Energy Minimization via Graph Cuts: α-Expansion β-Shrink Moves.
Proceedings of the UAI 2011, 2011

Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, 2011

2010
Modeling annotator expertise: Learning when everybody knows a bit of something.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials.
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010

Causal learning without DAGs.
Proceedings of the Causality: Objectives and Assessment (NIPS 2008 Workshop), 2010

2009
Optimizing Costly Functions with Simple Constraints: A Limited-Memory Projected Quasi-Newton Algorithm.
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009

Modeling Discrete Interventional Data using Directed Cyclic Graphical Models.
Proceedings of the UAI 2009, 2009

Group Sparse Priors for Covariance Estimation.
Proceedings of the UAI 2009, 2009

Increased discrimination in level set methods with embedded conditional random fields.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2008
An interior-point stochastic approximation method and an L1-regularized delta rule.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Structure learning in random fields for heart motion abnormality detection.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007
3D Variational Brain Tumor Segmentation using a High Dimensional Feature Set.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches.
Proceedings of the Machine Learning: ECML 2007, 2007

Learning Graphical Model Structure Using L1-Regularization Paths.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
Learning a Classification-based Glioma Growth Model Using MRI Data.
J. Comput., 2006

Accelerated training of conditional random fields with stochastic gradient methods.
Proceedings of the Machine Learning, 2006

A Classification-Based Glioma Diffusion Model Using MRI Data.
Proceedings of the Advances in Artificial Intelligence, 2006

2005
Support Vector Random Fields for Spatial Classification.
Proceedings of the Knowledge Discovery in Databases: PKDD 2005, 2005

Segmenting brain tumors using alignment-based features.
Proceedings of the Fourth International Conference on Machine Learning and Applications, 2005

Segmenting Brain Tumors with Conditional Random Fields and Support Vector Machines.
Proceedings of the Computer Vision for Biomedical Image Applications, 2005

