Manfred K. Warmuth

Affiliations:
  • University of California, Santa Cruz, USA


According to our database, Manfred K. Warmuth authored at least 184 papers between 1984 and 2024.

Collaborative distances:
  • Dijkstra number of three.
  • Erdős number of two.

Bibliography

2024
Tempered Calculus for ML: Application to Hyperbolic Model Embedding.
CoRR, 2024

2023
Layerwise Bregman Representation Learning of Neural Networks with Applications to Knowledge Distillation.
Trans. Mach. Learn. Res., 2023

The Tempered Hilbert Simplex Distance and Its Application To Non-linear Embeddings of TEMs.
CoRR, 2023

Optimal Transport with Tempered Exponential Measures.
CoRR, 2023

A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks.
CoRR, 2023

Boosting with Tempered Exponential Measures.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Open Problem: Learning sparse linear concepts by priming the features.
Proceedings of the Thirty-Sixth Annual Conference on Learning Theory, 2023

Clustering above Exponential Families with Tempered Exponential Measures.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Unbiased estimators for random design regression.
J. Mach. Learn. Res., 2022

Unlabeled sample compression schemes and corner peelings for ample and maximum classes.
J. Comput. Syst. Sci., 2022

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation.
CoRR, 2022

Learning from Randomly Initialized Neural Network Features.
CoRR, 2022

Step-size Adaptation Using Exponentiated Gradient Updates.
CoRR, 2022

LocoProp: Enhancing BackProp via Local Loss Optimization.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Exponentiated Gradient Reweighting for Robust Training Under Label Noise and Beyond.
CoRR, 2021

A case where a spindly two-layer linear network decisively outperforms any neural network with a fully connected input layer.
Proceedings of the Algorithmic Learning Theory, 2021

2020
A case where a spindly two-layer linear network whips any neural network with a fully connected input layer.
CoRR, 2020

Interpolating Between Gradient Descent and Exponentiated Gradient Using Reparameterized Gradient Descent.
CoRR, 2020

Divergence-Based Motivation for Online EM and Combining Hidden Variable Models.
Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

Reparameterizing Mirror Descent as Gradient Descent.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Rank-Smoothed Pairwise Learning In Perceptual Quality Assessment.
Proceedings of the IEEE International Conference on Image Processing, 2020

Winnowing with Gradient Descent.
Proceedings of the Conference on Learning Theory, 2020

An Implicit Form of Krasulina's k-PCA Update without the Orthonormality Constraint.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Mistake bounds on the noise-free multi-armed bandit game.
Inf. Comput., 2019

TriMap: Large-scale Dimensionality Reduction Using Triplets.
CoRR, 2019

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Adaptive Scale-Invariant Online Algorithms for Learning Linear Models.
Proceedings of the 36th International Conference on Machine Learning, 2019

Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression.
Proceedings of the Conference on Learning Theory, 2019

Online Non-Additive Path Learning under Full and Partial Information.
Proceedings of the Algorithmic Learning Theory, 2019

Correcting the bias in least squares regression with volume-rescaled sampling.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Two-temperature logistic regression based on the Tsallis divergence.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Reverse Iterative Volume Sampling for Linear Regression.
J. Mach. Learn. Res., 2018

Speech Recognition: Keyword Spotting Through Image Recognition.
CoRR, 2018

A more globally accurate dimensionality reduction method using triplets.
CoRR, 2018

Tail bounds for volume sampled linear regression.
CoRR, 2018

Leveraged volume sampling for linear regression.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Subsampling for Ridge Regression via Regularized Volume Sampling.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Online Dynamic Programming.
CoRR, 2017

Two-temperature logistic regression based on the Tsallis divergence.
CoRR, 2017

Online Dynamic Programming.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Unbiased estimates for linear regression via volume sampling.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
Learning rotations with little regret.
Mach. Learn., 2016

Online PCA with Optimal Regret.
J. Mach. Learn. Res., 2016

t-Exponential Triplet Embedding.
CoRR, 2016

Noise Free Multi-armed Bandit Game.
Proceedings of Language and Automata Theory and Applications, 2016

Labeled Compression Schemes for Extremal Classes.
Proceedings of the 27th International Conference on Algorithmic Learning Theory, 2016

2015
PCA with Gaussian perturbations.
CoRR, 2015

Open Problem: Online Sabotaged Shortest Path.
Proceedings of The 28th Conference on Learning Theory, 2015

On-Line Learning Algorithms for Path Experts with Non-Additive Losses.
Proceedings of The 28th Conference on Learning Theory, 2015

Minimax Fixed-Design Linear Regression.
Proceedings of The 28th Conference on Learning Theory, 2015

2014
Kernelization of matrix updates, when and how?
Theor. Comput. Sci., 2014

Combining initial segments of lists.
Theor. Comput. Sci., 2014

The limits of squared Euclidean distance regularization.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Open Problem: Shifting Experts on Easy Data.
Proceedings of The 27th Conference on Learning Theory, 2014

2013
On-line PCA with Optimal Regrets.
CoRR, 2013

Open Problem: Lower bounds for Boosting with Hadamard Matrices.
Proceedings of the 26th Annual Conference on Learning Theory, 2013

Learning a set of directions.
Proceedings of the 26th Annual Conference on Learning Theory, 2013

Online PCA with Optimal Regrets.
Proceedings of the 24th International Conference on Algorithmic Learning Theory, 2013

2012
Online variance minimization.
Mach. Learn., 2012

Putting Bayes to sleep.
Proceedings of the Advances in Neural Information Processing Systems 25: Annual Conference on Neural Information Processing Systems 2012, 2012

2011
Minimax Algorithm for Learning Rotations.
Proceedings of the 24th Annual Conference on Learning Theory, 2011

Learning Eigenvectors for Free.
Proceedings of the Advances in Neural Information Processing Systems 24: Annual Conference on Neural Information Processing Systems 2011, 2011

2010
Bayesian generalized probability calculus for density matrices.
Mach. Learn., 2010

Repeated Games against Budgeted Adversaries.
Proceedings of the Advances in Neural Information Processing Systems 23: Annual Conference on Neural Information Processing Systems 2010, 2010

New combination coefficients for AdaBoost algorithms.
Proceedings of the Sixth International Conference on Natural Computation, 2010

The Blessing and the Curse of the Multiplicative Updates.
Proceedings of the 13th International Conference on Discovery Science, 2010

Hedging Structured Concepts.
Proceedings of the 23rd Annual Conference on Learning Theory, 2010

On-line Variance Minimization in O(n²) per Trial?
Proceedings of the 23rd Annual Conference on Learning Theory, 2010

2009
Learning Permutations with Exponential Weights.
J. Mach. Learn. Res., 2009

Tutorial summary: Survey of boosting from an optimization perspective.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Minimax Games with Bandits.
Proceedings of the 22nd Annual Conference on Learning Theory, 2009

2008
Learning Rotations.
Proceedings of the 21st Annual Conference on Learning Theory, 2008

When Random Play is Optimal Against an Adversary.
Proceedings of the 21st Annual Conference on Learning Theory, 2008

Entropy Regularized LPBoost.
Proceedings of the 19th International Conference on Algorithmic Learning Theory, 2008

2007
Unlabeled Compression Schemes for Maximum Classes.
J. Mach. Learn. Res., 2007

Boosting Algorithms for Maximizing the Soft Margin.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Winnowing subspaces.
Proceedings of the 24th International Conference on Machine Learning, 2007

Online kernel PCA with entropic matrix updates.
Proceedings of the 24th International Conference on Machine Learning, 2007

When Is There a Free Matrix Lunch?
Proceedings of the 20th Annual Conference on Learning Theory, 2007

2006
The p-norm generalization of the LMS algorithm for adaptive filtering.
IEEE Trans. Signal Process., 2006

A Bayesian Probability Calculus for Density Matrices.
Proceedings of UAI '06, 2006

Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Totally corrective boosting algorithms that maximize the margin.
Proceedings of the 23rd International Conference on Machine Learning, 2006

Can Entropic Regularization Be Replaced by Squared Euclidean Distance Plus Additional Linear Constraints.
Proceedings of the 19th Annual Conference on Learning Theory, 2006

Continuous Experts and the Binning Algorithm.
Proceedings of the 19th Annual Conference on Learning Theory, 2006

2005
Matrix Exponentiated Gradient Updates for On-line Learning and Bregman Projection.
J. Mach. Learn. Res., 2005

Efficient Margin Maximizing with Boosting.
J. Mach. Learn. Res., 2005

A Bayes Rule for Density Matrices.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Leaving the Span.
Proceedings of the 18th Annual Conference on Learning Theory, 2005

Optimum Follow the Leader Algorithm.
Proceedings of the 18th Annual Conference on Learning Theory, 2005

2004
Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

The Optimal PAC Algorithm.
Proceedings of the 17th Annual Conference on Learning Theory, 2004

2003
Relative Loss Bounds for Temporal-Difference Learning.
Mach. Learn., 2003

Path Kernels and Multiplicative Updates.
J. Mach. Learn. Res., 2003

Active Learning with Support Vector Machines in the Drug Discovery Process.
J. Chem. Inf. Comput. Sci., 2003

Boosting versus Covering.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003

Classification with free energy at raised temperatures.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Inline updates for HMMs.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Compressing to VC Dimension Many Points.
Proceedings of Computational Learning Theory and Kernel Machines, 2003

2002
Predicting nearly as well as the best pruning of a planar decision graph.
Theor. Comput. Sci., 2002

Direct and indirect algorithms for on-line learning of disjunctions.
Theor. Comput. Sci., 2002

Tracking a Small Set of Experts by Mixing Past Posteriors.
J. Mach. Learn. Res., 2002

Relative Expected Instantaneous Loss Bounds.
J. Comput. Syst. Sci., 2002

Adaptive Caching by Refetching.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Maximizing the Margin with Boosting.
Proceedings of the 15th Annual Conference on Computational Learning Theory, 2002

2001
Relative Loss Bounds for Multidimensional Regression Problems.
Mach. Learn., 2001

Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions.
Mach. Learn., 2001

Tracking the Best Linear Predictor.
J. Mach. Learn. Res., 2001

Active Learning in the Drug Discovery Process.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

On the Convergence of Leveraging.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

2000
Learning of Depth Two Neural Networks with Constant Fan-in at the Hidden Nodes.
Electron. Colloquium Comput. Complex., 2000

The Minimax Strategy for Gaussian Density Estimation.
Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT 2000), 2000

Barrier Boosting.
Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (COLT 2000), 2000

The Last-Step Minimax Algorithm.
Proceedings of the 11th International Conference on Algorithmic Learning Theory, 2000

1999
Relative loss bounds for single neurons.
IEEE Trans. Neural Networks, 1999

Relative Loss Bounds for On-line Density Estimation with the Exponential Family of Distributions.
Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI '99), 1999

Averaging Expert Predictions.
Proceedings of the 4th European Conference on Computational Learning Theory, 1999

Boosting as Entropy Projection.
Proceedings of the Twelfth Annual Conference on Computational Learning Theory, 1999

1998
Sequential Prediction of Individual Sequences Under General Loss Functions.
IEEE Trans. Inf. Theory, 1998

Tracking the Best Expert.
Mach. Learn., 1998

Tracking the Best Disjunction.
Mach. Learn., 1998

Efficient Learning With Virtual Threshold Gates.
Inf. Comput., 1998

Batch and On-Line Parameter Estimation of Gaussian Mixtures Based on the Joint Entropy.
Proceedings of the Advances in Neural Information Processing Systems 11, 1998

Linear Hinge Loss and Average Margin.
Proceedings of the Advances in Neural Information Processing Systems 11, 1998

Tracking the Best Regressor.
Proceedings of the Eleventh Annual Conference on Computational Learning Theory, 1998

1997
A Comparison of New and Old Algorithms for a Mixture Estimation Problem.
Mach. Learn., 1997

How to use expert advice.
J. ACM, 1997

Exponentiated Gradient Versus Gradient Descent for Linear Predictors.
Inf. Comput., 1997

The Perceptron Algorithm Versus Winnow: Linear Versus Logarithmic Mistake Bounds when Few Input Variables are Relevant (Technical Note).
Artif. Intell., 1997

Using and Combining Predictors That Specialize.
Proceedings of the Twenty-Ninth Annual ACM Symposium on the Theory of Computing, 1997

Relative Loss Bounds, the Minimum Relative Entropy Principle, and EM.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Sample Compression, Learnability, and the Vapnik-Chervonenkis Dimension.
Proceedings of the Third European Conference on Computational Learning Theory, 1997

1996
Worst-case quadratic loss bounds for prediction using linear functions and gradient descent.
IEEE Trans. Neural Networks, 1996

On the Worst-Case Analysis of Temporal-Difference Learning Algorithms.
Mach. Learn., 1996

On-line Prediction and Conversion Strategies.
Mach. Learn., 1996

Training Algorithms for Hidden Markov Models using Entropy Based Distance Functions.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

On-Line Portfolio Selection Using Multiplicative Updates.
Proceedings of the 13th International Conference on Machine Learning, 1996

Learning of Depth Two Neural Networks with Constant Fan-In at the Hidden Nodes (Extended Abstract).
Proceedings of the Ninth Annual Conference on Computational Learning Theory, 1996

1995
Learning Binary Relations Using Weighted Majority Voting.
Mach. Learn., 1995

Sample Compression, Learnability, and the Vapnik-Chervonenkis Dimension.
Mach. Learn., 1995

On Weak Learning.
J. Comput. Syst. Sci., 1995

On-line Learning of Linear Functions.
Comput. Complex., 1995

Additive versus exponentiated gradient updates for linear prediction.
Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing, 1995

Worst-case Loss Bounds for Single Neurons.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Exponentially many local minima for single neurons.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Tight worst-case loss bounds for predicting with expert advice.
Proceedings of the Second European Conference on Computational Learning Theory, 1995

The Perceptron Algorithm vs. Winnow: Linear vs. Logarithmic Mistake Bounds when few Input Variables are Relevant.
Proceedings of the Eighth Annual Conference on Computational Learning Theory, 1995

1994
Predicting {0,1}-Functions on Randomly Drawn Points.
Inf. Comput., December, 1994

Composite Geometric Concepts and Polynomial Predictability.
Inf. Comput., September, 1994

The Weighted Majority Algorithm.
Inf. Comput., February, 1994

The Distributed Bit Complexity of the Ring: From the Anonymous to the Non-anonymous Case.
Inf. Comput., January, 1994

Bounds on approximate steepest descent for likelihood maximization in exponential families.
IEEE Trans. Inf. Theory, 1994

1993
Gap Theorems for Distributed Computation.
SIAM J. Comput., 1993

The Minimum Consistent DFA Problem Cannot be Approximated within any Polynomial.
J. ACM, 1993

Using experts for predicting continuous outcomes.
Proceedings of the First European Conference on Computational Learning Theory, 1993

Worst-Case Quadratic Loss Bounds for a Generalization of the Widrow-Hoff Rule.
Proceedings of the Sixth Annual ACM Conference on Computational Learning Theory, 1993

1992
Learning Integer Lattices.
SIAM J. Comput., 1992

On the Computational Complexity of Approximating Distributions by Probabilistic Automata.
Mach. Learn., 1992

Some Weak Learning Results.
Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, 1992

1991
Equivalence of Models for Polynomial Learnability.
Inf. Comput., December, 1991

Polynomial Learnability of Probabilistic Concepts with Respect to the Kullback-Leibler Divergence.
Proceedings of the Fourth Annual Workshop on Computational Learning Theory, 1991

1990
Learning Nested Differences of Intersection-Closed Concept Classes.
Mach. Learn., 1990

NxN Puzzle and Related Relocation Problem.
J. Symb. Comput., 1990

Prediction-Preserving Reducibility.
J. Comput. Syst. Sci., 1990

1989
Parallel Approximation Algorithms for Bin Packing.
Inf. Comput., September, 1989

A Fast Algorithm for Multiprocessor Scheduling of Unit-Length Jobs.
SIAM J. Comput., 1989

Learnability and the Vapnik-Chervonenkis dimension.
J. ACM, 1989

Scattered Versus Context-Sensitive Rewriting.
Acta Informatica, 1989

The Minimum Consistent DFA Problem Cannot be Approximated within any Polynomial (abstract).
Proceedings of the Fourth Annual Structure in Complexity Theory Conference, 1989

Towards Representation Independence in PAC Learning.
Proceedings of the Analogical and Inductive Inference, 1989

1988
Computing on an anonymous ring.
J. ACM, 1988

Predicting {0,1}-Functions on Randomly Drawn Points (Extended Abstract).
Proceedings of the 29th Annual Symposium on Foundations of Computer Science, 1988

Predicting {0, 1}-Functions on Randomly Drawn Points.
Proceedings of the First Annual Workshop on Computational Learning Theory, 1988

Reductions among prediction problems: on the difficulty of predicting automata.
Proceedings of the Third Annual Structure in Complexity Theory Conference, 1988

1987
Occam's Razor.
Inf. Process. Lett., 1987

1986
Manipulating Derivation Forests by Scheduling Techniques.
Theor. Comput. Sci., 1986

The Parallel Complexity of Scheduling with Precedence Constraints.
J. Parallel Distributed Comput., 1986

Membership for Growing Context-Sensitive Grammars is Polynomial.
J. Comput. Syst. Sci., 1986

Classifying Learnable Geometric Concepts with the Vapnik-Chervonenkis Dimension (Extended Abstract).
Proceedings of the 18th Annual ACM Symposium on Theory of Computing, 1986

Finding a Shortest Solution for the N × N Extension of the 15-PUZZLE Is Intractable.
Proceedings of the 5th National Conference on Artificial Intelligence. Philadelphia, 1986

1985
Applications of Scheduling Theory to Formal Language Theory.
Theor. Comput. Sci., 1985

Scheduling Flat Graphs.
SIAM J. Comput., 1985

1984
On the Complexity of Iterated Shuffle.
J. Comput. Syst. Sci., 1984

Scheduling Precedence Graphs of Bounded Height.
J. Algorithms, 1984

