Philip M. Long

Affiliations:
  • Google, Mountain View, CA, USA


According to our database, Philip M. Long authored at least 124 papers between 1991 and 2024.

Bibliography

2024
Corrigendum to "Prediction, learning, uniform convergence, and scale-sensitive dimensions" [J. Comput. Syst. Sci. 56 (2) (1998) 174-190].
J. Comput. Syst. Sci., March, 2024

2023
Deep linear networks can benignly overfit when shallow ones do.
J. Mach. Learn. Res., 2023

Sharpness-Aware Minimization and the Edge of Stability.
CoRR, 2023

2022
The Perils of Being Unhinged: On the Accuracy of Classifiers Minimizing a Noise-Robust Convex Loss.
Neural Comput., 2022

The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks.
J. Mach. Learn. Res., 2022

Foolish Crowds Support Benign Overfitting.
J. Mach. Learn. Res., 2022

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima.
CoRR, 2022

2021
Superlinear Integrality Gaps for the Minimum Majority Problem.
SIAM J. Discret. Math., 2021

When Does Gradient Descent with Logistic Loss Find Interpolating Two-Layer Networks?
J. Mach. Learn. Res., 2021

Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime.
J. Mach. Learn. Res., 2021

Failures of Model-dependent Generalization Bounds for Least-norm Interpolation.
J. Mach. Learn. Res., 2021

Properties of the After Kernel.
CoRR, 2021

When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations?
Proceedings of the Conference on Learning Theory, 2021

2020
New bounds on the price of bandit feedback for mistake-bounded online multiclass learning.
Theor. Comput. Sci., 2020

Oracle lower bounds for stochastic gradient sampling algorithms.
CoRR, 2020

On the Global Convergence of Training Deep Linear ResNets.
Proceedings of the 8th International Conference on Learning Representations, 2020

Generalization bounds for deep convolutional neural networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

On the Complexity of Proper Distribution-Free Learning of Linear Classifiers.
Proceedings of the Algorithmic Learning Theory, 2020

2019
On the Effect of the Activation Function on the Distribution of Hidden Nodes in a Deep Network.
Neural Comput., 2019

Gradient Descent with Identity Initialization Efficiently Learns Positive-Definite Linear Transformations by Deep Residual Networks.
Neural Comput., 2019

Benign Overfitting in Linear Regression.
CoRR, 2019

Size-free generalization bounds for convolutional neural networks.
CoRR, 2019

Density Estimation for Shift-Invariant Multidimensional Distributions.
Proceedings of the 10th Innovations in Theoretical Computer Science Conference, 2019

The Singular Values of Convolutional Layers.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Representing smooth functions as compositions of near-identity functions with implications for deep network optimization.
CoRR, 2018

Gradient descent with identity initialization efficiently learns positive definite linear transformations.
Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Sums of Independent Random Variables with Sparse Collective Support.
Proceedings of the 59th IEEE Annual Symposium on Foundations of Computer Science, 2018

2017
Surprising properties of dropout in deep networks.
J. Mach. Learn. Res., 2017

The Power of Localization for Efficiently Learning Linear Separators with Noise.
J. ACM, 2017

How to select a winner in evolutionary optimization?
Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

2016
Dropout Versus Weight Decay for Deep Networks.
CoRR, 2016

2015
On the inductive bias of dropout.
J. Mach. Learn. Res., 2015

Special Issue on New Theoretical Challenges in Machine Learning.
Algorithmica, 2015

2014
On the Weight of Halfspaces over Hamming Balls.
SIAM J. Discret. Math., 2014

Benchmarking large-scale Fine-Grained Categorization.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

2013
Algorithms and hardness results for parallel large margin learning.
J. Mach. Learn. Res., 2013

The Power of Localization for Efficiently Learning Linear Separators with Malicious Noise.
CoRR, 2013

Low-weight halfspaces for sparse boolean vectors.
Proceedings of the Innovations in Theoretical Computer Science, 2013

Consistency versus Realizable H-Consistency for Multiclass Classification.
Proceedings of the 30th International Conference on Machine Learning, 2013

Active and passive learning of linear separators under log-concave distributions.
Proceedings of COLT, 2013

2012
Linear classifiers are nearly optimal when hidden variables have diverse effects.
Mach. Learn., 2012

On the necessity of irrelevant variables.
J. Mach. Learn. Res., 2012

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning.
Proceedings of COLT, 2012

2011
Learning large-margin halfspaces with more malicious noise.
Proceedings of the Advances in Neural Information Processing Systems 24, 2011

2010
Random classification noise defeats all convex potential boosters.
Mach. Learn., 2010

Restricted Boltzmann Machines are Hard to Approximately Evaluate or Simulate.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Finding Planted Partitions in Nearly Linear Time using Arrested Spectral Clustering.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009
Learning Halfspaces with Malicious Noise.
J. Mach. Learn. Res., 2009

Using the doubling dimension to analyze the generalization of learning algorithms.
J. Comput. Syst. Sci., 2009

Linear Classifiers are Nearly Optimal When Hidden Variables Have Diverse Effect.
Proceedings of COLT, 2009

Baum's Algorithm Learns Intersections of Halfspaces with Respect to Log-Concave Distributions.
Proceedings of the Approximation, 2009

2008
Preface.
Theor. Comput. Sci., 2008

Guest editors' introduction: Special issue on learning theory.
J. Comput. Syst. Sci., 2008

Adaptive Martingale Boosting.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2007
Online Learning of Multiple Tasks with a Shared Loss.
J. Mach. Learn. Res., 2007

Discriminative learning can succeed where generative learning fails.
Inf. Process. Lett., 2007

Boosting the Area under the ROC Curve.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

One-Pass Boosting.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

2006
Attribute-efficient learning of decision lists and linear threshold functions under unconcentrated distributions.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Learnability and the doubling dimension.
Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Discriminative Learning Can Succeed Where Generative Learning Fails.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

Online Multitask Learning.
Proceedings of the Learning Theory, 19th Annual Conference on Learning Theory, 2006

Editors' Introduction.
Proceedings of the Algorithmic Learning Theory, 17th International Conference, 2006

Predicting Electricity Distribution Feeder Failures Using Machine Learning Susceptibility Analysis.
Proceedings of the Proceedings, 2006

2005
Performance guarantees for hierarchical clustering.
J. Comput. Syst. Sci., 2005

Unsupervised evidence integration.
Proceedings of the Machine Learning, 2005

Martingale Boosting.
Proceedings of the Learning Theory, 18th Annual Conference on Learning Theory, 2005

2004
Efficient algorithms for learning functions with bounded variation.
Inf. Comput., 2004

Mistake Bounds for Maximum Entropy Discrimination.
Proceedings of the Advances in Neural Information Processing Systems 17, 2004

2003
Boosting and Microarray Data.
Mach. Learn., 2003

A Theoretical Analysis of Query Selection for Collaborative Filtering.
Mach. Learn., 2003

On the difficulty of approximately maximizing agreements.
J. Comput. Syst. Sci., 2003

An upper bound on the sample complexity of PAC-learning halfspaces with respect to the uniform distribution.
Inf. Process. Lett., 2003

Reinforcement Learning with Immediate Rewards and Linear Hypotheses.
Algorithmica, 2003

Boosting with Diverse Base Classifiers.
Proceedings of the Computational Learning Theory and Kernel Machines, 2003

2002
The Relaxed Online Maximum Margin Algorithm.
Mach. Learn., 2002

Minimum Majority Classification and Boosting.
Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, 2002

2001
The one-inclusion graph algorithm is near-optimal for the prediction model of learning.
IEEE Trans. Inf. Theory, 2001

Improved Bounds on the Sample Complexity of Learning.
J. Comput. Syst. Sci., 2001

Using the Pseudo-Dimension to Analyze Approximation Algorithms for Integer Programming.
Proceedings of the Algorithms and Data Structures, 7th International Workshop, 2001

On Agnostic Learning with {0, *, 1}-Valued and Real-Valued Hypotheses.
Proceedings of the Computational Learning Theory, 2001

A Theoretical Analysis of Query Selection for Collaborative Filtering.
Proceedings of the Computational Learning Theory, 2001

Agnostic Boosting.
Proceedings of the Computational Learning Theory, 2001

2000
Improved bounds about on-line learning of smooth functions of a single variable.
Theor. Comput. Sci., 2000

On-Line Learning with Linear Loss Constraints.
Inf. Comput., 2000

Apple Tasting.
Inf. Comput., 2000

Approximating Hyper-Rectangles: Learning and Pseudo-random Sets.
Electron. Colloquium Comput. Complex., 2000

Simulating Access to Hidden Information while Learning.
Electron. Colloquium Comput. Complex., 2000

1999
Text compression via alphabet re-representation.
Neural Networks, 1999

The Complexity of Learning According to Two Models of a Drifting Environment.
Mach. Learn., 1999

Structural Results About On-line Learning Models With and Without Queries.
Mach. Learn., 1999

Dictionary Selection Using Partial Matching.
Inf. Sci., 1999

Adaptive Disk Spindown via Optimal Rent-to-Buy in Probabilistic Environments.
Algorithmica, 1999

Associative Reinforcement Learning using Linear Probabilistic Concepts.
Proceedings of the Sixteenth International Conference on Machine Learning (ICML 1999), 1999

1998
Efficient cost measures for motion estimation at low bit rates.
IEEE Trans. Circuits Syst. Video Technol., 1998

PAC Learning Axis-aligned Rectangles with Respect to Product Distributions from Multiple-Instance Examples.
Mach. Learn., 1998

Prediction, Learning, Uniform Convergence, and Scale-Sensitive Dimensions.
J. Comput. Syst. Sci., 1998

Approximating Hyper-Rectangles: Learning and Pseudorandom Sets.
J. Comput. Syst. Sci., 1998

On the Sample Complexity of Learning Functions with Bounded Variation.
Proceedings of the Eleventh Annual Conference on Computational Learning Theory, 1998

1997
Guest Editor's Introduction.
Mach. Learn., 1997

On the Complexity of Learning from Drifting Distributions.
Inf. Comput., 1997

On-line Evaluation and Prediction using Linear Functions.
Proceedings of the Tenth Annual Conference on Computational Learning Theory, 1997

1996
Worst-case quadratic loss bounds for prediction using linear functions and gradient descent.
IEEE Trans. Neural Networks, 1996

Fat-Shattering and the Learnability of Real-Valued Functions.
J. Comput. Syst. Sci., 1996

Efficient Cost Measures for Motion Compensation at Low Bit Rates (Extended Abstract).
Proceedings of the 6th Data Compression Conference (DCC '96), 1996

1995
On the sample complexity of PAC learning half-spaces against the uniform distribution.
IEEE Trans. Neural Networks, 1995

On-Line Learning of Smooth Functions of a Single Variable.
Theor. Comput. Sci., 1995

On the Complexity of Function Learning.
Mach. Learn., 1995

A Generalization of Sauer's Lemma.
J. Comb. Theory, Ser. A, 1995

Characterizations of Learnability for Classes of {0, ..., n}-Valued Functions.
J. Comput. Syst. Sci., 1995

On-line Learning of Linear Functions.
Comput. Complex., 1995

Learning to Make Rent-to-Buy Decisions with Systems Applications.
Proceedings of the Machine Learning, 1995

Multiple-Dictionary Coding Using Partial Matching.
Proceedings of the IEEE Data Compression Conference, 1995

More Theorems about Scale-sensitive Dimensions and Learning.
Proceedings of the Eighth Annual Conference on Computational Learning Theory, 1995

1994
Composite Geometric Concepts and Polynomial Predictability.
Inf. Comput., September, 1994

Tracking Drifting Concepts By Minimizing Disagreements.
Mach. Learn., 1994

Halfspace Learning, Linear Programming, and Nonmalicious Distributions.
Inf. Process. Lett., 1994

Explicit Bit Minimization for Motion-Compensated Video Coding.
Proceedings of the IEEE Data Compression Conference, 1994

1993
On-Line Learning with Linear Loss Constraints.
Proceedings of the Sixth Annual ACM Conference on Computational Learning Theory, 1993

Worst-Case Quadratic Loss Bounds for a Generalization of the Widrow-Hoff Rule.
Proceedings of the Sixth Annual ACM Conference on Computational Learning Theory, 1993

1992
Apple Tasting and Nearly One-Sided Learning.
Proceedings of the 33rd Annual Symposium on Foundations of Computer Science, 1992

The Learning Complexity of Smooth Functions of a Single Variable.
Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, 1992

Characterizations of Learnability for Classes of {0, ..., n}-Valued Functions.
Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, 1992

1991
Tracking Drifting Concepts Using Random Examples.
Proceedings of the Fourth Annual Workshop on Computational Learning Theory, 1991
