Daniel Hsu

CoRR, October, 2025

Fast attention mechanisms: a tale of parallelism.

CoRR, September, 2025

Survey on Algorithms for multi-index models.

Joan Bruna

CoRR, April, 2025

Efficient Estimation of the Central Mean Subspace via Smoothed Gradient Outer Products.

SIAM J. Math. Data Sci., 2025

Learning Compositional Functions with Transformers from Easy-to-Hard Data.

Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence.

Berfin Simsek

Amire Bendjeddou

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2024

Interactive Machine Teaching by Labeling Rules and Instances.

Trans. Assoc. Comput. Linguistics, 2024

One-layer transformers fail to solve the induction heads task.

CoRR, 2024

Seasonality Patterns in 311-Reported Foodborne Illness Cases and Machine Learning-Identified Indications of Foodborne Illnesses from Yelp Reviews, New York City, 2022-2023.

CoRR, 2024

Polynomial time auditing of statistical subgroup fairness for Gaussian data.

Jizhou Huang

Brendan Juba

CoRR, 2024

Group-wise oracle-efficient algorithms for online multi-group learning.

Samuel Deng

Jingwen Liu

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Transformers, parallel computation, and logarithmic depth.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Multi-group Learning for Hierarchical Groups.

Samuel Deng

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Distribution-Specific Auditing for Subgroup Fairness.

Jizhou Huang

Brendan Juba

Proceedings of the 5th Symposium on Foundations of Responsible Computing, 2024

On the sample complexity of parameter estimation in logistic regression with normal design.

Arya Mazumdar

Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023

On the sample complexity of estimation in logistic regression.

Arya Mazumdar

CoRR, 2023

Representational Strengths and Limitations of Transformers.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Intrinsic dimensionality and generalization properties of the R-norm inductive bias.

Navid Ardeshir

Clayton Hendrick Sanford

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022

Unbiased estimators for random design regression.

J. Mach. Learn. Res., 2022

Statistical-Computational Trade-offs in Tensor PCA and Related Problems via Communication Complexity.

Rishabh Dudeja

CoRR, 2022

Masked prediction tasks: a parameter identifiability view.

CoRR, 2022

Masked Prediction: A Parameter Identifiability View.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Simple and near-optimal algorithms for hidden stratification and multi-group learning.

Christopher J. Tosh

Proceedings of the International Conference on Machine Learning, 2022

Near-Optimal Statistical Query Lower Bounds for Agnostically Learning Intersections of Halfspaces with Gaussian Marginals.

Emmanouil-Vasileios Vlatakis-Gkaragkounis

Clayton Hendrick Sanford

Rocco A. Servedio

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Learning Tensor Representations for Meta-Learning.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Consistent Risk Estimation in Moderately High-Dimensional Linear Regression.

IEEE Trans. Inf. Theory, 2021

Contrastive Estimation Reveals Topic Posterior Information to Linear Models.

Akshay Krishnamurthy

J. Mach. Learn. Res., 2021

Classification vs regression in overparameterized regimes: Does the loss function matter?

J. Mach. Learn. Res., 2021

Statistical Query Lower Bounds for Tensor PCA.

Rishabh Dudeja

J. Mach. Learn. Res., 2021

Bayesian decision-making under misspecified priors with applications to meta-learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Support vector machines and linear regression coincide with very high-dimensional features.

Navid Ardeshir

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Generalization bounds via distillation.

Proceedings of the 9th International Conference on Learning Representations, 2021

On the Approximation Power of Two-Layer Networks of Random ReLUs.

Emmanouil V. Vlatakis-Gkaragkounis

Rocco A. Servedio

Proceedings of the Conference on Learning Theory, 2021

Contrastive learning, multi-view redundancy, and linear models.

Akshay Krishnamurthy

Proceedings of the Algorithmic Learning Theory, 2021

On the proliferation of support vectors in high dimensions.

Vidya Muthukumar

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Quantifying the Effects of COVID-19 on Restaurant Reviews.

Ivy Cao

Zizhou Liu

Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, 2021

2020

Two Models of Double Descent for Weak Features.

Mikhail Belkin

SIAM J. Math. Data Sci., 2020

Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics.

Proceedings of the EC '20: The 21st ACM Conference on Economics and Computation, 2020

Ensuring Fairness Beyond the Training Data.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Cross-Lingual Text Classification with Minimal Resources by Transferring a Sparse Teacher.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Diameter-based Interactive Structure Discovery.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Detecting Foodborne Illness Complaints in Multiple Languages Using English Annotations Only.

Ziyi Liu

Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, 2020

2019

Kernel Approximation Methods for Speech Recognition.

Avner May

Alireza Bagheri Garakani

J. Mach. Learn. Res., 2019

A cryptographic approach to black box adversarial machine learning.

Kevin Shi

Allison Bishop

CoRR, 2019

Diameter-based Interactive Structure Search.

CoRR, 2019

How many variables should be entered in a principal component regression equation?

CoRR, 2019

Certified Robustness to Adversarial Examples with Differential Privacy.

Proceedings of the 2019 IEEE Symposium on Security and Privacy, 2019

Privacy accounting and quality control in the sage differentially private ML platform.

Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

On the number of variables to use in principal component regression.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Teaching a black-box learner.

Proceedings of the 36th International Conference on Machine Learning, 2019

A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization.

Proceedings of the 36th International Conference on Machine Learning, 2019

Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Attribute-efficient learning of monomials over highly-correlated variables.

Proceedings of the Algorithmic Learning Theory, 2019

Correcting the bias in least squares regression with volume-rescaled sampling.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Weakly Supervised Attention Networks for Fine-Grained Opinion Mining and Public Health.

Proceedings of the 5th Workshop on Noisy User-generated Text, 2019

2018

Discovering foodborne illness in online restaurant reviews.

J. Am. Medical Informatics Assoc., 2018

Reconciling modern machine learning and the bias-variance trade-off.

CoRR, 2018

Tail bounds for volume sampled linear regression.

José Manuel Zorrilla Matilla

CoRR, 2018

On the Connection between Differential Privacy and Adversarial Robustness in Machine Learning.

CoRR, 2018

Non-Gaussian information from weak lensing data via deep learning.

Arushi Gupta

Zoltán Haiman

CoRR, 2018

Benefits of over-parameterization with EM.

Arian Maleki

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Leveraged volume sampling for linear regression.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate.

Mikhail Belkin

Partha Mitra

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning Single-Index Models in Gaussian Space.

Rishabh Dudeja

Proceedings of the Conference On Learning Theory, 2018

2017

Greedy Approaches to Symmetric Orthogonal Tensor Decomposition.

Cun Mu

Donald Goldfarb

SIAM J. Matrix Anal. Appl., 2017

Mixing time estimation in reversible Markov chains from a single sample path.

CoRR, 2017

Coding with asymmetric prior knowledge.

CoRR, 2017

Linear regression without correspondence.

Kevin Shi

Xiaorui Sun

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

FairTest: Discovering Unwarranted Associations in Data-Driven Applications.

Proceedings of the 2017 IEEE European Symposium on Security and Privacy, 2017

Correspondence retrieval.

Proceedings of the 30th Conference on Learning Theory, 2017

Parameter identification in Markov chain choice models.

Arushi Gupta

Proceedings of the International Conference on Algorithmic Learning Theory, 2017

2016

Unsupervised Part-Of-Speech Tagging with Anchor Hidden Markov Models.

Karl Stratos

Michael Collins

Trans. Assoc. Comput. Linguistics, 2016

Loss Minimization and Parameter Estimation with Heavy Tails.

Sivan Sabato

J. Mach. Learn. Res., 2016

AI's 10 to Watch.

Shivaram Kalyanakrishnan

IEEE Intell. Syst., 2016

Greedy bi-criteria approximations for k-medians and k-means.

CoRR, 2016

Global Analysis of Expectation Maximization for Mixtures of Two Gaussians.

Arian Maleki

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Search Improves Label for Active Learning.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Compact kernel models for acoustic modeling via random feature selection.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Successive Rank-One Approximations for Nearly Orthogonally Decomposable Symmetric Tensors.

Cun Mu

Donald Goldfarb

SIAM J. Matrix Anal. Appl., 2015

Learning sparse low-threshold linear classifiers.

J. Mach. Learn. Res., 2015

Discovering Unwarranted Associations in Data-Driven Applications with the FairTest Testing Toolkit.

CoRR, 2015

Method of moments learning for left-to-right Hidden Markov models.

Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Efficient and Parsimonious Agnostic Active Learning.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path.

Aryeh Kontorovich

Csaba Szepesvári

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence.

Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015

Tensor Decompositions for Learning Latent Variable Models (A Survey for ALT).

Proceedings of the Algorithmic Learning Theory - 26th International Conference, 2015

Model-based Word Embeddings from Decompositions of Count Matrices.

Karl Stratos

Michael Collins

Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014

Tensor decompositions for learning latent variable models.

J. Mach. Learn. Res., 2014

A tensor approach to learning mixed membership community models.

Rong Ge

J. Mach. Learn. Res., 2014

Weighted sampling of outer products.

CoRR, 2014

Scalable Nonlinear Learning with Adaptive Polynomial Expansions.

CoRR, 2014

A Spectral Algorithm for Learning Class-Based n-gram Models of Natural Language.

Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

The Large Margin Mechanism for Differentially Private Maximization.

Shuang Song

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Scalable Non-linear Learning with Adaptive Polynomial Expansions.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Heavy-tailed regression with a generalized median-of-means.

Sivan Sabato

Proceedings of the 31st International Conference on Machine Learning, 2014

Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits.

Proceedings of the 31st International Conference on Machine Learning, 2014

2013

Approximate loss minimization with heavy tails.

Sivan Sabato

CoRR, 2013

Contrastive Learning Using Spectral Methods.

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity.

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Learning mixtures of spherical gaussians: moment methods and spectral decompositions.

Proceedings of the Innovations in Theoretical Computer Science, 2013

Learning Linear Bayesian Networks with Latent Variables.

Adel Javanmard

Proceedings of the 30th International Conference on Machine Learning, 2013

A Tensor Spectral Approach to Learning Mixed Membership Community Models.

Rong Ge

Proceedings of the COLT 2013, 2013

2012

Random Design Analysis of Ridge Regression.

Proceedings of the COLT 2012, 2012

A Method of Moments for Mixture Models and Hidden Markov Models.

Proceedings of the COLT 2012, 2012

Analysis of a randomized approximation scheme for matrix multiplication.

CoRR, 2012

Learning Gaussian Mixture Models: Moment Methods and Spectral Decompositions.

CoRR, 2012

Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation.

CoRR, 2012

Learning High-Dimensional Mixtures of Graphical Models.

CoRR, 2012

Identifiability and Unmixing of Latent Parse Trees.

Percy Liang

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Learning Mixtures of Tree Graphical Models.

Furong Huang

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

A Spectral Algorithm for Latent Dirichlet Allocation.

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Convergence Rates for Differentially Private Statistical Estimation.

Proceedings of the 29th International Conference on Machine Learning, 2012

2011

Robust Matrix Decomposition With Sparse Corruptions.

IEEE Trans. Inf. Theory, 2011

Sample Complexity Bounds for Differentially Private Learning.

Proceedings of the COLT 2011, 2011

A tail inequality for quadratic forms of subgaussian random vectors.

CoRR, 2011

An Analysis of Random Design Linear Regression.

CoRR, 2011

Parallel Online Learning.

CoRR, 2011

Dimension-free tail inequalities for sums of random matrices.

CoRR, 2011

Efficient Optimal Learning for Contextual Bandits.

Proceedings of the UAI 2011, 2011

Spectral Methods for Learning Multivariate Latent Tree Structure.

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Stochastic convex optimization with bandit feedback.

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010

Algorithms for active learning.

PhD thesis, 2010

Robust Matrix Decomposition with Outliers.

CoRR, 2010

An Online Learning-based Framework for Tracking.

Proceedings of the UAI 2010, 2010

Agnostic Active Learning Without Constraints.

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

2009

Tracking using explanation-based modeling.

CoRR, 2009

Multi-Label Prediction via Compressed Sensing.

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

A Parameter-free Hedging Algorithm.

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

A Spectral Algorithm for Learning Hidden Markov Models.

Proceedings of the COLT 2009, 2009

2008

A new Hedging algorithm and its application to inferring latent random variables.
