Bin Yu

Affiliations:
  • University of California, Berkeley, Department of Statistics, CA, USA
  • University of Wisconsin at Madison, Department of Statistics, WI, USA


According to our database, Bin Yu authored at least 124 papers between 1992 and 2023.

Bibliography

2023
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs.
CoRR, 2023

Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms.
CoRR, 2023

The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning.
CoRR, 2023

MDI+: A Flexible Random Forest-Based Feature Importance Framework.
CoRR, 2023

An Investigation into the Effects of Pre-training Data Distributions for Pathology Report Classification.
CoRR, 2023

Explaining black box text modules in natural language with language models.
CoRR, 2023

Bridging Discrete and Backpropagation: Straight-Through and Beyond.
CoRR, 2023

2022
Towards Robust Waveform-Based Acoustic Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

VeridicalFlow: a Python package for building trustworthy data science pipelines with PCS.
J. Open Source Softw., 2022

A Mixing Time Lower Bound for a Simplified Version of BART.
CoRR, 2022

Group Probability-Weighted Tree Sums for Interpretable Modeling of Heterogeneous Data.
CoRR, 2022

Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods.
CoRR, 2022

Fast Interpretable Greedy-Tree Sums (FIGS).
CoRR, 2022

Hierarchical Shrinkage: Improving the accuracy and interpretability of tree-based models.
Proceedings of the International Conference on Machine Learning, 2022

A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
imodels: a python package for fitting interpretable models.
J. Open Source Softw., 2021

Supervised line attention for tumor attribute classification from pathology reports: Higher performance with less data.
J. Biomed. Informatics, 2021

Structural Compression of Convolutional Neural Networks with Applications in Interpretability.
Frontiers Big Data, 2021

Adaptive wavelet distillation from neural networks through interpretations.
CoRR, 2021

Adaptive wavelet distillation from neural networks through interpretations.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients.
J. Mach. Learn. Res., 2020

Unique Sharp Local Minimum in L1-minimization Complete Dictionary Learning.
J. Mach. Learn. Res., 2020

Enriched Annotations for Tumor Attribute Classification from Pathology Reports with Limited Labeled Data.
CoRR, 2020

Stable discovery of interpretable subgroups via calibration in causal studies.
CoRR, 2020

Revisiting complexity and the bias-variance tradeoff.
CoRR, 2020

Instability, Computational Efficiency and Statistical Accuracy.
CoRR, 2020

Curating a COVID-19 data repository and forecasting county-level death counts in the United States.
CoRR, 2020

Transformation Importance with Applications to Cosmology.
CoRR, 2020

Veridical Data Science.
Proceedings of the WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining, 2020

Interpreting and Improving Deep-Learning Models with Reality Checks.
Proceedings of the xxAI - Beyond Explainable AI, 2020

Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge.
Proceedings of the 37th International Conference on Machine Learning, 2020

Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Log-concave sampling: Metropolis-Hastings algorithms are fast.
J. Mach. Learn. Res., 2019

Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees.
CoRR, 2019

Unique Sharp Local Minimum in $\ell_1$-minimization Complete Dictionary Learning.
CoRR, 2019

Challenges with EM in application to weakly identifiable mixture models.
CoRR, 2019

Three principles of data science: predictability, computability, and stability (PCS).
CoRR, 2019

Interpretable machine learning: definitions, methods, and applications.
CoRR, 2019

A Debiased MDI Feature Importance Measure for Random Forests.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Hierarchical interpretations for neural network predictions.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
Artificial intelligence and statistics.
Frontiers Inf. Technol. Electron. Eng., 2018

iRF: extracting interactions from random forests.
J. Open Source Softw., 2018

Fast MCMC Sampling Algorithms on Polytopes.
J. Mach. Learn. Res., 2018

Refining interaction search through signed iterative Random Forests.
CoRR, 2018

Stability and Convergence Trade-off of Iterative Optimization Algorithms.
CoRR, 2018

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs.
Proceedings of the 6th International Conference on Learning Representations, 2018

Three principles of data science: predictability, computability, and stability (PCS).
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017
Interpreting Convolutional Neural Networks Through Compression.
CoRR, 2017

Structural Compression of Convolutional Neural Networks Based on Greedy Filter Pruning.
CoRR, 2017

Three Principles of Data Science: Predictability, Stability and Computability.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

Vaidya walk: A sampling algorithm based on the volumetric barrier.
Proceedings of the 55th Annual Allerton Conference on Communication, 2017

2016
Ten Simple Rules for Effective Statistical Practice.
PLoS Comput. Biol., 2016

Formulas for Counting the Sizes of Markov Equivalence Classes of Directed Acyclic Graphs.
CoRR, 2016

DataLab: a version data management and analytics system.
Proceedings of the 2nd International Workshop on BIG Data Software Engineering, 2016

Supervised Neighborhoods for Distributed Nonparametric Regression.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

Do retinal ganglion cells project natural scenes to their principal subspace and whiten them?
Proceedings of the 50th Asilomar Conference on Signals, Systems and Computers, 2016

2015
A statistical perspective on algorithmic leveraging.
J. Mach. Learn. Res., 2015

Counting and exploring sizes of Markov equivalence classes of directed acyclic graphs.
J. Mach. Learn. Res., 2015

2014
Early stopping and non-parametric regression: an optimal data-dependent stopping rule.
J. Mach. Learn. Res., 2014

Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing.
CoRR, 2014

Concise comparative summaries (CCS) of large text corpora with a human experiment.
CoRR, 2014

Statistical guarantees for the EM algorithm: From population to sample-based analysis.
CoRR, 2014

Changepoint Analysis for Efficient Variant Calling.
Proceedings of the Research in Computational Molecular Biology, 2014

Impact of regularization on spectral clustering.
Proceedings of the 2014 Information Theory and Applications Workshop, 2014

2013
Supervised feature selection in graphs with path coding penalties and network flows.
J. Mach. Learn. Res., 2013

Error Rate Bounds in Crowdsourcing Models.
CoRR, 2013

2012
Minimax-Optimal Rates For Sparse Additive Models Over Kernel Classes Via Convex Programming.
J. Mach. Learn. Res., 2012

Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs.
CoRR, 2012

Multiple-kernel learning-based unmixing algorithm for estimation of cloud fractions with MODIS and CloudSat data.
Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, 2012

Complexity Analysis of the Lasso Regularization Path.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls.
IEEE Trans. Inf. Theory, 2011

Combined Features and Kernel Design for Noise Robust Phoneme Classification Using Support Vector Machines.
IEEE Trans. Speech Audio Process., 2011

Preface.
Math. Program., 2011

Distributed modal identification using restricted auto regressive models.
Int. J. Syst. Sci., 2011

SBA-term: Sparse Bilingual Association for Terms.
Proceedings of the 5th IEEE International Conference on Semantic Computing (ICSC 2011), 2011

Early stopping for non-parametric regression: An optimal data-dependent stopping rule.
Proceedings of the 49th Annual Allerton Conference on Communication, 2011

2010
Restricted Eigenvalue Properties for Correlated Gaussian Designs.
J. Mach. Learn. Res., 2010

Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Discovering word associations in news media via feature selection and sparse classification.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Multi-Task Sparse Discriminant Analysis (MtSDA) with Overlapping Categories.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
Information in the Nonstationary Case.
Neural Comput., 2009

Minimax rates of estimation for high-dimensional linear regression over $\ell_q$-balls.
CoRR, 2009

Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Some statistical issues in estimating information in neural spike trains.
Proceedings of the IEEE International Conference on Acoustics, 2009

Minimax rates of convergence for high-dimensional regression under ℓq-ball sparsity.
Proceedings of the 47th Annual Allerton Conference on Communication, 2009

2008
Information In The Non-Stationary Case.
CoRR, 2008

Nonparametric sparse hierarchical models describe V1 fMRI responses to natural images.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of $\ell_1$-regularized MLE.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Data spectroscopy: learning mixture models using eigenspaces of convolution operators.
Proceedings of the Machine Learning, 2008

Combined PLP - Acoustic waveform classification for robust phoneme recognition using support vector machines.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

Towards robust phoneme classification: Augmentation of PLP models with acoustic waveforms.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Embracing Statistical Challenges in the Information Technology Age.
Technometrics, 2007

Stagewise Lasso.
J. Mach. Learn. Res., 2007

2006
A fast lightweight approach to origin-destination IP traffic estimation using partial measurements.
IEEE Trans. Inf. Theory, 2006

On Model Selection Consistency of Lasso.
J. Mach. Learn. Res., 2006

Sparse Boosting.
J. Mach. Learn. Res., 2006

Approximation Lasso Methods for Language Modeling.
Proceedings of the ACL 2006, 2006

2004
Guest Editorial: Special Issue on Machine Learning Methods in Signal Processing.
IEEE Trans. Signal Process., 2004

Maximum entropy models: convergence rates and applications in dynamic system monitoring.
Proceedings of the 2004 IEEE International Symposium on Information Theory, 2004

2003
Maximum pseudo likelihood estimation in network tomography.
IEEE Trans. Signal Process., 2003

Microarray image compression: SLOCO and the effect of information loss.
Signal Process., 2003

Simultaneous Gene Clustering and Subset Selection for Sample Classification Via MDL.
Bioinform., 2003

Pseudo Likelihood Estimation in Network Tomography.
Proceedings of IEEE INFOCOM 2003, The 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, San Francisco, CA, USA, March 30, 2003

On the Convergence of Boosting Procedures.
Proceedings of the Machine Learning, 2003

2002
Perceptual audio coding using adaptive pre- and post-filters and lossless compression.
IEEE Trans. Speech Audio Process., 2002

Compression of cDNA microarray images.
Proceedings of the 2002 IEEE International Symposium on Biomedical Imaging, 2002

Compression of cDNA and inkjet microarray images.
Proceedings of the 2002 International Conference on Image Processing, 2002

2001
Lossless coding of audio signals using cascaded prediction.
Proceedings of the IEEE International Conference on Acoustics, 2001

Low Delay Perceptually Lossless Coding of Audio Signals.
Proceedings of the Data Compression Conference, 2001

2000
Iterated logarithmic expansions of the pathwise code lengths for exponential families.
IEEE Trans. Inf. Theory, 2000

Wavelet thresholding via MDL for natural images.
IEEE Trans. Inf. Theory, 2000

Wavelet thresholding for multiple noisy image copies.
IEEE Trans. Image Process., 2000

Adaptive wavelet thresholding for image denoising and compression.
IEEE Trans. Image Process., 2000

Spatially adaptive wavelet thresholding with context modeling for image denoising.
IEEE Trans. Image Process., 2000

1999
Image subband coding using context-based classification and adaptive quantization.
IEEE Trans. Image Process., 1999

Penalized discriminant analysis of in situ hyperspectral data for conifer species recognition.
IEEE Trans. Geosci. Remote. Sens., 1999

1998
The Minimum Description Length Principle in Coding and Modeling.
IEEE Trans. Inf. Theory, 1998

Multiple Copy Image Denoising via Wavelet Thresholding.
Proceedings of the 1998 IEEE International Conference on Image Processing, 1998

1997
Image Denoising via Lossy Compression and Wavelet Thresholding.
Proceedings of the 1997 International Conference on Image Processing, 1997

1996
Lower Bounds on Expected Redundancy for Nonparametric Classes.
IEEE Trans. Inf. Theory, 1996

Adaptive quantization of image subbands with efficient overhead rate selection.
Proceedings of the 1996 International Conference on Image Processing, 1996

1993
A rate of convergence result for a universal D-semifaithful code.
IEEE Trans. Inf. Theory, 1993

1992
Density estimation by stochastic complexity.
IEEE Trans. Inf. Theory, 1992

