Beidi Chen

According to our database, Beidi Chen authored at least 48 papers between 2016 and 2024.

Bibliography

2024
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding.
CoRR, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection.
CoRR, 2024

LLM Inference Unveiled: Survey and Roofline Model Insights.
CoRR, 2024

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding.
CoRR, 2024

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference.
CoRR, 2024

Learn To be Efficient: Build Structured Sparsity in Large Language Models.
CoRR, 2024

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache.
CoRR, 2024

2023
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment.
CoRR, 2023

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention.
CoRR, 2023

Efficient Streaming Language Models with Attention Sinks.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
CoRR, 2023

InRank: Incremental Low-Rank Learning.
CoRR, 2023

Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt.
CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.
CoRR, 2023

Modeling Scattering Coefficients using Self-Attentive Complex Polynomials with Image-based Representation.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks.
Proceedings of the International Conference on Machine Learning, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time.
Proceedings of the International Conference on Machine Learning, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.
Proceedings of the International Conference on Machine Learning, 2023

Fast Algorithms for a New Relaxation of Optimal Transport.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees.
CoRR, 2022

Decentralized Training of Foundation Models in Heterogeneous Environments.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

HALOS: Hashing Large Output Space for Cheap Inference.
Proceedings of Machine Learning and Systems 2022, 2022

Monarch: Expressive Structured Matrices for Efficient and Accurate Training.
Proceedings of the International Conference on Machine Learning, 2022

Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Satellite Images and Deep Learning to Identify Discrepancy in Mailing Addresses with Applications to Census 2020 in Houston.
CoRR, 2021

Scatterbrain: Unifying Sparse and Low-rank Attention Approximation.
CoRR, 2021

Locality Sensitive Teaching.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Scatterbrain: Unifying Sparse and Low-rank Attention.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Tale of Two Efficient and Informative Negative Sampling Distributions.
Proceedings of the 38th International Conference on Machine Learning, 2021

SOLAR: Sparse Orthogonal Learned and Random Embeddings.
Proceedings of the 9th International Conference on Learning Representations, 2021

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
A Constant-time Adaptive Negative Sampling.
CoRR, 2020

Climbing the WOL: Training for Cheaper Inference.
CoRR, 2020

SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems.
Proceedings of Machine Learning and Systems 2020, 2020

Angular Visual Hardness.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Sub-Linear Privacy-Preserving Near-Neighbor Search.
IACR Cryptol. ePrint Arch., 2019

LSH-Sampling Breaks the Computation Chicken-and-Egg Loop in Adaptive Stochastic Gradient Estimation.
CoRR, 2019

SLIDE: In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems.
CoRR, 2019

Fast and Accurate Stochastic Gradient Estimation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018
Densified Winner Take All (WTA) Hashing for Sparse Datasets.
Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

LSH-Sampling Breaks the Computational Chicken-and-Egg Loop in Adaptive Stochastic Gradient Estimation.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
Unique Entity Estimation with Application to the Syrian Conflict.
CoRR, 2017

2016
Sub-linear Privacy-preserving Search with Untrusted Server and Semi-honest Parties.
CoRR, 2016

Revisiting Winner Take All (WTA) Hashing for Sparse Datasets.
CoRR, 2016

