Yujun Lin

ORCID: 0000-0001-6314-1722

Affiliations:
  • Massachusetts Institute of Technology, Cambridge, USA
  • Tsinghua University, Department of Electronic Engineering, Beijing, China (former)


According to our database, Yujun Lin authored at least 35 papers between 2016 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation.
CoRR, June, 2025

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation.
CoRR, May, 2025

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention.
CoRR, February, 2025

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity.
CoRR, February, 2025

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer.
CoRR, January, 2025

SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LEGO: Spatial Accelerator Generation and Optimization for Tensor Applications.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

2024
Algorithm-System-Hardware Co-Design for Efficient 3D Deep Learning.
World Sci. Annu. Rev. Artif. Intell., 2024

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models.
CoRR, 2024

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers.
CoRR, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving.
CoRR, 2024

2022
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications.
ACM Trans. Design Autom. Electr. Syst., 2022

TorchSparse: Efficient Point Cloud Inference Engine.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

PointAcc: Efficient Point Cloud Accelerator.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

NAAS: Neural Accelerator Architecture Search.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
Long Live TIME: Improving Lifetime and Security for NVM-Based Training-in-Memory Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

AutoML for Architecting Efficient and Specialized Neural Networks.
IEEE Micro, 2020

Hardware-Centric AutoML for Mixed-Precision Quantization.
Int. J. Comput. Vis., 2020

MCUNet: Tiny Deep Learning on IoT Devices.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Lite Transformer with Long-Short Range Attention.
Proceedings of the 8th International Conference on Learning Representations, 2020

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution.
Proceedings of the Computer Vision - ECCV 2020, 2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Design Automation for Efficient Deep Learning Computing.
CoRR, 2019

Point-Voxel CNN for Efficient 3D Deep Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Fine-Grained Sparse Accelerator for Multi-Precision DNN.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

HAQ: Hardware-Aware Automated Quantization With Mixed Precision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
HAQ: Hardware-Aware Automated Quantization.
CoRR, 2018

Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training.
Proceedings of the 6th International Conference on Learning Representations, 2018

Long live TIME: improving lifetime for training-in-memory engines by structured gradient sparsification.
Proceedings of the 55th Annual Design Automation Conference, 2018

2017
On the Understanding of Interdependency of Mobile App Usage.
Proceedings of the 14th IEEE International Conference on Mobile Ad Hoc and Sensor Systems, 2017

2016
Big Data Driven Mobile Traffic Understanding and Forecasting: A Time Series Approach.
IEEE Trans. Serv. Comput., 2016

