Donglin Zhuang

Orcid: 0000-0003-3355-407X

According to our database¹, Donglin Zhuang authored at least 16 papers between 2020 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization.

CoRR, May, 2026

Introspective Diffusion Language Models.

CoRR, April, 2026

2025

Kitty: Accurate and Efficient 2-bit KV Cache Quantization with Dynamic Channel-wise Precision Boost.

CoRR, November, 2025

2024

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design.

CoRR, 2024

Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs.

Proceedings of the 2024 USENIX Annual Technical Conference, 2024

MonoNN: Enabling a New Monolithic Optimization Space for Neural Network Inference Tasks on Modern GPU-Centric Architectures.

Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023

Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.

Proc. VLDB Endow., 2023

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.

CoRR, 2023

2022

DynamAP: Architectural Support for Dynamic Graph Traversal on the Automata Processor.

ACM Trans. Archit. Code Optim., 2022

Randomness in Neural Network Training: Characterizing the Impact of Tooling.

Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Bring orders into uncertainty: enabling efficient uncertain graph processing via novel path sampling on multi-accelerator systems.

Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

2021

Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design.

IEEE Trans. Computers, 2021

An efficient uncertain graph processing framework for heterogeneous architectures.

Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

ClickTrain: efficient and accurate end-to-end deep learning training via fine-grained architecture-preserving pruning.

Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

2020

An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning.

CoRR, 2020
