Dayou Du

According to our database, Dayou Du authored at least 12 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems.
CoRR, May, 2025

BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache.
CoRR, March, 2025

Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis.
CoRR, January, 2025

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs.
CoRR, 2024

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs.
CoRR, 2024

Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey.
CoRR, 2024

Resiliency at Scale: Managing Google's TPUv4 Machine Learning Supercomputer.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

Benchmarking and Dissecting the Nvidia Hopper GPU Architecture.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

AFPQ: Asymmetric Floating Point Quantization for LLMs.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2018
Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-Based Applications on Mobile SoCs.
ACM Trans. Embed. Comput. Syst., 2018

