Weiming Hu
Orcid: 0009-0003-5115-0498
Affiliations:
- Shanghai Jiao Tong University, Shanghai, China
According to our database, Weiming Hu authored at least 8 papers between 2022 and 2026.
Links
Online presence:
- on orcid.org
Bibliography
2026
M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization.
CoRR, January, 2026
M²XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026
2025
CoRR, June, 2025
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025
2024
2023
OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
2022
Cache-locality Based Adaptive Warp Scheduling for Neural Network Acceleration on GPGPUs.
Proceedings of the 35th IEEE International System-on-Chip Conference, 2022