Weiming Hu

Orcid: 0009-0003-5115-0498

Affiliations:
  • Shanghai Jiao Tong University, Shanghai, China


According to our database, Weiming Hu authored at least 8 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization.
CoRR, January, 2026

M²XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
eLLM: Elastic Memory Management Framework for Efficient LLM Serving.
CoRR, June, 2025

Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

2024
vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving.
CoRR, 2024

2023
OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
Cache-locality Based Adaptive Warp Scheduling for Neural Network Acceleration on GPGPUs.
Proceedings of the 35th IEEE International System-on-Chip Conference, 2022

