Ziming Miao

ORCID: 0000-0001-7466-2128

According to our database, Ziming Miao authored at least 19 papers between 2018 and 2026.

Bibliography

2026
Unifying Sparse Attention with Hierarchical Memory for Scalable Long-Context LLM Serving.
CoRR, April, 2026

MetaAttention: A Unified and Performant Attention Framework across Hardware Backends.
Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

2025
rStar2-Agent: Agentic Reasoning Technical Report.
CoRR, August, 2025

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs.
CoRR, June, 2025

AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms.
CoRR, February, 2025

WaferLLM: A Wafer-Scale LLM Inference System.
CoRR, February, 2025

WaferLLM: Large Language Model Inference at Wafer Scale.
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025

MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

NeuStream: Bridging Deep Learning Serving and Stream Processing.
Proceedings of the Twentieth European Conference on Computer Systems, 2025

2024
MoE-CAP: Cost-Accuracy-Performance Benchmarking for Mixture-of-Experts Systems.
CoRR, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor.
CoRR, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024

2023
Cocktailer: Analyzing and Optimizing Dynamic Control Flow in Deep Learning.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Welder: Scheduling Deep Learning Memory Access via Tile-graph.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Model-enhanced Vector Index.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
A Neural Corpus Indexer for Document Retrieval.
CoRR, 2022

A Neural Corpus Indexer for Document Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2018
Automatic Water-Body Segmentation From High-Resolution Satellite Images via Deep Networks.
IEEE Geosci. Remote. Sens. Lett., 2018

