Xianzhi Yu
ORCID: 0000-0002-1497-5525
According to our database,
Xianzhi Yu authored at least 41 papers
between 2020 and 2026.
Bibliography
2026
CoRR, April, 2026
BATQuant: Outlier-resilient MXFP4 Quantization via Learnable Block-wise Optimization.
CoRR, March, 2026
Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats.
CoRR, February, 2026
What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study.
CoRR, January, 2026
Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats.
CoRR, January, 2026
Revisiting Judge Decoding from First Principles via Training-Free Distributional Divergence.
CoRR, January, 2026
2025
CoRR, December, 2025
E³-Pruner: Towards Efficient, Economical, and Effective Layer Pruning for Large Language Models.
CoRR, November, 2025
Scaling Up, Speeding Up: A Benchmark of Speculative Decoding for Efficient LLM Test-Time Scaling.
CoRR, September, 2025
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization.
CoRR, June, 2025
CoRR, May, 2025
CoRR, May, 2025
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models.
CoRR, May, 2025
TrimR: Verifier-based Training-Free Thinking Compression for Efficient Test-Time Scaling.
CoRR, May, 2025
CoRR, April, 2025
CoRR, February, 2025
CoRR, February, 2025
CoRR, February, 2025
Proceedings of the 31st IEEE International Conference on Parallel and Distributed Systems, 2025
Proceedings of the Forty-second International Conference on Machine Learning, 2025
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025
2024
ACM Trans. Archit. Code Optim., December, 2024
2023
Proceedings of the 52nd International Conference on Parallel Processing, 2023
2022
ACM Trans. Archit. Code Optim., 2022
Proceedings of the Seventh Conference on Machine Translation, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
2021
Optimizing the LINPACK Algorithm for Large-Scale PCIe-Based CPU-GPU Heterogeneous Systems.
IEEE Trans. Parallel Distributed Syst., 2021
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021
2020
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020