Wenheng Ma
ORCID: 0000-0003-2349-7286
According to our database, Wenheng Ma authored at least 4 papers between 2024 and 2026.
Bibliography
2026
CD-LLM: A Heterogeneous Multi-FPGA System for Batched Decoding of 70B+ LLMs Using a Compute-Dedicated Architecture.
ACM Trans. Reconfigurable Technol. Syst., March, 2026
2025
FMC-LLM: Enabling FPGAs for Efficient Batched Decoding of 70B+ LLMs with a Memory-Centric Streaming Architecture.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025
PARO: Hardware-Software Co-design with Pattern-aware Reorder-based Attention Quantization in Video Generation Models.
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025
2024
FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024