Zifan He

Orcid: 0009-0008-3482-838X

According to our database, Zifan He authored at least 23 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
H²MT: Semantic Hierarchy-Aware Hierarchical Memory Transformer.
CoRR, May, 2026

Understand and Accelerate Memory Processing Pipeline for Disaggregated LLM Inference.
CoRR, March, 2026

FlexLLM: Composable HLS Library for Flexible Hybrid LLM Accelerator Design.
CoRR, January, 2026

Enabling Context-Switchable Monolithic 3D FPGA Design Using Bistable Ferroelectric Inverters.
Proceedings of the 34th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2026

LUT-LLM: Efficient Language Model Inference with Memory-based Computations on FPGAs.
Proceedings of the 34th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2026

2025
LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs.
CoRR, November, 2025

A Versatile Foundation Model for AI-enabled Mammogram Interpretation.
CoRR, September, 2025

Genome-Anchored Foundation Model Embeddings Improve Molecular Prediction from Histology Images.
CoRR, June, 2025

Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models.
CoRR, March, 2025

Monolithic 3D FPGAs Utilizing Back-End-of-Line Configuration Memories.
CoRR, January, 2025

HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Automated Design Space Exploration in High-Level Physical Synthesis.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2025

InTRRA: Inter-Task Resource-Repurposing Accelerator for Efficient Transformer Inference on FPGAs.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025

NoH: NoC Compilation in High-Level Synthesis.
Proceedings of the 33rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2025

InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs.
Proceedings of the 33rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2025

Monolithic 3D FPGA Design and Synthesis with Back-End-of-Line Configuration Memories.
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025

Dynamic-Width Speculative Beam Decoding for LLM Inference.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Dynamic-Width Speculative Beam Decoding for Efficient LLM Inference.
CoRR, 2024

Multi-Token Joint Speculative Decoding for Accelerating Large Language Model Inference.
CoRR, 2024

HMT: Hierarchical Memory Transformer for Long Context Language Processing.
CoRR, 2024

LevelST: Stream-based Accelerator for Sparse Triangular Solver.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2022
Optimization of Assisted Search Over Server-Mediated Peer-to-peer Networks.
Proceedings of the IEEE Global Communications Conference, 2022

