Zaifeng Pan

ORCID: 0000-0002-6759-2616

According to our database, Zaifeng Pan authored at least 24 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interfaces.

CoRR, May, 2026

ChipMATE: Multi-Agent Training via Reinforcement Learning for Enhanced RTL Generation.

CoRR, May, 2026

TLX: Hardware-Native, Evolvable MIMW GPU Compiler for Large-scale Production Environments.

Nicholas J. Riasanovsky

CoRR, May, 2026

FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration.

CoRR, May, 2026

AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving.

Zhongkai Yu

Haotian Ye

Chenyang Zhou

Ohm Rishabh Venkatachalam

CoRR, April, 2026

JigsawRL: Assembling RL Pipelines for Efficient LLM Post-Training.

CoRR, April, 2026

Pancake: Hierarchical Memory System for Multi-Agent LLM Serving.

CoRR, February, 2026

ScaleSim: Serving Large-Scale Multi-Agent Simulation with Invocation Distance-Based Memory Management.

CoRR, January, 2026

ChipBench: A Next-Step Benchmark for Evaluating LLM Performance in AI-Aided Chip Design.

CoRR, January, 2026

2025

HedraRAG: Coordinating LLM Generation and Database Retrieval in Heterogeneous RAG Serving.

CoRR, July, 2025

PluS: Highly Efficient and Expandable ML Compiler with Pluggable Graph Schedules.

Proceedings of the 2025 USENIX Annual Technical Conference, 2025

HedraRAG: Co-Optimizing Generation and Retrieval for Heterogeneous RAG Workflows.

Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025

Mercury: Unlocking Multi-GPU Operator Optimization for LLMs via Remote Memory Scheduling.

Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025

WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training.

Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025

KVFlow: Efficient Prefix Caching for Accelerating LLM-Based Multi-Agent Workflows.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

FastTree: Optimizing Attention Kernel and Runtime for Tree-Structured LLM Inference.

Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

2024

Compressed data direct computing for Chinese dataset on DCU.

CCF Trans. High Perform. Comput., April, 2024

RecFlex: Enabling Feature Heterogeneity-Aware Optimization for Deep Recommendation Models with Flexible Schedules.

Proceedings of the International Conference for High Performance Computing, 2024

2023

BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach.

Proc. ACM Manag. Data, September, 2023

RECom: A Compiler Approach to Accelerating Recommendation Model Inference with Massive Embedding Columns.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

Exploring Data Analytics Without Decompression on Embedded GPU Systems.

IEEE Trans. Parallel Distributed Syst., 2022

G-SLIDE: A GPU-Based Sub-Linear Deep Learning Engine via LSH Sparsification.

IEEE Trans. Parallel Distributed Syst., 2022

2021

G-TADOC: Enabling Efficient GPU-Based Text Analytics without Decompression.

Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
