Kezhao Huang

Orcid: 0009-0006-7273-0952

According to our database, Kezhao Huang authored at least 16 papers between 2020 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
ChituDiffusion: A Data-Characteristic-Aware Serving System for Diffusion Models.
Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
mTuner: Accelerating Parameter-Efficient Fine-Tuning on Multi-GPU Servers with Elastic Tensor.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

HypeReca: Distributed Heterogeneous In-Memory Embedding Database for Training Recommender Models.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

HSampler: Optimizing Multi-GPU GNN Sampling with Collision-Avoid Selection.
Proceedings of the Network and Parallel Computing, 2025

IntelliGen: Instruction-Level Auto-tuning for Tensor Program with Monotonic Memory Optimization.
Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, 2025

2024
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training.
Proc. VLDB Endow., February, 2024

PUZZLE: Efficiently Aligning Large Language Models through Light-Weight Context Switch.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024

WiseGraph: Optimizing GNN with Joint Workload Partition of Graph and Operations.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

2023
PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR.
CoRR, 2023

ReFresh: Reducing Memory Access from Exploiting Stable Historical Embeddings for Graph Neural Network Training.
CoRR, 2023

EINNET: Optimizing Tensor Programs with Derivation-Based Transformations.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

2022
OLLIE: Derivation-based Tensor Program Optimizer.
CoRR, 2022

2021
Critique of "Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility" by SCC Team From Tsinghua University.
IEEE Trans. Parallel Distributed Syst., 2021

Understanding and bridging the gaps in current GNN performance optimizations.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

2020
A Comprehensive Evaluation of RDMA-enabled Concurrency Control Protocols.
CoRR, 2020
