Hailin Zhang

ORCID: 0009-0000-4188-7742

Affiliations:
  • Peking University, Beijing, China


According to our database, Hailin Zhang authored at least 18 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
PQCache: Product Quantization-based KVCache for Long Context LLM Inference.
Proc. ACM Manag. Data, June, 2025

Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization.
Proc. ACM Manag. Data, June, 2025

Efficient and scalable huge embedding model training via distributed cache management.
VLDB J., May, 2025

CAFE+: Towards Compact, Adaptive, and Fast Embedding for Large-scale Online Recommendation Models.
ACM Trans. Inf. Syst., May, 2025

SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling.
CoRR, May, 2025

MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training.
Proc. ACM Manag. Data, February, 2025

2024
A Unified Framework for Mining Batch and Periodic Batch in Data Streams.
IEEE Trans. Knowl. Data Eng., November, 2024

CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models.
Proc. ACM Manag. Data, February, 2024

Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs.
CoRR, 2024

Retrieval-Augmented Generation for AI-Generated Content: A Survey.
CoRR, 2024

Enabling Parallelism Hot Switching for Efficient Training of Large Language Models.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Experimental Analysis of Large-scale Learnable Vector Storage Compression.
Proc. VLDB Endow., December, 2023

Hetu: a highly efficient automatic parallel distributed deep learning system.
Sci. China Inf. Sci., January, 2023

Model-enhanced Vector Index.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.
Proc. VLDB Endow., 2022

HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

2021
HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework.
Proc. VLDB Endow., 2021
