Kai Lu

ORCID: 0000-0002-7757-4083

Affiliations:
  • Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, China


According to our database, Kai Lu authored at least 27 papers between 2020 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
ScoutAttention: Efficient KV Cache Offloading via Layer-Ahead CPU Pre-computation for LLM Inference.
CoRR, March, 2026

CEMU: Enabling Full-System Emulation of Computational Storage Beyond Hardware Limits.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
DFlush: DPU-Offloaded Flush for Disaggregated LSM-based Key-Value Stores.
Proc. ACM Manag. Data, June, 2025

NStore: A High-Performance NUMA-Aware Key-Value Store for Hybrid Memory.
IEEE Trans. Computers, March, 2025

DShuffle: DPU-Optimized Shuffle Framework for Large-scale Data Processing.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

BAGNet: A Boundary-Aware Graph Attention Network for 3D Point Cloud Semantic Segmentation.
Proceedings of the International Joint Conference on Neural Networks, 2025

MADLLM: Multivariate Anomaly Detection via Pre-trained LLMs.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

RUNA: Object-Level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
PeakFS: An Ultra-High Performance Parallel File System via Computing-Network-Storage Co-Optimization for HPC Applications.
IEEE Trans. Parallel Distributed Syst., October, 2024

Scythe: A Low-latency RDMA-enabled Distributed Transaction System for Disaggregated Memory.
ACM Trans. Archit. Code Optim., September, 2024

D²Comp: Efficient Offload of LSM-tree Compaction with Data Processing Units on Disaggregated Storage.
ACM Trans. Archit. Code Optim., September, 2024

WIPE: A Write-Optimized Learned Index for Persistent Memory.
ACM Trans. Archit. Code Optim., June, 2024

A Contract-aware and Cost-effective LSM Store for Cloud Storage with Low Latency Spikes.
ACM Trans. Storage, May, 2024

Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL.
ACM Trans. Archit. Code Optim., March, 2024

SepHash: A Write-Optimized Hash Index On Disaggregated Memory via Separate Segment Structure.
Proc. VLDB Endow., January, 2024

Hammer: Towards Efficient Hot-Cold Data Identification via Online Learning.
CoRR, 2024

TrickleKV: A High-Performance Key-Value Store on Disaggregated Storage With Low Network Traffic.
IEEE Access, 2024

Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

2023
DComp: Efficient Offload of LSM-tree Compaction with Data Processing Units.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Shoggoth: Towards Efficient Edge-Cloud Collaborative Real-Time Video Inference via Adaptive Online Learning.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
TridentKV: A Read-Optimized LSM-Tree Based KV Store via Adaptive Indexing and Space-Efficient Partitioning.
IEEE Trans. Parallel Distributed Syst., 2022

RLRP: High-Efficient Data Placement with Reinforcement Learning for Modern Distributed Storage Systems.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

ADSTS: Automatic Distributed Storage Tuning System Using Deep Reinforcement Learning.
Proceedings of the 51st International Conference on Parallel Processing, 2022

2020
RangeKV: An Efficient Key-Value Store Based on Hybrid DRAM-NVM-SSD Storage Structure.
IEEE Access, 2020

Disperse Access Considered Energy Inefficiency in Intel Optane DC Persistent Memory Servers.
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020

