Chaoyi Jiang

Orcid: 0009-0008-8235-9303

According to our database, Chaoyi Jiang authored at least 12 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Fast NF4 Dequantization Kernels for Large Language Model Inference.
CoRR, April, 2026

2025
Striking the Right Balance between Compute and Copy: Improving LLM Inferencing Under Speculative Decoding.
CoRR, November, 2025

DuetServe: Harmonizing Prefill and Decode for LLM Serving via Adaptive GPU Multiplexing.
CoRR, November, 2025

DELTA: Dynamic Layer-Aware Token Attention for Efficient Long-Context Reasoning.
CoRR, October, 2025

MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention.
CoRR, June, 2025

DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding.
CoRR, April, 2025

LEAF: Lightweight, Efficient, Adaptive and Flexible Embedding for Large-Scale Recommendation Models.
Proceedings of the Nineteenth ACM Conference on Recommender Systems, 2025

Efficient Processing of Dynamic Rank-Happiness Maximization Queries.
Proceedings of the Web and Big Data - 9th International Joint Conference, 2025

KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Balancing Fairness Among User Groups in Happiness Maximization Queries.
Proceedings of the Web Information Systems and Applications, 2025

2024
Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation.
CoRR, 2024

CADC: Encoding User-Item Interactions for Compressing Recommendation Model Training Data.
CoRR, 2024

