Chengquan Jiang

ORCID: 0009-0004-9356-6034

According to our database, Chengquan Jiang authored at least 11 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Attn-QAT: 4-Bit Attention With Quantization-Aware Training.
CoRR, March, 2026

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation.
CoRR, February, 2026

SwiftSpec: Disaggregated Speculative Decoding and Fused Kernels for Low-Latency LLM Inference.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving.
CoRR, September, 2025

SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding.
CoRR, June, 2025

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs.
CoRR, April, 2025

LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving.
Proceedings of the International Conference for High Performance Computing, 2025

COMET: Fine-grained Computation-communication Overlapping for Mixture-of-Experts.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

2024
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion.
CoRR, 2024

2023
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

2021
CoSam: An Efficient Collaborative Adaptive Sampler for Recommendation.
ACM Trans. Inf. Syst., 2021
