Mingcong Han

Orcid: 0009-0008-1536-7485

According to our database, Mingcong Han authored at least 15 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
TensorHub: Scalable and Elastic Weight Transfer for LLM RL Training.
CoRR, April, 2026

Real-time, Work-conserving GPU Scheduling for Concurrent DNN Inference.
ACM Trans. Comput. Syst., February, 2026

DistRS: Disaggregated Reward Service for RLVR with Batch-Level Constraint.
Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026

Accurate and Ultra-Fast Launch-Time Validation of Idempotency for GPU Kernels.
Proceedings of the 21st European Conference on Computer Systems, 2026

2025
Fast LLM Post-training via Decoupled and Fastest-of-N Speculation.
CoRR, November, 2025

Holistic Heterogeneous Scheduling for Autonomous Applications using Fine-grained, Multi-XPU Abstraction.
CoRR, August, 2025

Colocating ML Inference and Training with Fast GPU Memory Handover.
Proceedings of the 2025 USENIX Annual Technical Conference, 2025

PhoenixOS: Concurrent OS-level GPU Checkpoint and Restore with Validated Speculation.
Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025

XSched: Preemptive Scheduling for Diverse XPUs.
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025

2024
Microsecond-scale Dynamic Validation of Idempotency for GPU Kernels.
CoRR, 2024

PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation.
CoRR, 2024

2023
DArray: A High Performance RDMA-Based Distributed Array.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022
Microsecond-scale Preemption for Concurrent GPU-accelerated DNN Inferences.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

FNotify: A Low-Latency and Scalable Publish/Subscribe System using RDMA.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

2021
ShadowVM: accelerating data plane for data analytics with bare metal CPUs and GPUs.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

