Ruibo Fan

ORCID: 0000-0002-5528-8633

According to our database, Ruibo Fan authored at least 12 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Dissecting Outlier Dynamics in LLM NVFP4 Pretraining.
CoRR, February, 2026

ROME: Maximizing GPU Efficiency for All-Pairs Shortest Path via Taming Fine-Grained Irregularities.
Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis.
CoRR, January, 2025

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs.
Proceedings of the Twentieth European Conference on Computer Systems, 2025

2024
TVRPCA+: Low-rank and sparse decomposition based on spectral norm and structural sparsity-inducing norm.
Signal Process., April, 2024

Benchmarking and Dissecting the Nvidia Hopper GPU Architecture.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

DTC-SpMM: Bridging the Gap in Accelerating General Sparse Matrix Multiplication with Tensor Cores.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Weighted Schatten p-norm and Laplacian scale mixture-based low-rank and sparse decomposition for foreground-background separation.
J. Electronic Imaging, 2023

Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models.
CoRR, 2023

Fast Sparse GPU Kernels for Accelerated Training of Graph Neural Networks.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

