Rui Ma

ORCID: 0000-0001-9611-5870

Affiliations:
  • Microsoft Research, Beijing, China
  • University of Texas at Austin, TX, USA (former)
  • Mangoboost, Bellevue, WA, USA (former)


According to our database, Rui Ma authored at least 15 papers between 2017 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Understand and Accelerate Memory Processing Pipeline for Disaggregated LLM Inference.
CoRR, March, 2026

LUMINA: LLM-Guided GPU Architecture Exploration via Bottleneck Analysis.
CoRR, March, 2026

SmartNIC-Enabled Live Migration for Storage-Optimized VMs with PYROCUMULUS.
Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026

LUT-LLM: Efficient Language Model Inference with Memory-based Computations on FPGAs.
Proceedings of the 34th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2026

2025
LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs.
CoRR, November, 2025

TENET: An Efficient Sparsity-Aware LUT-Centric Architecture for Ternary LLM Inference On Edge.
CoRR, September, 2025

Beyond the Time Domain: Recent Advances on Frequency Transforms in Time Series Analysis.
CoRR, April, 2025

Miniature: Fast AI Supercomputer Networks Simulation on FPGAs.
Proceedings of the 9th Asia-Pacific Workshop on Networking, 2025

2024
Primate: A Framework to Automatically Generate Soft Processors for Network Applications.
IEEE Comput. Archit. Lett., 2024

2022
FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems.
CoRR, 2022

FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems.
IEEE Comput. Archit. Lett., 2022

2021
DO-GPU: Domain Optimizable Soft GPUs.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

2020
Beyond Peak Performance: Comparing the Real Performance of AI-Optimized FPGAs and GPUs.
Proceedings of the International Conference on Field-Programmable Technology, 2020

2019
Specializing FGPU for Persistent Deep Learning.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

2017
OCEAN: An on-chip incremental-learning enhanced processor with gated recurrent neural network accelerators.
Proceedings of the 43rd IEEE European Solid State Circuits Conference, 2017

