Ranggi Hwang

Orcid: 0000-0003-2343-586X

According to our database, Ranggi Hwang authored at least 11 papers between 2020 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
SpecMoE: A Fast and Efficient Mixture-of-Experts Inference via Self-Assisted Speculative Decoding.
CoRR, April, 2026

Exploring KV Cache Quantization in Multimodal Large Language Model Inference.
IEEE Comput. Archit. Lett., 2026

PASCAL: A Phase-Aware Scheduling Algorithm for Serving Reasoning-based Large Language Models.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2026

2025
Debunking the CUDA Myth Towards GPU-based AI Systems.
CoRR, January, 2025

Debunking the CUDA Myth Towards GPU-based AI Systems: Evaluation of the Performance and Programmability of Intel's Gaudi NPU for AI Model Serving.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

2024
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
HAMMER: Hardware-Friendly Approximate Computing for Self-Attention With Mean-Redistribution And Linearization.
IEEE Comput. Archit. Lett., 2023

GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

2022
DiVa: An Accelerator for Differentially Private Machine Learning.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

2020
Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

