Yuanwei Fang
Orcid: 0000-0001-5600-026X
According to our database, Yuanwei Fang authored at least 13 papers between 2012 and 2025.
Bibliography
2025
KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads.
CoRR, May, 2025
Mercury: Unlocking Multi-GPU Operator Optimization for LLMs via Remote Memory Scheduling.
Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles, 2025
KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads.
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025
2023
CoRR, 2023
2022
184QPS/W 64Mb/mm² 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022
Hyperscale FPGA-as-a-service architecture for large-scale distributed graph neural network.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
2019
Proc. VLDB Endow., 2019
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019
2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
2016
Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, 2016
2015
10x10: A Case Study in Highly-Programmable and Energy-Efficient Heterogeneous Federated Architecture.
SIGARCH Comput. Archit. News, 2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
2012
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2012