Youwei Zhuo

Orcid: 0000-0002-1557-2613

According to our database, Youwei Zhuo authored at least 18 papers between 2017 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Klotski: DNN Model Orchestration Framework for Dataflow Architecture Accelerators.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

2020
SympleGraph: distributed graph processing with precise loop-carried dependency guarantee.
Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020

AccPar: Tensor Partitioning for Heterogeneous Deep Learning Accelerators.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Prague: High-Performance Heterogeneity-Aware Asynchronous Decentralized Training.
Proceedings of ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee.
ACM Trans. Comput. Syst., 2019

Heterogeneity-Aware Asynchronous Decentralized Training.
CoRR, 2019

GraphQ: Scalable PIM-Based Graph Processing.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

Hop: Heterogeneity-aware Decentralized Training.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
Performance Evaluation and Optimization of HBM-Enabled GPU for Data-Intensive Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2018

CSE: Parallel Finite State Machines with Convergence Set Enumeration.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

GraphP: Reducing Communication for PIM-Based Graph Processing with Efficient Data Partition.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

GraphR: Accelerating Graph Processing Using ReRAM.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Wonderland: A Novel Abstraction-Based Out-Of-Core Graph Processing System.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices.
CoRR, 2017

CirCNN: accelerating and compressing deep neural networks using block-circulant weight matrices.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

