Wei Wu
ORCID: 0000-0002-2750-6365
Affiliations:
- Los Alamos National Laboratory, NM, USA
- University of Tennessee, Department of Electrical Engineering and Computer Science, Knoxville, TN, USA (PhD 2017)
- Purdue University Calumet, Department of Electrical and Computer Engineering, Hammond, IN, USA (until 2010)
According to our database, Wei Wu authored at least 24 papers between 2009 and 2022.
Bibliography
2022
IEEE Trans. Parallel Distributed Syst., 2022
Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022
2021
O3BNN-R: An Out-of-Order Architecture for High-Performance and Regularized BNN Inference.
IEEE Trans. Parallel Distributed Syst., 2021
2020
Proceedings of the International Conference for High Performance Computing, 2020
FFT-based Gradient Sparsification for the Distributed Training of Deep Neural Networks.
Proceedings of the HPDC '20: The 29th International Symposium on High-Performance Parallel and Distributed Computing, 2020
Proceedings of the IEEE International Conference on Cluster Computing, 2020
Proceedings of the IEEE International Conference on Cluster Computing, 2020
2019
O3BNN: an out-of-order architecture for high-performance binarized neural network inference with fine-grained pruning.
Proceedings of the ACM International Conference on Supercomputing, 2019
2018
SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks.
CoRR, 2018
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018
2017
Proceedings of the Thematic Workshops of ACM Multimedia 2017, Mountain View, CA, USA, October 23, 2017
A Web-based Visual Analytic Framework for Understanding Large-scale Environmental Models: A Use Case for The Community Land Model.
Proceedings of the International Conference on Computational Science, 2017
Compiler technologies for understanding legacy scientific code: A case study on an ACME land module.
Proceedings of the International Conference on Computational Science, 2017
2016
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016
BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing.
Proceedings of the 2016 International Conference on Supercomputing, 2016
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016
2015
BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing.
CoRR, 2015
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015
A Scientific Function Test Framework for Modular Environmental Model Development: Application to the Community Land Model.
Proceedings of the 1st IEEE/ACM International Workshop on Software Engineering for High Performance Computing in Science, 2015
2013
Comput. Methods Programs Biomed., 2013
2010
Proceedings of the 8th IEEE International Conference on Control and Automation, 2010
2009
Computer Based Simulation of Chromosome Abnormality.
Proceedings of the International Conference on Bioinformatics & Computational Biology, 2009