Wei Wu

ORCID: 0000-0002-2750-6365

Affiliations:
  • Los Alamos National Laboratory, NM, USA
  • University of Tennessee, Department of Electrical Engineering and Computer Science, Knoxville, TN, USA (PhD 2017)
  • Purdue University Calumet, Department of Electrical and Computer Engineering, Hammond, IN, USA (until 2010)


According to our database, Wei Wu authored at least 24 papers between 2009 and 2022.

Bibliography

2022
Evaluating Data Redistribution in PaRSEC.
IEEE Trans. Parallel Distributed Syst., 2022

Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

2021
O3BNN-R: An Out-of-Order Architecture for High-Performance and Regularized BNN Inference.
IEEE Trans. Parallel Distributed Syst., 2021

2020
Task Bench: a parameterized benchmark for evaluating parallel runtime performance.
Proceedings of the International Conference for High Performance Computing, 2020

FFT-based Gradient Sparsification for the Distributed Training of Deep Neural Networks.
Proceedings of the HPDC '20: The 29th International Symposium on High-Performance Parallel and Distributed Computing, 2020

HAN: a Hierarchical AutotuNed Collective Communication Framework.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

Flexible Data Redistribution in a Task-Based Runtime System.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

2019
O3BNN: an out-of-order architecture for high-performance binarized neural network inference with fine-grained pruning.
Proceedings of the ACM International Conference on Supercomputing, 2019

2018
SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks.
CoRR, 2018

Superneurons: dynamic GPU memory management for training deep neural networks.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

ADAPT: an event-based adaptive collective communication framework.
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018

2017
Efficient Communications in Training Large Scale Neural Networks.
Proceedings of the Thematic Workshops of ACM Multimedia 2017, Mountain View, CA, USA, October 23, 2017

A Web-based Visual Analytic Framework for Understanding Large-scale Environmental Models: A Use Case for The Community Land Model.
Proceedings of the International Conference on Computational Science, 2017

Compiler technologies for understanding legacy scientific code: A case study on an ACME land module.
Proceedings of the International Conference on Computational Science, 2017

2016
Implementing directed acyclic graphs with the heterogeneous system architecture.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing.
Proceedings of the 2016 International Conference on Supercomputing, 2016

GPU-Aware Non-contiguous Data Movement In Open MPI.
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

2015
Large Scale Artificial Neural Network Training Using Multi-GPUs.
CoRR, 2015

BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing.
CoRR, 2015

Hierarchical DAG Scheduling for Hybrid Distributed Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

A Scientific Function Test Framework for Modular Environmental Model Development: Application to the Community Land Model.
Proceedings of the 1st IEEE/ACM International Workshop on Software Engineering for High Performance Computing in Science, 2015

2013
Algorithms for modeling structural changes in human chromosomes.
Comput. Methods Programs Biomed., 2013

2010
Virtual chromosome modeling for learning human cytogenetics.
Proceedings of the 8th IEEE International Conference on Control and Automation, 2010

2009
Computer Based Simulation of Chromosome Abnormality.
Proceedings of the International Conference on Bioinformatics & Computational Biology, 2009
