Rengan Xu

According to our database, Rengan Xu authored at least 16 papers between 2013 and 2019.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2019
Densifying Assumed-Sparse Tensors - Improving Memory Efficiency and MPI Collective Performance During Tensor Accumulation for Parallelized Training of Neural Machine Translation Models.
Proceedings of the High Performance Computing - 34th International Conference, 2019

2018
The OpenACC data model: Preliminary study on its major challenges and implementations.
Parallel Comput., 2018

Deep Learning at Scale on NVIDIA V100 Accelerators.
Proceedings of the 2018 IEEE/ACM Performance Modeling, 2018

2017
Implementing the OpenACC Data Model.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

2016
Compiler transformation of nested loops for general purpose GPUs.
Concurr. Comput. Pract. Exp., 2016

An Analytical Model-Based Auto-tuning Framework for Locality-Aware Loop Scheduling.
Proceedings of the High Performance Computing - 31st International Conference, 2016

Optimizing GPU Register Usage: Extensions to OpenACC and Compiler Optimizations.
Proceedings of the 45th International Conference on Parallel Processing, 2016

2015
Multi-GPU Support on Single Node Using Directive-Based Programming Model.
Sci. Program., 2015

2014
Accelerating Kirchhoff migration on GPU using directives.
Proceedings of the First Workshop on Accelerator Programming using Directives, 2014

SPEC ACCEL: A Standard Application Suite for Measuring Hardware Accelerator Performance.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

Reduction Operations in Parallel Loops for GPGPUs.
Proceedings of the 2014 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2014

NAS Parallel Benchmarks for GPGPUs Using a Directive-Based Programming Model.
Proceedings of the Languages and Compilers for Parallel Computing, 2014

A Validation Testsuite for OpenACC 1.0.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

2013
Compiling a High-Level Directive-Based Programming Model for GPGPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2013

Exploring Programming Multi-GPUs Using OpenMP and OpenACC-Based Hybrid Model.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Filesystem Aware Scalable I/O Framework for Data-Intensive Parallel Applications.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

