Nan Wu

Affiliations:
  • National University of Defense Technology, Changsha


According to our database, Nan Wu authored at least 43 papers between 2004 and 2019.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2019
Metaflow: A Better Traffic Abstraction for Distributed Applications.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

2015
Parallel performance modeling of irregular applications in cell-centered finite volume methods over unstructured tetrahedral meshes.
J. Parallel Distributed Comput., 2015

Towards simulation of subcellular calcium dynamics at nanometre resolution.
Int. J. High Perform. Comput. Appl., 2015

Enabling a Uniform OpenCL Device View for Heterogeneous Platforms.
IEICE Trans. Inf. Syst., 2015

2014
High efficient sedimentary basin simulations on hybrid CPU-GPU clusters.
Clust. Comput., 2014

Utilizing Multiple Xeon Phi Coprocessors on One Compute Node.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Automated Transformation of GPU-Specific OpenCL Kernels Targeting Performance Portability on Multi-Core/Many-Core CPUs.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013
Accelerating thread-intensive and explicit memory management programs with dynamic partial reconfiguration.
J. Supercomput., 2013

Resource-efficient utilization of CPU/GPU-based heterogeneous supercomputers for Bayesian phylogenetic inference.
J. Supercomput., 2013

Simulating Cardiac Electrophysiology in the Era of GPU-Cluster Computing.
IEICE Trans. Inf. Syst., 2013

On the GPU Performance of 3D Stencil Computations Implemented in OpenCL.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

On the GPU performance of cell-centered finite volume method over unstructured tetrahedral meshes.
Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

On the GPU-CPU Performance Portability of OpenCL for 3D Stencil Computations.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Performance of Sediment Transport Simulations on NVIDIA's Kepler Architecture.
Proceedings of the International Conference on Computational Science, 2013

Solving the Cardiac Model Using Multi-core CPU and Many Integrated Cores (MIC).
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

ACF: Networks-on-Chip Deadlock Recovery with Accurate Detection and Elastic Credit.
Proceedings of the Advanced Parallel Processing Technologies, 2013

2012
A Parallel H.264 Encoder with CUDA: Mapping and Evaluation.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

Extending BORPH for shared memory reconfigurable computers.
Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012

The masala machine: accelerating thread-intensive and explicit memory management programs with dynamically reconfigurable FPGAs (abstract only).
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
Tiled Multi-Core Stream Architecture.
Trans. High Perform. Embed. Archit. Compil., 2011

A high-efficient software parallel CAVLC encoder based on GPU.
Proceedings of the 34th International Conference on Telecommunications and Signal Processing (TSP 2011), 2011

High-efficient software parallel CAVLC encoder based on programmable stream processor.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

A Multilevel Parallel Intra Coding for H.264/AVC Based on CUDA.
Proceedings of the Sixth International Conference on Image and Graphics, 2011

2010
A Parallel Streaming Motion Estimation for Real-Time HD H.264 Encoding on Programmable Processors.
Proceedings of the Fifth International Conference on Frontier of Computer Science and Technology, 2010

Software Managed Instruction Scratchpad Memory Optimization in Stream Architecture Based on Hot Code Analysis of Kernels.
Proceedings of the 13th Euromicro Conference on Digital System Design, 2010

SAT: A Stream Architecture Template for Embedded Applications.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

2009
Streaming HD H.264 encoder on programmable processors.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Cache streamization for high performance stream processor.
Proceedings of the 16th International Conference on High Performance Computing, 2009

Software parallel CAVLC encoder based on stream processing.
Proceedings of the 7th IEEE/ACM/IFIP Workshop on Embedded Systems for Real-Time Multimedia, 2009

2008
On-Chip Memory System Optimization Design for the FT64 Scientific Stream Accelerator.
IEEE Micro, 2008

Load scheduling: Reducing pressure on distributed register files for free.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

FPGA-based Equivalent Simulation Technology (FEST) for clustered stream architecture.
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

2007
FT64: Scientific Computing with Streams.
Proceedings of the High Performance Computing, 2007

A Stream System-on-Chip Architecture for High Speed Target Recognition Based on Biologic Vision.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
Register Allocation on Stream Processor with Local Register File.
Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

Optimization and Evaluating of StreamYGX2 on MASA Stream Processor.
Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

Analysis and Performance Results of a fluid dynamics Application on MASA Stream Processor.
Proceedings of the 5th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2006) and 1st IEEE/ACIS International Workshop on Component-Based Software Engineering, 2006

2005
Multiple-Morphs Adaptive Stream Architecture.
J. Comput. Sci. Technol., 2005

Accelerated Motion Estimation of H.264 on Imagine Stream Processor.
Proceedings of the Image Analysis and Recognition, Second International Conference, 2005

A Stream Architecture Supporting Multiple Stream Execution Models.
Proceedings of the Advances in Computer Systems Architecture, 10th Asia-Pacific Conference, 2005

2004
A Parallel Reed-Solomon Decoder on the Imagine Stream Processor.
Proceedings of the Parallel and Distributed Processing and Applications, 2004

Multiple-Dimension Scalable Adaptive Stream Architecture.
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004

