Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

2015

Parallel performance modeling of irregular applications in cell-centered finite volume methods over unstructured tetrahedral meshes.

J. Parallel Distributed Comput., 2015

Towards simulation of subcellular calcium dynamics at nanometre resolution.

Int. J. High Perform. Comput. Appl., 2015

Enabling a Uniform OpenCL Device View for Heterogeneous Platforms.

IEICE Trans. Inf. Syst., 2015

2014

High efficient sedimentary basin simulations on hybrid CPU-GPU clusters.

Clust. Comput., 2014

Utilizing Multiple Xeon Phi Coprocessors on One Compute Node.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Automated Transformation of GPU-Specific OpenCL Kernels Targeting Performance Portability on Multi-Core/Many-Core CPUs.

Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013

Accelerating thread-intensive and explicit memory management programs with dynamic partial reconfiguration.

J. Supercomput., 2013

Resource-efficient utilization of CPU/GPU-based heterogeneous supercomputers for Bayesian phylogenetic inference.

J. Supercomput., 2013

Simulating Cardiac Electrophysiology in the Era of GPU-Cluster Computing.

IEICE Trans. Inf. Syst., 2013

On the GPU Performance of 3D Stencil Computations Implemented in OpenCL.

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

On the GPU performance of cell-centered finite volume method over unstructured tetrahedral meshes.

Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

On the GPU-CPU Performance Portability of OpenCL for 3D Stencil Computations.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Performance of Sediment Transport Simulations on NVIDIA's Kepler Architecture.

Proceedings of the International Conference on Computational Science, 2013

Solving the Cardiac Model Using Multi-core CPU and Many Integrated Cores (MIC).

Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

ACF: Networks-on-Chip Deadlock Recovery with Accurate Detection and Elastic Credit.

Proceedings of the Advanced Parallel Processing Technologies, 2013

2012

A Parallel H.264 Encoder with CUDA: Mapping and Evaluation.

Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

Extending BORPH for shared memory reconfigurable computers.

Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012

The masala machine: accelerating thread-intensive and explicit memory management programs with dynamically reconfigurable FPGAs (abstract only).

Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations.

Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011

Tiled Multi-Core Stream Architecture.

Trans. High Perform. Embed. Archit. Compil., 2011

A high-efficient software parallel CAVLC encoder based on GPU.

Proceedings of the 34th International Conference on Telecommunications and Signal Processing (TSP 2011), 2011

High-efficient software parallel CAVLC encoder based on programmable stream processor.

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

A Multilevel Parallel Intra Coding for H.264/AVC Based on CUDA.

Proceedings of the Sixth International Conference on Image and Graphics, 2011

2010

A Parallel Streaming Motion Estimation for Real-Time HD H.264 Encoding on Programmable Processors.

Proceedings of the Fifth International Conference on Frontier of Computer Science and Technology, 2010

Software Managed Instruction Scratchpad Memory Optimization in Stream Architecture Based on Hot Code Analysis of Kernels.

Proceedings of the 13th Euromicro Conference on Digital System Design, 2010

SAT: A Stream Architecture Template for Embedded Applications.

Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

2009

Streaming HD H.264 encoder on programmable processors.

Proceedings of the 17th International Conference on Multimedia 2009, 2009

Cache streamization for high performance stream processor.

Proceedings of the 16th International Conference on High Performance Computing, 2009

Software parallel CAVLC encoder based on stream processing.

Proceedings of the 7th IEEE/ACM/IFIP Workshop on Embedded Systems for Real-Time Multimedia, 2009

2008

On-Chip Memory System Optimization Design for the FT64 Scientific Stream Accelerator.

IEEE Micro, 2008

Load scheduling: Reducing pressure on distributed register files for free.

Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

FPGA-based Equivalent Simulation Technology (FEST) for clustered stream architecture.

Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

2007

FT64: Scientific Computing with Streams.

Proceedings of the High Performance Computing, 2007

A Stream System-on-Chip Architecture for High Speed Target Recognition Based on Biologic Vision.

Proceedings of the Advances in Computer Systems Architecture, 2007

2006

Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

Optimization and Evaluating of StreamYGX2 on MASA Stream Processor.

Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006

Analysis and Performance Results of a fluid dynamics Application on MASA Stream Processor.

Proceedings of the 5th Annual IEEE/ACIS International Conference on Computer and Information Science (ICIS 2006) and 1st IEEE/ACIS International Workshop on Component-Based Software Engineering, 2006

2005

Multiple-Morphs Adaptive Stream Architecture.

J. Comput. Sci. Technol., 2005

Accelerated Motion Estimation of H.264 on Imagine Stream Processor.
