Xing Cai

According to our database, Xing Cai authored at least 61 papers between 1997 and 2019.

Bibliography

2019
Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC.
Scientific Programming, 2019

PDNet: Prior-Model Guided Depth-Enhanced Network for Salient Object Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Combining Algorithmic Rethinking and AVX-512 Intrinsics for Efficient Simulation of Subcellular Calcium Signaling.
Proceedings of the Computational Science - ICCS 2019, 2019

2018
Memory Bandwidth Contention: Communication vs Computation Tradeoffs in Supercomputers with Multicore Architectures.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

SingleGAN: Image-to-Image Translation by a Single-Generator Network Using Multiple Generative Adversarial Learning.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Panda: A Compiler Framework for Concurrent CPU + GPU Execution of 3D Stencil Computations on GPU-accelerated Supercomputers.
International Journal of Parallel Programming, 2017

Accelerating Detailed Tissue-Scale 3D Cardiac Simulations Using Heterogeneous CPU-Xeon Phi Computing.
International Journal of Parallel Programming, 2017

Porting Tissue-Scale Cardiac Simulations to the Knights Landing Platform.
Proceedings of the High Performance Computing, 2017

2016
Matlab2cpp: A Matlab-to-C++ code translator.
Proceedings of the 11th System of Systems Engineering Conference, 2016

On the performance and energy efficiency of the PGAS programming model on multicore architectures.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

Enabling Tissue-Scale Cardiac Simulations Using Heterogeneous Computing on Tianhe-2.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016


2015
An analytical GPU performance model for 3D stencil computations from the angle of data traffic.
The Journal of Supercomputing, 2015

Scalable Heterogeneous CPU-GPU Computations for Unstructured Tetrahedral Meshes.
IEEE Micro, 2015

Parallel performance modeling of irregular applications in cell-centered finite volume methods over unstructured tetrahedral meshes.
J. Parallel Distrib. Comput., 2015

Towards simulation of subcellular calcium dynamics at nanometre resolution.
IJHPCA, 2015

Enabling a Uniform OpenCL Device View for Heterogeneous Platforms.
IEICE Transactions, 2015

Communication-hiding programming for clusters with multi-coprocessor nodes.
Concurrency and Computation: Practice and Experience, 2015

Multi-GPU Implementations of Parallel 3D Sweeping Algorithms with Application to Geological Folding.
Proceedings of the International Conference on Computational Science, 2015

Towards Detailed Tissue-Scale 3D Simulations of Electrical Activity and Calcium Handling in the Human Cardiac Ventricle.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

CPU+GPU Programming of Stencil Computations for Resource-Efficient Use of GPU Clusters.
Proceedings of the 18th IEEE International Conference on Computational Science and Engineering, 2015

2014
Time-fractional heat equations and negative absolute temperatures.
Computers & Mathematics with Applications, 2014

High efficient sedimentary basin simulations on hybrid CPU-GPU clusters.
Cluster Computing, 2014

Fast distributed MPC based on active set method.
Computers & Chemical Engineering, 2014

Effective multi-GPU communication using multiple CUDA streams and threads.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Heterogeneous CPU-GPU computing for the finite volume method on 3D unstructured meshes.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Utilizing Multiple Xeon Phi Coprocessors on One Compute Node.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Automated Transformation of GPU-Specific OpenCL Kernels Targeting Performance Portability on Multi-Core/Many-Core CPUs.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013
Resource-efficient utilization of CPU/GPU-based heterogeneous supercomputers for Bayesian phylogenetic inference.
The Journal of Supercomputing, 2013

Simulating Cardiac Electrophysiology in the Era of GPU-Cluster Computing.
IEICE Transactions, 2013

On the GPU Performance of 3D Stencil Computations Implemented in OpenCL.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

On the GPU performance of cell-centered finite volume method over unstructured tetrahedral meshes.
Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

On the GPU-CPU Performance Portability of OpenCL for 3D Stencil Computations.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Performance of Sediment Transport Simulations on NVIDIA's Kepler Architecture.
Proceedings of the International Conference on Computational Science, 2013

2012
A New Parallel 3D Front Propagation Algorithm for Fast Simulation of Geological Folds.
Proceedings of the International Conference on Computational Science, 2012

Accelerating a 3D Finite-Difference Earthquake Simulation with a C-to-CUDA Translator.
Computing in Science and Engineering, 2012

Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
An OpenMP-enabled parallel simulator for particle transport in fluid flows.
Proceedings of the International Conference on Computational Science, 2011

Mint: realizing CUDA performance in 3D stencil methods with annotated C.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

2010
Numerical Analysis of a Dual-Sediment Transport Model Applied to Lake Okeechobee, Florida.
Proceedings of the Ninth International Symposium on Parallel and Distributed Computing, 2010

Past and Future Perspectives on Scientific Software.
Proceedings of the Simula Research Laboratory, by Thinking Constantly about it, 2010

2009
Evolution of Intracellular Ca2+ Waves from about 10,000 RyR Clusters: Towards Solving a Computationally Daunting Task.
Proceedings of the Functional Imaging and Modeling of the Heart, 2009

2007
An order optimal solver for the discretized bidomain equations.
Numerical Lin. Alg. with Applic., 2007

A note on the efficiency of the conjugate gradient method for a class of time-dependent problems.
Numerical Lin. Alg. with Applic., 2007

2006
Making Hybrid Tsunami Simulators in a Parallel Software Framework.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Software Tools for Parallel CFD Applications: Minisymposium Abstract.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

On the Efficiency of Python for High-Performance Computing: A Case Study Involving Stencil Updates for Partial Differential Equations.
Proceedings of the Modeling, 2006

2005
On the performance of the Python programming language for serial and parallel scientific computations.
Scientific Programming, 2005

Parallel Simulation of Tsunamis Using a Hybrid Software Approach.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Message from the Chairs.
Proceedings of the 34th International Conference on Parallel Processing Workshops (ICPP 2005 Workshops), 2005

2004
Using the parallel algebraic recursive multilevel solver in modern physical applications.
Future Generation Comp. Syst., 2004

Improving the Performance of Large-Scale Unstructured PDE Applications.
Proceedings of the Applied Parallel Computing, 2004

2003
Parallel Solution of the Bidomain Equations with High Resolutions.
Proceedings of the Parallel Computing: Software Technology, 2003

A Numerical Study of Some Parallel Algebraic Preconditioners.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

2002
Enabling Numerical and Software Technologies for Studying the Electrical Activity in Human Heart.
Proceedings of the Applied Parallel Computing Advanced Scientific Computing, 2002

Parallel Iterative Methods in Modern Physical Applications.
Proceedings of the Computational Science - ICCS 2002, 2002

2001
On the Performance of PC Clusters in Solving Partial Differential Equations.
Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

2000
Partition of Unstructured Finite Element Meshes by a Multilevel Approach.
Proceedings of the Applied Parallel Computing, 2000

Parallel Simulation of 3D Nonlinear Acoustic Fields on a Linux-cluster.
Proceedings of the 2000 IEEE International Conference on Cluster Computing (CLUSTER 2000), November 28th, 2000

1998
Numerical Simulation of 3D Fully Nonlinear Water Waves on Parallel Computers.
Proceedings of the Applied Parallel Computing, 1998

1997
Numerical Solution of PDEs on Parallel Computers Utilizing Sequential Simulators.
Proceedings of the Scientific Computing in Object-Oriented Parallel Environments, 1997

