Xing Cai

Orcid: 0000-0003-3706-4414

Affiliations:
  • Simula Research Lab


According to our database, Xing Cai authored at least 82 papers between 1997 and 2023.

Bibliography

2023
Targeting performance and user-friendliness: GPU-accelerated finite element computation with automated code generation in FEniCS.
Parallel Comput., November, 2023

Multi-strategy competitive-cooperative co-evolutionary algorithm and its application.
Inf. Sci., July, 2023

Detailed Modeling of Heterogeneous and Contention-Constrained Point-to-Point MPI Communication.
IEEE Trans. Parallel Distributed Syst., May, 2023

Dynamic hybrid mechanism-based differential evolution algorithm and its application.
Expert Syst. Appl., 2023

2022
On Memory Traffic and Optimisations for Low-order Finite Element Assembly Algorithms on Multi-core CPUs.
ACM Trans. Math. Softw., 2022

DKNAS: A Practical Deep Keypoint Extraction Framework Based on Neural Architecture Search.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Comparative analysis of stochastic vehicle-loads charging model concerning randomized characteristic based on particle swarm optimization.
Proceedings of the 5th IEEE International Conference on Information Systems and Computer Aided Education, 2022

Optimization of intelligent charging control strategy for collaborative response of electric vehicle based on PSO.
Proceedings of the 5th IEEE International Conference on Information Systems and Computer Aided Education, 2022

2021
An improved differential evolution algorithm and its application in optimization problem.
Soft Comput., 2021

Quantum differential evolution with cooperative coevolution framework and hybrid mutation strategy for large scale optimization.
Knowl. Based Syst., 2021

An improved quantum-inspired cooperative co-evolution algorithm with muli-strategy and its application.
Expert Syst. Appl., 2021

A Newcomer In The PGAS World - UPC++ vs UPC: A Comparative Study.
CoRR, 2021

iPUG for Multiple Graphcore IPUs: Optimizing Performance and Scalability of Parallel Breadth-First Search.
Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

Research on Improved Multi-Sensor Data Fusion Algorithm Based on D-S Evidence Theory.
Proceedings of the AIPR 2021: 4th International Conference on Artificial Intelligence and Pattern Recognition, Xiamen, China, September 24, 2021

2020
Cache simulation for irregular memory traffic on multi-core CPUs: Case study on performance models for sparse matrix-vector multiplication.
J. Parallel Distributed Comput., 2020

Neural saliency algorithm guide bi-directional visual perception style transfer.
CAAI Trans. Intell. Technol., 2020

VONAS: Network Design in Visual Odometry using Neural Architecture Search.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Towards Loss Balance and Consistent Model in Self-supervised Monocular Depth Estimation.
Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020

Twinvo: Unsupervised Learning of Monocular Visual Odometry Using Bi-Direction Twin Network.
Proceedings of the 2020 IEEE International Conference on Multimedia & Expo Workshops, 2020

2019
Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC.
Sci. Program., 2019

PDNet: Prior-Model Guided Depth-Enhanced Network for Salient Object Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Combining Algorithmic Rethinking and AVX-512 Intrinsics for Efficient Simulation of Subcellular Calcium Signaling.
Proceedings of the Computational Science - ICCS 2019, 2019

Towards Detailed Real-Time Simulations of Cardiac Arrhythmia.
Proceedings of the 46th Computing in Cardiology, 2019

2018
Memory Bandwidth Contention: Communication vs Computation Tradeoffs in Supercomputers with Multicore Architectures.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

SingleGAN: Image-to-Image Translation by a Single-Generator Network Using Multiple Generative Adversarial Learning.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Panda: A Compiler Framework for Concurrent CPU + GPU Execution of 3D Stencil Computations on GPU-accelerated Supercomputers.
Int. J. Parallel Program., 2017

Accelerating Detailed Tissue-Scale 3D Cardiac Simulations Using Heterogeneous CPU-Xeon Phi Computing.
Int. J. Parallel Program., 2017

Porting Tissue-Scale Cardiac Simulations to the Knights Landing Platform.
Proceedings of the High Performance Computing, 2017

2016
Matlab2cpp: A Matlab-to-C++ code translator.
Proceedings of the 11th System of Systems Engineering Conference, 2016

On the performance and energy efficiency of the PGAS programming model on multicore architectures.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

Enabling Tissue-Scale Cardiac Simulations Using Heterogeneous Computing on Tianhe-2.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016


2015
An analytical GPU performance model for 3D stencil computations from the angle of data traffic.
J. Supercomput., 2015

Scalable Heterogeneous CPU-GPU Computations for Unstructured Tetrahedral Meshes.
IEEE Micro, 2015

Parallel performance modeling of irregular applications in cell-centered finite volume methods over unstructured tetrahedral meshes.
J. Parallel Distributed Comput., 2015

Towards simulation of subcellular calcium dynamics at nanometre resolution.
Int. J. High Perform. Comput. Appl., 2015

Enabling a Uniform OpenCL Device View for Heterogeneous Platforms.
IEICE Trans. Inf. Syst., 2015

Communication-hiding programming for clusters with multi-coprocessor nodes.
Concurr. Comput. Pract. Exp., 2015

Multi-GPU Implementations of Parallel 3D Sweeping Algorithms with Application to Geological Folding.
Proceedings of the International Conference on Computational Science, 2015

Towards Detailed Tissue-Scale 3D Simulations of Electrical Activity and Calcium Handling in the Human Cardiac Ventricle.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

CPU+GPU Programming of Stencil Computations for Resource-Efficient Use of GPU Clusters.
Proceedings of the 18th IEEE International Conference on Computational Science and Engineering, 2015

2014
Time-fractional heat equations and negative absolute temperatures.
Comput. Math. Appl., 2014

High efficient sedimentary basin simulations on hybrid CPU-GPU clusters.
Clust. Comput., 2014

Fast distributed MPC based on active set method.
Comput. Chem. Eng., 2014

Effective multi-GPU communication using multiple CUDA streams and threads.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Heterogeneous CPU-GPU computing for the finite volume method on 3D unstructured meshes.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Utilizing Multiple Xeon Phi Coprocessors on One Compute Node.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Automated Transformation of GPU-Specific OpenCL Kernels Targeting Performance Portability on Multi-Core/Many-Core CPUs.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013
Resource-efficient utilization of CPU/GPU-based heterogeneous supercomputers for Bayesian phylogenetic inference.
J. Supercomput., 2013

Simulating Cardiac Electrophysiology in the Era of GPU-Cluster Computing.
IEICE Trans. Inf. Syst., 2013

On the GPU Performance of 3D Stencil Computations Implemented in OpenCL.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

On the GPU performance of cell-centered finite volume method over unstructured tetrahedral meshes.
Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

On the GPU-CPU Performance Portability of OpenCL for 3D Stencil Computations.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Performance of Sediment Transport Simulations on NVIDIA's Kepler Architecture.
Proceedings of the International Conference on Computational Science, 2013

2012
A New Parallel 3D Front Propagation Algorithm for Fast Simulation of Geological Folds.
Proceedings of the International Conference on Computational Science, 2012

Accelerating a 3D Finite-Difference Earthquake Simulation with a C-to-CUDA Translator.
Comput. Sci. Eng., 2012

Using 1000+ GPUs and 10000+ CPUs for Sedimentary Basin Simulations.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
An OpenMP-enabled parallel simulator for particle transport in fluid flows.
Proceedings of the International Conference on Computational Science, 2011

Mint: realizing CUDA performance in 3D stencil methods with annotated C.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

2010
Simplifying Parallelization of Scientific Codes by a Function-Centric Approach in Python.
CoRR, 2010

Numerical Analysis of a Dual-Sediment Transport Model Applied to Lake Okeechobee, Florida.
Proceedings of the Ninth International Symposium on Parallel and Distributed Computing, 2010

Past and Future Perspectives on Scientific Software.
Proceedings of the Simula Research Laboratory, by Thinking Constantly about it, 2010

2009
Evolution of Intracellular Ca²⁺ Waves from about 10,000 RyR Clusters: Towards Solving a Computationally Daunting Task.
Proceedings of the Functional Imaging and Modeling of the Heart, 2009

2007
An order optimal solver for the discretized bidomain equations.
Numer. Linear Algebra Appl., 2007

A note on the efficiency of the conjugate gradient method for a class of time-dependent problems.
Numer. Linear Algebra Appl., 2007

2006
Making Hybrid Tsunami Simulators in a Parallel Software Framework.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Software Tools for Parallel CFD Applications: Minisymposium Abstract.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

On the Efficiency of Python for High-Performance Computing: A Case Study Involving Stencil Updates for Partial Differential Equations.
Proceedings of the Modeling, 2006

2005
On the performance of the Python programming language for serial and parallel scientific computations.
Sci. Program., 2005

Parallel Simulation of Tsunamis Using a Hybrid Software Approach.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Message from the Chairs.
Proceedings of the 34th International Conference on Parallel Processing Workshops (ICPP 2005 Workshops), 2005

2004
Using the parallel algebraic recursive multilevel solver in modern physical applications.
Future Gener. Comput. Syst., 2004

Improving the Performance of Large-Scale Unstructured PDE Applications.
Proceedings of the Applied Parallel Computing, 2004

2003
Parallel Solution of the Bidomain Equations with High Resolutions.
Proceedings of the Parallel Computing: Software Technology, 2003

A Numerical Study of Some Parallel Algebraic Preconditioners.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

2002
Enabling Numerical and Software Technologies for Studying the Electrical Activity in Human Heart.
Proceedings of the Applied Parallel Computing Advanced Scientific Computing, 2002

Parallel Iterative Methods in Modern Physical Applications.
Proceedings of the Computational Science - ICCS 2002, 2002

2001
On the Performance of PC Clusters in Solving Partial Differential Equations.
Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

2000
Partition of Unstructured Finite Element Meshes by a Multilevel Approach.
Proceedings of the Applied Parallel Computing, 2000

Parallel Simulation of 3D Nonlinear Acoustic Fields on a Linux-cluster.
Proceedings of the 2000 IEEE International Conference on Cluster Computing (CLUSTER 2000), November 28th, 2000

1998
Numerical Simulation of 3D Fully Nonlinear Water Waves on Parallel Computers.
Proceedings of the Applied Parallel Computing, 1998

1997
Numerical Solution of PDEs on Parallel Computers Utilizing Sequential Simulators.
Proceedings of the Scientific Computing in Object-Oriented Parallel Environments, 1997

