Canqun Yang

Orcid: 0009-0008-4757-2475

According to our database, Canqun Yang authored at least 99 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
SNCL: a supernode OpenCL implementation for hybrid computing arrays.
J. Supercomput., May, 2024

2023
A large scale parallel fluid-structure interaction computing platform for simulating structural responses to a detonation shock.
Softw. Pract. Exp., 2023

Free energy perturbation-based large-scale virtual screening for effective drug discovery against COVID-19.
Int. J. High Perform. Comput. Appl., 2023

An Improved Parallel Overset Grid Method for Fluid Simulation with Moving Boundary.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

BioNet+: A Comprehensive Interaction Model of Finding Therapeutics for Diseases with Graph Deep Learning.
Proceedings of the IEEE International Conference on High Performance Computing & Communications, 2023

The Optimization of Multi-physics Application Simulated by Lattice Boltzmann Method Based on Domestic Processors.
Proceedings of the 2nd International Conference on Networks, 2023

Accelerating Type Confusion Detection by Identifying Harmless Type Castings.
Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

DrugProtKGE: Weakly Supervised Knowledge Graph Embedding for Highly-Effective Drug-Protein Interaction Representation.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2023

2022
VecDualSPHysics: A vectorized implementation of Smoothed Particle Hydrodynamics method for simulating fluid flows on multi-core processors.
J. Comput. Phys., 2022

ParaCopasi: A package for parallel biochemical simulation and analysis.
J. Comput. Chem., 2022

HFN: Heterogeneous Feature Network for Multivariate Time Series Anomaly Detection.
CoRR, 2022

Private and Shared Feature Extractors Based on Hierarchical Neighbor Encoder for Adaptive Few-Shot Knowledge Graph Completion.
Proceedings of the 34th IEEE International Conference on Tools with Artificial Intelligence, 2022

ParallelDualSPHysics: supporting efficient parallel fluid simulations through MPI-enabled SPH method.
Proceedings of the 51st International Conference on Parallel Processing, 2022

STGAT-MAD: Spatial-Temporal Graph Attention Network for Multivariate Time Series Anomaly Detection.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
BALS: Blocked Alternating Least Squares for Parallel Sparse Matrix Factorization on GPUs.
IEEE Trans. Parallel Distributed Syst., 2021

Mining microbe-disease interactions from literature via a transfer learning model.
BMC Bioinform., 2021

VISPR-online: a web-based interactive tool to visualize CRISPR screening experiments.
BMC Bioinform., 2021

CNN+LSTM Accelerated Turbulent Flow Simulation with Link-Wise Artificial Compressibility Method.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Drug-drug interaction extraction from biomedical texts based on multi-attention mechanism.
Proceedings of the ICBRA 2021: 2021 8th International Conference on Bioinformatics Research and Applications, Berlin, Germany, September 11, 2021

Large-Scale Parallel Alignment Algorithm for SMRT Reads.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2021

2020
Optimizing Streaming Parallelism on Heterogeneous Many-Core Architectures.
IEEE Trans. Parallel Distributed Syst., 2020

High-Scalable Collaborated Parallel Framework for Large-Scale Molecular Dynamic Simulation on Tianhe-2 Supercomputer.
IEEE ACM Trans. Comput. Biol. Bioinform., 2020

Correlation maximization machine for multi-modalities multiclass classification.
Pattern Anal. Appl., 2020

clMF: A fine-grained and portable alternating least squares algorithm for parallel matrix factorization.
Future Gener. Comput. Syst., 2020

Optimizing Streaming Parallelism on Heterogeneous Many-Core Architectures: A Machine Learning Based Approach.
CoRR, 2020

CGINet: graph convolutional network-based model for identifying chemical-gene interaction in an integrated multi-relational graph.
BMC Bioinform., 2020

Segment Medical Image Using U-Net Combining Recurrent Residuals and Attention.
Proceedings of 2020 International Conference on Medical Imaging and Computer-Aided Diagnosis, 2020

Numerical Study of Fluid-Structure Interaction Dynamics under High-explosive Detonation on Massively Parallel Computers.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

Improving performance for simulating complex fluids on massively parallel computers by component loop-unrolling and communication hiding.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

2019
Application-aware NoC management in GPUs multitasking.
J. Supercomput., 2019

Toward fault-tolerant hybrid programming over large-scale heterogeneous clusters via checkpointing/restart optimization.
J. Supercomput., 2019

SCP: Shared Cache Partitioning for High-Performance GEMM.
ACM Trans. Archit. Code Optim., 2019

Low-Cost Image Compressive Sensing with Multiple Measurement Rates for Object Detection.
Sensors, 2019

GARDENIA: A Graph Processing Benchmark Suite for Next-Generation Accelerators.
ACM J. Emerg. Technol. Comput. Syst., 2019

Reverse Offload Programming on Heterogeneous Systems.
IEEE Access, 2019

The Communication-Overlapped Hybrid Decomposition Parallel Algorithm for Multi-Scale Fluid Simulations.
Proceedings of the 48th International Conference on Parallel Processing, 2019

A Lightweight Collective Communication Based Parallel Algorithm for the Greedy Point Selection in RBF Mesh Deformation.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

APOAL: An Adaptive Parallel Optimization Algorithm for LBM Fluid Simulations.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

2018
Orchestrating parallel detection of strongly connected components on GPUs.
Parallel Comput., 2018

Moving from exascale to zettascale computing: challenges and techniques.
Frontiers Inf. Technol. Electron. Eng., 2018

A hybrid deep learning CNN-ELM for age and gender classification.
Neurocomputing, 2018

Tuning Streamed Applications on Intel Xeon Phi: A Machine Learning Based Approach.
CoRR, 2018

Collaborative Subspace Graph Hashing for Cross-modal Retrieval.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Auto-tuning Streamed Applications on Intel Xeon Phi.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

UHCL-Darknet: An OpenCL-based Deep Neural Network Framework for Heterogeneous Multi-/Many-core Clusters.
Proceedings of the 47th International Conference on Parallel Processing, 2018

MOCL: an efficient openCL implementation for the matrix-2000 architecture.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

2017
Implementation and Performance Evaluation of Recommender Algorithms Based on Multi-/Many-core Platforms.
计算机科学 (Computer Science), 2017

Performance Analysis of GPU Programs Towards Better Memory Hierarchy Design.
计算机科学 (Computer Science), 2017

GARDENIA: A Domain-specific Benchmark Suite for Next-generation Accelerators.
CoRR, 2017

Efficient and high-quality sparse graph coloring on GPUs.
Concurr. Comput. Pract. Exp., 2017

LU factorization on heterogeneous systems: an energy-efficient approach towards high performance.
Computing, 2017

P-Hint-Hunt: a deep parallelized whole genome DNA methylation detection tool.
BMC Genom., 2017

Dependency-based long short term memory network for drug-drug interaction extraction.
BMC Bioinform., 2017

High Performance Detection of Strongly Connected Components in Sparse Graphs on GPUs.
Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, 2017

Projective Hard Thresholding Pursuit for Nonnegative Sparse Recovery.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Delay Compensated Asynchronous Adam Algorithm for Deep Neural Networks.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Implementing and Evaluating OpenCL on an ARMv8 Multi-Core CPU.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Efficient and Portable ALS Matrix Factorization for Recommender Systems.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Heterogeneous acceleration for CNN training with many integrated core.
Proceedings of the 2017 IEEE International Conference on Signal Processing, 2017

Automatic density clustering with multiple kernels for high-dimension bioinformatics data.
Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine, 2017

2016
Evaluating Multiple Streams on Heterogeneous Platforms.
Parallel Process. Lett., 2016

623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores.
Int. J. High Perform. Comput. Appl., 2016

Efficient and High-quality Sparse Graph Coloring on the GPU.
CoRR, 2016

Accurate, validated and fast evaluation of elementary symmetric functions and its application.
Appl. Math. Comput., 2016

Bilateral Sampling Randomized Singular Value Decomposition.
Proceedings of the 17th International Conference on Parallel and Distributed Computing, 2016

Accurate Evaluation of Bivariate Polynomials.
Proceedings of the 17th International Conference on Parallel and Distributed Computing, 2016

Accelerator-Centered Programming on Heterogeneous Systems.
Proceedings of the 17th International Conference on Parallel and Distributed Computing, 2016

Streaming Applications on Heterogeneous Platforms.
Proceedings of the Network and Parallel Computing, 2016

Monaural Speech Separation on Many Integrated Core Architecture.
Proceedings of the Computer Engineering and Technology - 20th CCF Conference, 2016

Accelerating Nyström Kernel Independent Component Analysis with Many Integrated Core Architecture.
Proceedings of the Computer Engineering and Technology - 20th CCF Conference, 2016

Evaluating the Performance Impact of Multiple Streams on the MIC-Based Heterogeneous Platform.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

High Performance Parallel Graph Coloring on GPGPUs.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

An Energy-Efficient Implementation of LU Factorization on Heterogeneous Systems.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

mAMBER: A CPU/MIC collaborated parallel framework for AMBER on Tianhe-2 supercomputer.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016

2015
Correctness Test of TI DSP C Compiler.
计算机科学 (Computer Science), 2015

An Efficient Clique-Based Algorithm of Compute Nodes Allocation for In-memory Checkpoint System.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Large-Scale Neo-Heterogeneous Programming and Optimization of SNP Detection on Tianhe-2.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Design and Implementation of a Highly Efficient DGEMM for 64-Bit ARMv8 Multi-core Processors.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Implementation of an Accurate and Efficient Compensated DGEMM for 64-bit ARMv8 Multi-Core Processors.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

FT-Offload: A Scalable Fault-Tolerance Programing Model on MIC Cluster.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

The Challenge of Scaling Genome Big Data Analysis Software on TH-2 Supercomputer.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

mAMBER: Accelerating Explicit Solvent Molecular Dynamic with Intel Xeon Phi Many-Integrated Core Coprocessors.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
OpenMC: Towards Simplifying Programming for TianHe Supercomputers.
J. Comput. Sci. Technol., 2014

MilkyWay-2 supercomputer: system and application.
Frontiers Comput. Sci., 2014

HPCG: Preliminary Evaluation and Optimization on Tianhe-2 CPU-only Nodes.
Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

2013
Exploiting hierarchy parallelism for molecular dynamics on a petascale heterogeneous system.
J. Parallel Distributed Comput., 2013

OpenACC to Intel Offload: Automatic Translation and Optimization.
Proceedings of the Computer Engineering and Technology - 17th CCF Conference, 2013

MIC acceleration of short-range molecular dynamics simulations.
Proceedings of the First International Workshop on Code Optimisation for Multi and Many Cores, 2013

2012
Parallelizing SOR for GPGPUs using alternate loop tiling.
Parallel Comput., 2012

A Fast Parallel Implementation of Molecular Dynamics with the Morse Potential on a Heterogeneous Petascale Supercomputer.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2011
Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer.
J. Comput. Sci. Technol., 2011

2010
TH-1: China's first petaflop supercomputer.
Frontiers Comput. Sci. China, 2010

Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

2009
Solving 2D Nonlinear Unsteady Convection-Diffusion Equations on Heterogenous Platforms with Multiple GPUs.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

GPU Acceleration of High-Speed Collision Molecular Dynamics Simulation.
Proceedings of the Ninth IEEE International Conference on Computer and Information Technology, 2009

2008
Low Power Optimization for MPI Collective Operations.
Proceedings of the 9th International Conference for Young Computer Scientists, 2008

Exploiting Energy Saving Opportunity of Barrier Operation in MPI Programs.
Proceedings of the Second Asia International Conference on Modelling and Simulation, 2008

OSS: Efficient Compiler Approach for Selecting Optimal Strip Size on the Imagine Stream Processor.
Proceedings of the 22nd International Conference on Advanced Information Networking and Applications, 2008

2005
Improving the Performance of GCC by Exploiting IA-64 Architectural Features.
Proceedings of the Advances in Computer Systems Architecture, 10th Asia-Pacific Conference, 2005

