Yunquan Zhang

According to our database, Yunquan Zhang authored at least 68 papers between 2007 and 2019.

Bibliography

2019
Efficient parallel optimizations of a high-performance SIFT on GPUs.
J. Parallel Distrib. Comput., 2019

Mining concise patterns on graph-connected itemsets.
Neurocomputing, 2019

Tessellating Star Stencils.
Proceedings of the 48th International Conference on Parallel Processing, 2019

2018
Cache-Oblivious MPI All-to-All Communications Based on Morton Order.
IEEE Trans. Parallel Distrib. Syst., 2018

Rolling Forecasting Forward by Boosting Heterogeneous Kernels.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2018

Communication-Avoiding for Dynamical Core of Atmospheric General Circulation Model.
Proceedings of the 47th International Conference on Parallel Processing, 2018

Massively Scaling the Metal Microscopic Damage Simulation on Sunway TaihuLight Supercomputer.
Proceedings of the 47th International Conference on Parallel Processing, 2018

AGCM3D: A Highly Scalable Finite-Difference Dynamical Core of Atmospheric General Circulation Model Based on 3D Decomposition.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

Implementation and Optimization of Multi-dimensional Real FFT on ARMv8 Platform.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

2017
Special Issue on Network and Parallel Computing.
International Journal of Parallel Programming, 2017

Hybrid-optimization strategy for the communication of large-scale Kinetic Monte Carlo simulation.
Computer Physics Communications, 2017

Tessellating stencils.
Proceedings of the International Conference for High Performance Computing, 2017

POSTER: Cache-Oblivious MPI All-to-All Communications on Many-Core Architectures.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

HartSift: A High-Accuracy and Real-Time SIFT Based on GPU.
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

2016
A Cross-Platform SpMV Framework on Many-Core Architectures.
TACO, 2016

Parallel Processing Systems for Big Data: A Survey.
Proceedings of the IEEE, 2016

Workshop on high performance data intensive computing.
Concurrency and Computation: Practice and Experience, 2016

Efficient Management for Hybrid Memory in Managed Language Runtime.
Proceedings of the Network and Parallel Computing, 2016

2015
Automatic tuning of sparse matrix-vector multiplication on multicore clusters.
SCIENCE CHINA Information Sciences, 2015

AsHES Introduction and Committees.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Optimizing Image Sharpening Algorithm on GPU.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Fast Convolution Operations on Many-Core Architectures.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Optimized Password Recovery for Encrypted RAR on GPUs.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

Parallel Solving Method of SOR Based on the Numerical Marine Forecasting Model.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
Function Prediction of Proteins in Yeast Networks Based on the MCL Algorithm.
JSW, 2014

yaSpMV: yet another SpMV framework on GPUs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

AsHES Introduction and Committees.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Physically based parallel ray tracer for the Metropolis light transport algorithm on the Tianhe-2 supercomputer.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Research on Mahalanobis Distance Algorithm Optimization Based on OpenCL.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

2013
MPFFT: An Auto-Tuning FFT Library for OpenCL GPUs.
J. Comput. Sci. Technol., 2013

AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs.
Proceedings of the International Conference for High Performance Computing, 2013

StreamScan: fast scan algorithms for GPUs without global barrier synchronization.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments.
Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

H-DB: Yet Another Big Data Hybrid System of Hadoop and DBMS.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

Large Scale Satellite Imagery Simulations with Physically Based Ray Tracing on Tianhe-1A Supercomputer.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

CLSIFT: An Optimization Study of the Scale Invariance Feature Transform on GPUs.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

P-DOT: A model of computation for big data.
Proceedings of the 2013 IEEE International Conference on Big Data, 2013

2012
Implementing High-performance Intensity Model with Blur Effect on GPUs for Large-scale Star Image Simulation.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Modeling the Locality in Graph Traversals.
Proceedings of the 41st International Conference on Parallel Processing, 2012

Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

An Insightful Program Performance Tuning Chain for GPU Computing.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

Accelerating Viola-Jones Face Detection Algorithm on GPUs.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

GPURoofline: A Model for Guiding Performance Optimizations on GPUs.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

A Locality-based Performance Model for Load-and-Compute Style Computation.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
Optimizing SpMV for Diagonal Sparse Matrices on GPU.
Proceedings of the International Conference on Parallel Processing, 2011

Automatic FFT Performance Tuning on OpenCL GPUs.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

CRSD: Application Specific Auto-tuning of SpMV for Diagonal Sparse Matrices.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010
Perspectives of China's HPC system development: a view from the 2009 China HPC TOP100 list.
Frontiers Comput. Sci. China, 2010

Heterogeneous Multi-core Parallel SGEMM Performance Testing and Analysis on Cell/B.E Processor.
Proceedings of the Fifth International Conference on Networking, Architecture, and Storage, 2010

Optimizing Sparse Matrix Vector Multiplication Using Diagonal Storage Matrix Format.
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

Numerical Simulation of the Thermal Convection in the Earth's Outer Core.
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

LogGPH: A Parallel Computational Model with Hierarchical Communication Awareness.
Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010

QuantWiz: A scalable parallel software package for label-free protein quantification.
Proceedings of the Fifth International Conference on Bio-Inspired Computing: Theories and Applications, 2010

Accelerating Linpack Performance with Mixed Precision Algorithm on CPU+GPGPU Heterogeneous Cluster.
Proceedings of the 10th IEEE International Conference on Computer and Information Technology, 2010

2009
Early Performance Evaluation of Dawning 5000A and DeepComp 7000.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

QuantWiz: A Parallel Software Package for LC-MS-based Label-Free Protein Quantification.
Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications, 2009

Performance Evaluation of Multithreaded Sparse Matrix-Vector Multiplication Using OpenMP.
Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications, 2009

Development of a Scalable Solver for the Earth's Core Convection.
Proceedings of the High Performance Computing and Applications, 2009

2008
Basic research in computer science and software engineering at SKLCS.
Frontiers Comput. Sci. China, 2008

Parallelization of FM-Index.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

Memory Access Complexity Analysis of SpMV in RAM (h) Model.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

A Parallel Shortest Path Algorithm Based on Graph-Partitioning and Iterative Correcting.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

Utilizing the Multi-threading Techniques to Improve the Two-Level Checkpoint/Rollback System for MPI Applications.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

2007
Models of parallel computation: a survey and classification.
Frontiers Comput. Sci. China, 2007

A brief introduction to China HPC TOP100: from 2002 to 2006.
Proceedings of the CHINA HPC 2007, 2007

Block size selection of parallel LU and QR on PVP-based and RISC-based supercomputers.
Proceedings of the CHINA HPC 2007, 2007

Efficient Construction of FM-index Using Overlapping Block Processing for Large Scale Texts.
Proceedings of the Advances in Information Retrieval, 2007

