Guangwen Yang

According to our database, Guangwen Yang authored at least 170 papers between 1998 and 2018.

Bibliography

2018
Accelerating MapReduce on Commodity Clusters: An SSD-Empowered Approach.
IEEE Trans. Big Data, 2018

Optimizing Convolutional Neural Networks on the Sunway TaihuLight Supercomputer.
TACO, 2018

RedSync: Reducing Synchronization Traffic for Distributed Deep Learning.
CoRR, 2018

Episodic Memory Deep Q-Networks.
CoRR, 2018

Cavs: An Efficient Runtime System for Dynamic Neural Networks.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Taming the "Monster": Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Episodic Memory Deep Q-Networks.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

A Fast Sparse Triangular Solver for Structured-grid Problems on Sunway Many-core Processor SW26010.
Proceedings of the 47th International Conference on Parallel Processing, 2018

A Parallel Quicksort Algorithm on Manycore Processors in Sunway TaihuLight.
Proceedings of the Computational Science - ICCS 2018, 2018

Global Simulation of Planetary Rings on Sunway TaihuLight.
Proceedings of the Computational Science - ICCS 2018, 2018

Working principles of binary differential evolution.
Proceedings of the Genetic and Evolutionary Computation Conference, 2018

swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
A Fully-Pipelined Hardware Design for Gaussian Mixture Models.
IEEE Trans. Computers, 2017

An EnKF-based scheme to optimize hyper-parameters and features for SVM classifier.
Pattern Recognition, 2017

Solving Mesoscale Atmospheric Dynamics Using a Reconfigurable Dataflow Architecture.
IEEE Micro, 2017

Designing and implementing a heuristic cross-architecture combination for graph traversal.
J. Parallel Distrib. Comput., 2017

Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks.
CoRR, 2017

Chapter Four - Data Flow Computing in Geoscience Applications.
Advances in Computers, 2017

Redesigning CAM-SE for peta-scale climate modeling performance and ultra-high resolution on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2017

18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios.
Proceedings of the International Conference for High Performance Computing, 2017

SW-AES: Accelerating AES Algorithm on the Sunway TaihuLight.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Accelerating Financial Market Server through Hybrid List Design (Abstract Only).
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

A Nanosecond-Level Hybrid Table Design for Financial Market Data Generators.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

2016
Czip: A Fast Lossless Compression Algorithm for Climate Data.
International Journal of Parallel Programming, 2016

A time-space domain stereo finite difference method for 3D scalar wave propagation.
Computers & Geosciences, 2016

The Sunway TaihuLight supercomputer: system and applications.
SCIENCE CHINA Information Sciences, 2016

Evaluating the POWER8 Architecture through Optimizing Stencil-Based Algorithms.
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

10M-core scalable fully-implicit solver for nonhydrostatic atmospheric dynamics.
Proceedings of the International Conference for High Performance Computing, 2016

A highly effective global surface wave numerical simulation with ultra-high resolution.
Proceedings of the International Conference for High Performance Computing, 2016

Refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2016

TADE: Tight Adaptive Differential Evolution.
Proceedings of the Parallel Problem Solving from Nature - PPSN XIV, 2016

Cache-Friendly Design for Complex Spatially-Variable Coefficient Stencils on Many-Core Architectures.
Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016

Accelerating the 3D Euler atmospheric solver through heterogeneous CPU-GPU platforms.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Libra: an automated code generation and tuning framework for register-limited stencils on GPUs.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Generalized GPU Acceleration for Applications Employing Finite-Volume Methods.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

Graph-Oriented Code Transformation Approach for Register-Limited Stencils on GPUs.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

F-CNN: An FPGA-based framework for training Convolutional Neural Networks.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

Performance optimization of Jacobi stencil algorithms based on POWER8 architecture.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

Unleashing the performance potential of CPU-GPU platforms for the 3D atmospheric Euler solver.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

2015
Solving the Global Atmospheric Equations through Heterogeneous Reconfigurable Platforms.
TRETS, 2015

Scaling Support Vector Machines on modern HPC platforms.
J. Parallel Distrib. Comput., 2015

Data Reduction Analysis for Climate Data Sets.
International Journal of Parallel Programming, 2015

Improving the scalability of the ocean barotropic solver in the community earth system model.
Proceedings of the International Conference for High Performance Computing, 2015

ActCap: Accelerating MapReduce on heterogeneous clusters with capability-aware data placement.
Proceedings of the 2015 IEEE Conference on Computer Communications, 2015

Targeted Mutation: A Novel Mutation Strategy for Differential Evolution.
Proceedings of the 27th IEEE International Conference on Tools with Artificial Intelligence, 2015

Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Performance Characterization and Optimization for Intel Xeon Phi Coprocessor.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

CSAP: A Performance Predictor for Climate Simulation Applications on Intel CPUs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Optimizing Residue Number Reverse Converters through Bitwise Arithmetic on FPGAs.
Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

2014
Scaling Reverse Time Migration Performance through Reconfigurable Dataflow Engines.
IEEE Micro, 2014

Efficient Top-k Query Processing Algorithms in Highly Distributed Environments.
JCP, 2014

Evaluating multi-core and many-core architectures through accelerating the three-dimensional Lax-Wendroff correction stencil.
IJHPCA, 2014

Interpolation oriented parallel communication to optimize coupling in earth system modeling.
Frontiers Comput. Sci., 2014

CFIO2: Overlapping Communications and I/O with Computations Using RDMA Technology.
Proceedings of the Network and Parallel Computing, 2014

mpCache: Accelerating MapReduce with Hybrid Storage System on Many-Core Clusters.
Proceedings of the Network and Parallel Computing, 2014

A High Performance Compression Method for Climate Data.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2014

MIC-SVM: Designing a Highly Efficient Support Vector Machine for Advanced Modern Multi-core and Many-Core Architectures.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Scaling and analyzing the stencil performance on multi-core and many-core architectures.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Porting the Princeton Ocean Model to GPUs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Customized Network-on-Chip for Message Reduction.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Patra: Parallel tree-reweighted message passing architecture.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

A highly-efficient and green data flow engine for solving Euler atmospheric equations.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

A Fully-Pipelined FPGA Design for Tree-Reweighted Message Passing Algorithm.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

Adaptive Indexing for Distributed Array Processing.
Proceedings of the 2014 IEEE International Congress on Big Data, Anchorage, AK, USA, June 27, 2014

A customized GPU acceleration of the Princeton Ocean Model.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

An approach of processor core customization for stencil computation.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

2013
CFIO: A Fast I/O Library for Climate Models.
Proceedings of the 12th IEEE International Conference on Trust, 2013

SciHive: Array-Based Query Processing with HiveQL.
Proceedings of the 12th IEEE International Conference on Trust, 2013

A peta-scalable CPU-GPU algorithm for global atmospheric simulations.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Optimize Multidimensional Arrays Queries with Heterogeneous Replica Method.
Proceedings of the IEEE Eighth International Conference on Networking, 2013

Accelerating the 3D Elastic Wave Forward Modeling on GPU and MIC.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

tpNFS: Efficient Support of Small Files Processing over pNFS.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Subdomain Mapping Approach to Enhance the Coupling in Earth System Modeling.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Accelerating solvers for global atmospheric equations through mixed-precision data flow engine.
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013

An FPGA-Based Data Flow Engine for Gaussian Copula Model.
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

Global Atmospheric Simulation on a Reconfigurable Platform.
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

A Scalable Barotropic Mode Solver for the Parallel Ocean Program.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Understanding Data Characteristics and Access Patterns in a Cloud Storage System.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Revisiting finite difference and spectral migration methods on diverse parallel architectures.
Computers & Geosciences, 2012

Job failures in high performance computing systems: A large-scale empirical study.
Computers & Mathematics with Applications, 2012

The Chunk-Locality Index: An Efficient Query Method for Climate Datasets.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Droplet: A Distributed Solution of Data Deduplication.
Proceedings of the 13th ACM/IEEE International Conference on Grid Computing, 2012

2011
Automatically constructing trusted cluster computing environment.
The Journal of Supercomputing, 2011

Optimization of sub-query processing in distributed data integration systems.
J. Network and Computer Applications, 2011

Metadata changes in large file systems: a metadata querying perspective.
Comput. Syst. Sci. Eng., 2011

Optimizing write operation on replica in data grid.
SCIENCE CHINA Information Sciences, 2011

A Two-Layered Replica Management Method.
Proceedings of the IEEE 10th International Conference on Trust, 2011

Efficient Nonserial Polyadic Dynamic Programming on the Cell Processor.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Location-Aware MapReduce in Virtual Cloud.
Proceedings of the International Conference on Parallel Processing, 2011

Making Service Granularity Right: An Assistant Approach Based on Business Process Analysis.
Proceedings of the Sixth Chinagrid Annual Conference, ChinaGrid 2011, Dalian, Liaoning, 2011

2010
An adaptive task-level fault-tolerant approach to Grid.
The Journal of Supercomputing, 2010

Distributed bandwidth allocation based on alternating evolution algorithm.
J. Parallel Distrib. Comput., 2010

VDB-MR: MapReduce-based distributed data integration using virtual database.
Future Generation Comp. Syst., 2010

Improving grid performance by dynamically deploying applications.
Concurrency and Computation: Practice and Experience, 2010

Service-oriented execution model supporting data sharing and adaptive query processing.
Cluster Computing, 2010

Efficient Monte Carlo-based options pricing on graphics processors and its optimizations.
SCIENCE CHINA Information Sciences, 2010

A Knowledge-based Continuous Double Auction Model for Cloud Market.
Proceedings of the Sixth International Conference on Semantics Knowledge and Grid, 2010

DABGPM: A Double Auction Bayesian Game-Based Pricing Model in Cloud Market.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2010

PV-EASY: a strict fairness guaranteed and prediction enabled scheduler in parallel job scheduling.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

Using Memcached to Promote Read Throughput in Massive Small-File Storage System.
Proceedings of the GCC 2010, 2010

2009
Workflow management in the grid era: A goal-driven approach based on process patterns.
Multiagent and Grid Systems, 2009

Dynamic load balancing efficiently in a large-scale cluster.
IJHPCN, 2009

ALR-MIN: A Replacement Strategy to Reduce Overhead during Dynamic Deployment of Applications in Grid.
Proceedings of the International Conference on Scalable Computing and Communications / Eighth International Conference on Embedded Computing, 2009

FRB: File Resource Broker for Integrating Heterogeneous File Resources.
Proceedings of the Eighth International Conference on Grid and Cooperative Computing, 2009

Optimization of Data Retrievals in Processing Data Integration Queries.
Proceedings of the Fourth International Conference on Frontier of Computer Science and Technology, 2009

Integrating Cloud-Computing-Specific Model into Aircraft Design.
Proceedings of the Cloud Computing, First International Conference, CloudCom 2009, Beijing, 2009

CampusWare: An Easy-to-Use, Efficient and Portable Grid Middleware for Compute-Intensive Applications.
Proceedings of the Fourth ChinaGrid Annual Conference, ChinaGrid 2009, Yantai, Shandong, 2009

2008
Modeling replication strategies in data grid systems with arbitrary clustered demands.
Proceedings of the 3rd International ICST Conference on Scalable Information Systems, 2008

End-to-End Congestion Control for High Speed Networks Based on Population Ecology Models.
Proceedings of the 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), 2008

VDM: Virtual Database Management for Distributed Databases and File Systems.
Proceedings of the Seventh International Conference on Grid and Cooperative Computing, 2008

A Grid Workflow Framework with High Scalability and Usability.
Proceedings of the Seventh International Conference on Grid and Cooperative Computing, 2008

Design More Usable and Reliable Large-Scale Software Systems: A New Approach Based on P2P, SOA and Web 2.0.
Proceedings of the 32nd Annual IEEE International Computer Software and Applications Conference, 2008

A Survey of Methods and Applications for Trace Analysis in Grid Systems.
Proceedings of the Third ChinaGrid Annual Conference, ChinaGrid 2008, Dunhuang, Gansu, 2008

ZettaDS: A Light-weight Distributed Storage System for Cluster.
Proceedings of the Third ChinaGrid Annual Conference, ChinaGrid 2008, Dunhuang, Gansu, 2008

Optimizing Communications in Processing Data Integration Queries.
Proceedings of the Third ChinaGrid Annual Conference, ChinaGrid 2008, Dunhuang, Gansu, 2008

Impact of Clustered Demands on Performance of Replication Strategies in Data Grid Systems.
Proceedings of the Third ChinaGrid Annual Conference, ChinaGrid 2008, Dunhuang, Gansu, 2008

Adaptive Hybrid Model for Long Term Load Prediction in Computational Grid.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008

Scalable Distributed Ontology Reasoning Using DHT-Based Partitioning.
Proceedings of the Semantic Web, 3rd Asian Semantic Web Conference, 2008

2007
Parallel programming over ChinaGrid.
IJWGS, 2007

An optimal replication strategy for data grid systems.
Frontiers Comput. Sci. China, 2007

An analytical model for performance evaluation in a computational grid.
Proceedings of the CHINA HPC 2007, 2007

Workflow Management in Grid Era: From Process-Driven Paradigm to a Goal-Driven One.
Proceedings of the On the Move to Meaningful Internet Systems 2007: OTM 2007 Workshops, 2007

Dynamic Load-Balancing and High Performance Communication in Jcluster.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Improving the Convergence and Stability of Congestion Control Algorithm.
Proceedings of the IEEE International Conference on Network Protocols, 2007

Load prediction using hybrid model for computational grid.
Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (GRID 2007), 2007

Dynamic Data Replication based on Local Optimization Principle in Data Grid.
Proceedings of the Grid and Cooperative Computing, 2007

Adapting to Application Workflow in Processing Data Integration Queries.
Proceedings of the Grid and Cooperative Computing, 2007

Improving the Performance of MPI Applications over Computational Grid.
Proceedings of the Grid and Cooperative Computing, 2007

A Component Based Interoperability Solution over Existing Grid Middleware.
Proceedings of the Grid and Cooperative Computing, 2007

2006
Reciprocity: Enforcing Contribution in P2P Perpendicular Downloading.
IEICE Transactions, 2006

Jcluster: an efficient Java parallel environment on a large-scale heterogeneous cluster.
Concurrency and Computation: Practice and Experience, 2006

Towards a Transaction Model for Services in Grid Environment.
Proceedings of the 2006 IEEE / WIC / ACM International Conference on Web Intelligence (WI 2006), 2006

Overlapping Communication and Computation in MPI by Multithreading.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications & Conference on Real-Time Computing Systems and Applications, 2006

Scheduling divisible loads in the dynamic heterogeneous grid environment.
Proceedings of the 1st International Conference on Scalable Information Systems, 2006

A Transaction Model for Service Grid Environment and Implementation Considerations.
Proceedings of the 2006 IEEE International Conference on Web Services (ICWS 2006), 2006

A Resource-Autonomy Based Monitoring Architecture for Grids.
Proceedings of the Advances in Grid and Pervasive Computing, 2006

General Running Service: An Execution Framework for Executing Legacy Program on Grid.
Proceedings of the Grid and Cooperative Computing Workshops, 2006

Hierarchical Replica Location Service Based on Hybrid Overlay Platform.
Proceedings of the Grid and Cooperative Computing Workshops, 2006

Grid Programming Environment over ChinaGrid Support Platform.
Proceedings of the Grid and Cooperative Computing Workshops, 2006

BSM: A scheduling algorithm for dynamic jobs based on economics theory.
Proceedings of the Grid and Cooperative Computing, 2006

DPGS: A Distributed Programmable Grid System.
Proceedings of the Advanced Web and Network Technologies, and Applications, 2006

2005
Scheduling Efficiently for Irregular Load Distributions in a Large-scale Cluster.
Proceedings of the Parallel and Distributed Processing and Applications, 2005

An Efficient Dynamic Load-Balancing Algorithm in a Large-Scale Cluster.
Proceedings of the Distributed and Parallel Computing, 2005

2004
Grid Computing in China.
J. Grid Comput., 2004

Lookup-Ring: Building Efficient Lookups for High Dynamic Peer-to-Peer Overlays.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2004

Paramecium: Assembling Raw Nodes into Composite Cells.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2004

Efficiently Rationing Resources for Grid and P2P Computing.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2004

Making Peer-to-Peer Keyword Searching Feasible Using Multi-level Partitioning.
Proceedings of the Peer-to-Peer Systems III, Third International Workshop, 2004

DisCAS: A Distributed-Parallel Computer Algebra System.
Proceedings of the Computational Science, 2004

An Accounting and QoS Model for Grid Computing.
Proceedings of the Grid and Cooperative Computing, 2004

Efficient Search Using Adaptive Metadata Spreading in Peer-to-Peer Networks.
Proceedings of the Grid and Cooperative Computing, 2004

Latent Semantic Indexing in Peer-to-Peer Networks.
Proceedings of the Organic and Pervasive Computing, 2004

A Fine-grained Parallel Programming Model for Grid Computing.
Proceedings of the 2004 IEEE International Conference on Services Computing (SCC 2004), 2004

2003
A Site-Based Proxy Cache.
J. Comput. Sci. Technol., 2003

DSI: Distributed Service Integration for Service Grid.
J. Comput. Sci. Technol., 2003

Grid Computing Pool and Its Framework.
Proceedings of the 32nd International Conference on Parallel Processing Workshops (ICPP 2003 Workshops), 2003

Distributed Page Ranking in Structured P2P Networks.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

Stationary and adaptive replication approach to data availability in structured peer-to-peer overlay networks.
Proceedings of the 11th IEEE International Conference on Networks, 2003

Grid-Based Biological Computation Service Environment.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

Coarse-Grained Distributed Parallel Programming Interface for Grid Computing.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

Scalable Resource Management and Load Assignment for Grid and Peer-to-Peer Services.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

On the Malicious Participants Problem in Computational Grid.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

Gridmarket: A Practical, Efficient Market Balancing Resource for Grid and P2P Computing.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

Improving the Objects Set Availability in the P2P Environment by Multiple Groups.
Proceedings of the Grid and Cooperative Computing, Second International Workshop, 2003

Improving Availability of P2P Storage Systems.
Proceedings of the Advanced Parallel Programming Technologies, 5th International Workshop, 2003

TMSS: A Task Management and Scheduler System in Cluster for Remote Computing Service.
Proceedings of the 17th International Conference on Advanced Information Networking and Applications (AINA'03), 2003

2002
A multi-protocol cross-domain communication model for metacomputing systems.
Operating Systems Review, 2002

Potential-Based Hierarchical Clustering.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

1998
A Statistical Clustering Model and Algorithm.
Proceedings of the Advances in Pattern Recognition, 1998

