Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

Black-box Attacks to Log-based Anomaly Detection.

Shaohan Huang

Proceedings of the 18th International Conference on Network and Service Management, 2022

2021

The Deep Learning Compiler: A Comprehensive Survey.

IEEE Trans. Parallel Distributed Syst., 2021

Towards efficient tile low-rank GEMM computation on sunway many-core processors.

J. Supercomput., 2021

swMR: A Framework for Accelerating MapReduce Applications on Sunway Taihulight.

IEEE Trans. Emerg. Top. Comput., 2021

Towards efficient canonical polyadic decomposition on sunway many-core processor.

Inf. Sci., 2021

User-level failure detection and auto-recovery of parallel programs in HPC systems.

Frontiers Comput. Sci., 2021

Adaptive watermark generation mechanism based on time series prediction for stream processing.

Frontiers Comput. Sci., 2021

Accelerating Sparse Approximate Matrix Multiplication on GPUs.

CoRR, 2021

dgQuEST: Accelerating Large Scale Quantum Circuit Simulation through Hybrid CPU-GPU Memory Hierarchies.

Proceedings of the Network and Parallel Computing, 2021

An optimized tensor completion library for multiple GPUs.

Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.

Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

DRStencil: Exploiting Data Reuse within Low-order Stencil on GPU.

Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, 2021

csTuner: Scalable Auto-tuning Framework for Complex Stencil Computation on GPUs.

Proceedings of the IEEE International Conference on Cluster Computing, 2021

PriPro: Towards Effective Privacy Protection on Edge-Cloud System running DNN Inference.

Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020

Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.

IEEE Trans. Parallel Distributed Syst., 2020

Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer.

IEEE Trans. Parallel Distributed Syst., 2020

HitAnomaly: Hierarchical Transformers for Anomaly Detection in System Log.

IEEE Trans. Netw. Serv. Manag., 2020

Temperature-Aware DRAM Cache Management - Relaxing Thermal Constraints in 3-D Systems.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

The Deep Learning Compiler: A Comprehensive Survey.

CoRR, 2020

Privacy for Rescue: A New Testimony Why Privacy is Vulnerable In Deep Models.

CoRR, 2020

An Optimal Recovery Approach for Liberation Codes in Distributed Storage Systems.

IEEE Access, 2020

swGBDT: Efficient Gradient Boosted Decision Tree on Sunway Many-Core Processor.

Proceedings of the Supercomputing Frontiers - 6th Asian Conference, 2020

ZeroSpy: exploring software inefficiency with redundant zeros.

Proceedings of the International Conference for High Performance Computing, 2020

SpTFS: sparse tensor format selection for MTTKRP via deep learning.

Proceedings of the International Conference for High Performance Computing, 2020

Paddy: An Event Log Parsing Approach using Dynamic Dictionary.

Proceedings of the NOMS 2020, 2020

Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures.

Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

A Gated Few-shot Learning Model For Anomaly Detection.

Proceedings of the 2020 International Conference on Information Networking, 2020

Towards GPU Acceleration of Phonon Computation with ShengBTE.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

Transfer Log-based Anomaly Detection with Pseudo Labels.

Proceedings of the 16th International Conference on Network and Service Management, 2020

swRodinia: A Benchmark Suite for Exploiting Architecture Properties of Sunway Processor.

Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2019

Improving Thread-level Parallelism in GPUs Through Expanding Register File to Scratchpad Memory.

ACM Trans. Archit. Code Optim., 2019

Accelerating in-memory transaction processing using general purpose graphics processing units.

Future Gener. Comput. Syst., 2019

A novel index system describing program runtime characteristics for workload consolidation.

Frontiers Comput. Sci., 2019

Intelligent-Unrolling: Exploiting Regular Patterns in Irregular Applications.

CoRR, 2019

Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer.

CoRR, 2019

swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture.

CoRR, 2019

swTensor: accelerating tensor decomposition on Sunway architecture.

CCF Trans. High Perform. Comput., 2019

FPowerTool: A Function-Level Power Profiling Tool.

IEEE Access, 2019

ADSM: Adaptive Data Scheduling Method for Hybrid Memories in Distributed System.

IEEE Access, 2019

Performance Evaluation and Analysis of Linear Algebra Kernels in the Prototype Tianhe-3 Cluster.

Proceedings of the Supercomputing Frontiers - 5th Asian Conference, 2019

Modeling Power Consumption of The Code Execution Using Performance Counters Statistics.

Proceedings of the 20th International Conference on Parallel and Distributed Computing, 2019

ASTracer: An Efficient Tracing Tool for HDFS with Adaptive Sampling.

Proceedings of the Network and Parallel Computing, 2019

Redundant loads: a software inefficiency indicator.

Proceedings of the 41st International Conference on Software Engineering, 2019

Improving the Parallelism of CESM on GPU.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2019

Structure Characteristic-Aware Pruning Strategy for Convolutional Neural Networks.

Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Towards a General and Efficient Linked-List Hash Table on GPUs.

swCPD: Optimizing Canonical Polyadic Decomposition on Sunway Manycore Architecture.

L-DAG: Enabling Loopy Workflow in Scientific Application with Automatic DAG Transformation.

Proceedings of the 2019 IEEE Intl Conf on Dependable, 2019

Generative Model for Probabilistic Inference.

Proceedings of the 2019 IEEE Intl Conf on Dependable, 2019

SMQoS: Improving Utilization and Energy Efficiency with QoS Awareness on GPUs.

Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

Accelerating tile low-rank GEMM on sunway architecture: POSTER.

Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

2018

LWPTool: A Lightweight Profiler to Guide Data Layout Optimization.

IEEE Trans. Parallel Distributed Syst., 2018

SMGuard: A Flexible and Fine-Grained Resource Management Framework for GPUs.

IEEE Trans. Parallel Distributed Syst., 2018

SRAM- and STT-RAM-based hybrid, shared last-level cache for on-chip CPU-GPU heterogeneous architectures.

J. Supercomput., 2018

T1000: Mitigating the memory footprint of convolution neural networks with decomposition and re-fusion.

Future Gener. Comput. Syst., 2018

Generative Model for Heterogeneous Inference.

CoRR, 2018

BigRoots: An Effective Approach for Root-Cause Analysis of Stragglers in Big Data System.

IEEE Access, 2018

A Lightweight and Flexible Tool for Distinguishing Between Hardware Malfunctions and Program Bugs in Debugging Large-Scale Programs.

IEEE Access, 2018

Sparsing Deep Neural Network Using Semi-Discrete Matrix Decomposition.

IEEE Access, 2018

A Fine-Grained Performance Bottleneck Analysis Method for HDFS.

Proceedings of the Network and Parallel Computing, 2018

Towards Efficient SpMV on Sunway Manycore Architectures.

Proceedings of the 32nd International Conference on Supercomputing, 2018

Multi-role SpTRSV on Sunway Many-Core Architecture.

Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Research on Asynchronous Inter-VM Communication Mechanism Based on Embedded Hypervisor.

Proceedings of the 2018 IEEE 42nd Annual Computer Software and Applications Conference, 2018

Performance Analysis and Optimization of Cryo-EM Structure Determination in RELION-2.

Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

EffectFace: A Fast and Efficient Deep Neural Network Model for Face Recognition.

Proceedings of the Advanced Computer Architecture - 12th Conference, 2018

2017

iDPL: A scalable and flexible inter-continental testbed for data placement research and experiment.

Proceedings of the 2017 IEEE Symposium on Computers and Communications, 2017

PowerChief: Intelligent Power Allocation for Multi-Stage Applications to Improve Responsiveness on Power Constrained CMP.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Efficient Asynchronous Communication between Virtual Machines in Embedded Systems.

Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications; 15th IEEE International Conference on Smart City; 3rd IEEE International Conference on Data Science and Systems, 2017

Data Mining Based Root-Cause Analysis of Performance Bottleneck for Big Data Workload.

Prophet: Precise QoS Prediction on Non-Preemptive Accelerators to Improve Utilization in Warehouse-Scale Computers.

Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016

Designing Future Warehouse-Scale Computers for Sirius, an End-to-End Voice and Vision Personal Assistant.

Johann Hauswald

Michael A. Laurenzano

ACM Trans. Comput. Syst., 2016

VinaSC: Scalable Autodock Vina with fine-grained scheduling on heterogeneous platform.

Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016

Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers.

Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015

Request Squeezer: Mitigating Tail Latency through Pruned Request Replication.

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Data Analysis and Synchronization on Inter-Continent Data Placement Laboratory.

Proceedings of the International Conference on Cloud Computing and Big Data, 2015

2014

iMeter: An integrated VM power model based on performance profiling.

Future Gener. Comput. Syst., 2014

Performance-Aware Based Correlated Datasets Replication Strategy.

Lin Ye

Zhongzhi Luan

Hailong Yang

Proceedings of the Trustworthy Computing and Services - International Conference, 2014

2013

Energy Efficiency Evaluation of Workload Execution on Intel Xeon Phi Coprocessor.

Proceedings of the Trustworthy Computing and Services, 2013

Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers.

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

POIGEM: A Programming-Oriented Instruction Level GPU Energy Model for CUDA Program.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

2012

MapReduce Workload Modeling with Statistical Approach.

J. Grid Comput., 2012

Efficient Statistical Computing on Multicore and MultiGPU Systems.

Proceedings of the 15th International Conference on Network-Based Information Systems, 2012

Statistics-based Workload Modeling for MapReduce.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

UVMPM: A Unitary Approach for VM Power Metering Based on Performance Profiling.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

CPOP: Component Design and Parallelization towards POP Ocean Model Based on ESMF.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

2011

Energy Prediction for MapReduce Workloads.

Proceedings of the IEEE Ninth International Conference on Dependable, 2011

CDebugger: A scalable parallel debugger with dynamic communication topology configuration.

Proceedings of the 2011 International Conference on Cloud and Service Computing, 2011

2010

Accelerating Dock6's Amber Scoring with Graphic Processing Unit.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

Hailong Yang
