Yi Liu

ORCID: 0000-0003-1829-2817

Affiliations:
  • Beihang University, Sino-German Joint Software Institute, Beijing, China
  • Xi'an Jiaotong University, Computer Science Department, China (PhD 2000)


According to our database, Yi Liu authored at least 93 papers between 2003 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Adaptive Auto-Tuning Framework for Global Exploration of Stencil Optimization on GPUs.
IEEE Trans. Parallel Distributed Syst., January, 2024

Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding.
CoRR, 2024

2023
HAOTuner: A Hardware Adaptive Operator Auto-Tuner for Dynamic Shape Tensor Compilers.
IEEE Trans. Computers, November, 2023

Improving Log-Based Anomaly Detection by Pre-Training Hierarchical Transformers.
IEEE Trans. Computers, September, 2023

Software approaches for resilience of high performance computing systems: a survey.
Frontiers Comput. Sci., August, 2023

Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
Computer, August, 2023

swSpAMM: optimizing large-scale sparse approximate matrix multiplication on Sunway Taihulight.
Frontiers Comput. Sci., 2023

LogQA: Question Answering in Unstructured Logs.
CoRR, 2023

EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs.
Proceedings of the International Conference for High Performance Computing, 2023

Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022
iBalancer: Load-Aware in-Server Flow Scheduling for Sub-Millisecond Tail Latency.
IEEE Trans. Parallel Distributed Syst., 2022

Efficient detection of silent data corruption in HPC applications with synchronization-free message verification.
J. Supercomput., 2022

Magas: matrix-based asynchronous graph analytics on shared memory systems.
J. Supercomput., 2022

Accelerating approximate matrix multiplication for near-sparse matrices on GPUs.
J. Supercomput., 2022

Input-Aware Sparse Tensor Storage Format Selection for Optimizing MTTKRP.
IEEE Trans. Computers, 2022

EasyScale: Accuracy-consistent Elastic Training for Deep Learning.
CoRR, 2022

FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity.
CoRR, 2022

CoGNN: Efficient Scheduling for Concurrent GNN Training on GPUs.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

StencilMART: Predicting Optimization Selection for Stencil Computations across GPUs.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

Toward accelerated stencil computation by adapting tensor core unit on GPU.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Black-box Attacks to Log-based Anomaly Detection.
Proceedings of the 18th International Conference on Network and Service Management, 2022

2021
The Deep Learning Compiler: A Comprehensive Survey.
IEEE Trans. Parallel Distributed Syst., 2021

ELS: Emulation system for debugging and tuning large-scale parallel programs on small clusters.
J. Supercomput., 2021

swMR: A Framework for Accelerating MapReduce Applications on Sunway Taihulight.
IEEE Trans. Emerg. Top. Comput., 2021

User-level failure detection and auto-recovery of parallel programs in HPC systems.
Frontiers Comput. Sci., 2021

Mutual calibration training: Training deep neural networks with noisy labels using dual-models.
Comput. Vis. Image Underst., 2021

Accelerating Sparse Approximate Matrix Multiplication on GPUs.
CoRR, 2021

Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

csTuner: Scalable Auto-tuning Framework for Complex Stencil Computation on GPUs.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020
Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.
IEEE Trans. Parallel Distributed Syst., 2020

HitAnomaly: Hierarchical Transformers for Anomaly Detection in System Log.
IEEE Trans. Netw. Serv. Manag., 2020

Processing graphs with barrierless asynchronous parallel model on shared-memory systems.
Future Gener. Comput. Syst., 2020

The Deep Learning Compiler: A Comprehensive Survey.
CoRR, 2020

FT-PBLAS: PBLAS-Based Fault-Tolerant Linear Algebra Computation on High-performance Computing Systems.
IEEE Access, 2020

SpTFS: sparse tensor format selection for MTTKRP via deep learning.
Proceedings of the International Conference for High Performance Computing, 2020

Paddy: An Event Log Parsing Approach using Dynamic Dictionary.
Proceedings of the NOMS 2020, 2020

Real-Time Polyp Detection for Colonoscopy Video on CPU.
Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence, 2020

A Gated Few-shot Learning Model For Anomaly Detection.
Proceedings of the 2020 International Conference on Information Networking, 2020

Transfer Log-based Anomaly Detection with Pseudo Labels.
Proceedings of the 16th International Conference on Network and Service Management, 2020

2019
Improving parallel efficiency for asynchronous graph analytics using Gauss-Seidel-based matrix computation.
Concurr. Comput. Pract. Exp., 2019

SunwayImg: A Parallel Image Processing Library for the Sunway Many-Core Processor.
IEEE Access, 2019

Performance Evaluation and Analysis of Linear Algebra Kernels in the Prototype Tianhe-3 Cluster.
Proceedings of the Supercomputing Frontiers - 5th Asian Conference, 2019

Multiple Algorithms Against Multiple Hardware Architectures: Data-Driven Exploration on Deep Convolution Neural Network.
Proceedings of the Network and Parallel Computing, 2019

Structure Characteristic-Aware Pruning Strategy for Convolutional Neural Networks.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

Generative Model for Probabilistic Inference.
Proceedings of the 2019 IEEE Intl Conf on Dependable, 2019

SMQoS: Improving Utilization and Energy Efficiency with QoS Awareness on GPUs.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

2018
A Lightweight and Flexible Tool for Distinguishing Between Hardware Malfunctions and Program Bugs in Debugging Large-Scale Programs.
IEEE Access, 2018

HPC-SFI: System-Level Fault Injection for High Performance Computing Systems.
Proceedings of the Network and Parallel Computing, 2018

A Fine-Grained Performance Bottleneck Analysis Method for HDFS.
Proceedings of the Network and Parallel Computing, 2018

Re-Running Large-Scale Parallel Programs Using Two Nodes.
Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

Block-Checksum-Based Fault Tolerance for Matrix Multiplication on Large-Scale Parallel Systems.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Mitigating I/O Impact of Checkpointing on Large Scale Parallel Systems.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Multi-role SpTRSV on Sunway Many-Core Architecture.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

2017
ParaFlow: Fine-grained parallel SDN controller for large-scale networks.
J. Netw. Comput. Appl., 2017

Controller-proxy: Scaling network management for large-scale SDN networks.
Comput. Commun., 2017

Flow Stealer: lightweight load balancing by stealing flows in distributed SDN controllers.
Sci. China Inf. Sci., 2017

2016
Coordinating workload balancing and power switching in renewable energy powered data center.
Frontiers Comput. Sci., 2016

DScheduler: Dynamic Network Scheduling Method for MapReduce in Distributed Controllers.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

Parallel Image Processing on the Sunway Many-Core Processor.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Restricted Boltzmann Machines and Deep Belief Networks on Sunway Cluster.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

2015
SEIP: System for Efficient Image Processing on Distributed Platform.
J. Comput. Sci. Technol., 2015

Online Replacement of Distributed Controllers in Software Defined Networks.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

JellyFish: Online Performance Tuning with Adaptive Configuration and Elastic Container in Hadoop Yarn.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

2014
Lightweight dynamic partitioning for last-level cache of multicore processor on real system.
J. Supercomput., 2014

Paraio: A scalable network I/O framework for many-core systems.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Efficient Work-Stealing with Blocking Deques.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

2013
Partition-Based Hardware Transactional Memory for Many-Core Processors.
Proceedings of the Network and Parallel Computing - 10th IFIP International Conference, 2013

SimNUMA: Simulating NUMA-Architecture Multiprocessor Systems Efficiently.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

PT: A Lightweight Job Scheduler with Remote Interface for Multiprocessors.
Proceedings of the 16th IEEE International Conference on Computational Science and Engineering, 2013

Pipeline-Based Parallel Framework for Mass File Processing.
Proceedings of the 2013 International Conference on Cloud and Service Computing, 2013

2012
Measuring and Visualizing Thread Communications for Pthread Applications.
Proceedings of the 13th International Conference on Parallel and Distributed Computing, 2012

MOLTS: Mobile Object Localization and Tracking System Based on Wireless Sensor Networks.
Proceedings of the Seventh IEEE International Conference on Networking, 2012

2010
Throughput maximization with bargaining game in cognitive radio networks.
Proceedings of the 3rd IFIP Wireless Days Conference 2010, 2010

A Novel Scheme for High Performance Finite-Difference Time-Domain (FDTD) Computations Based on GPU.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

I/O Feature-based File Prefetching for Multi-Applications.
Proceedings of the GCC 2010, 2010

Video Streaming over Wireless Mesh Networks with Multi-Gateway Support.
Proceedings of the IEEE/IFIP 8th International Conference on Embedded and Ubiquitous Computing, 2010

Efficient Transaction Nesting in Hardware Transactional Memory.
Proceedings of the Architecture of Computing Systems, 2010

2009
A Novel Spatio-Temporal Attributes Index Based Query for Wireless Sensor Networks.
Int. J. Distributed Sens. Networks, 2009

Intra-flow Network Coding Based Multipath Routing Protocol for Event-Driven Wireless Sensor Networks.
Proceedings of the MSN 2009, 2009

RRDD: Receiver-oriented Robust Data Delivery in Mobile Sensor Networks.
Proceedings of the IEEE 6th International Conference on Mobile Adhoc and Sensor Systems, 2009

A Heuristic Energy-aware Scheduling Algorithm for Heterogeneous Clusters.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Embedded Processors in Heterogeneous Architectures for Web Servers.
Proceedings of the 2009 International Conference on Internet Computing, 2009

2008
Auto-configuration of Shared Network-layer Address in Cluster-based Wireless Sensor Network.
Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2008

Hardware Transactional Memory Supporting I/O Operations within Transactions.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

2007
Handover for Seamless Stream Media in Mobile IPv6 Network.
Proceedings of the Wired/Wireless Internet Communications, 5th International Conference, 2007

An Approach of End-to-End DiffServ/MPLS QoS Context Transfer in HMIPv6 Net.
Proceedings of the International Symposium on Autonomous Decentralized Systems (ISADS 2007), 2007

An On-demand Address Allocation Scheme for Query based Sensor Networks.
Proceedings of the International Symposium on Autonomous Decentralized Systems (ISADS 2007), 2007

EDDS: An Efficient Data Delivery Scheme for Address-Free Wireless Sensor Networks.
Proceedings of the Sixth International Conference on Networking (ICN 2007), 2007

2006
RSVP Context Extraction in IP Mobility Environments.
Proceedings of the 63rd IEEE Vehicular Technology Conference, 2006

2005
Mapping Resources for Network Emulation with Heuristic and Genetic Algorithms.
Proceedings of the Sixth International Conference on Parallel and Distributed Computing, 2005

Solving Network Testbed Mapping Problem with Genetic Algorithm.
Proceedings of the Artificial Intelligence Applications and Innovations - IFIP TC12 WG12.5, 2005

2004
A Framework for End-to-End QoS Context Transfer in Mobile IPv6.
Proceedings of the Personal Wireless Communications, IFIP TC6 9th International Conference, 2004

2003
Site-Role Based GreedyDual-Size Replacement Algorithm.
Proceedings of the Advances in Web-Age Information Management, 2003

