Hao Fu

Orcid: 0000-0002-4349-8748

Affiliations:
  • National Supercomputing Center in Tianjin, Tianjin, China


According to our database, Hao Fu authored at least 16 papers between 2015 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
A Survey on Efficiency Optimization Techniques for DNN-based Video Analytics: Process Systems, Algorithms, and Applications.
CoRR, July, 2025

MixLoRA: An Efficient Multi-Tenant Framework for Concurrently Serving Diverse LoRA Models in Large Language Models.
Proceedings of the 54th International Conference on Parallel Processing, 2025

2024
A large-scale heterogeneous computing framework for non-uniform sampling two-dimensional convolution applications.
CCF Trans. High Perform. Comput., April, 2024

Fairness-Efficiency Scheduling for Pay-as-You-Go Shared Caching Systems With Long-Term Fairness Guarantees.
IEEE Trans. Serv. Comput., 2024

Accuracy-Efficiency Optimization for Multi-Stage Small Object Detection in Surveillance Video with Collaborative Frame Sampling.
Proceedings of the IEEE International Conference on Cluster Computing, 2024

2022
A method for efficient radio astronomical data gridding on multi-core vector processor.
Parallel Comput., 2022

Long-Term Fairness Scheduler for Pay-as-You-Use Cache Sharing Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

EasyNUSC: An Efficient Heterogeneous Computing Framework for Non-uniform Sampling Two-Dimensional Convolution Applications.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

2021
HGP4CNN: an efficient parallelization framework for training convolutional neural networks on modern GPUs.
J. Supercomput., 2021

DVQShare: An Analytics System for DNN-based Video Queries.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020
Accelerating Exact Constrained Shortest Paths on GPUs.
Proc. VLDB Endow., 2020

2019
ASW: Accelerating Smith-Waterman Algorithm on Coupled CPU-GPU Architecture.
Int. J. Parallel Program., 2019

2018
GLP4NN: A Convergence-invariant and Network-agnostic Light-weight Parallelization Framework for Deep Neural Networks on Modern GPUs.
Proceedings of the 47th International Conference on Parallel Processing, 2018

2017
KD-Tree and HEALPix-Based Distributed Cone Search Indexing System for Multi-Band Astronomical Catalogs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2017

2015
A List Scheduling Algorithm for DAG-Based Parallel Computing Models.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

A Multilevel Fault-Tolerance Technique for the DAG Data Driven Model.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

