Wei Niu

Orcid: 0000-0002-2697-7042

Affiliations:
  • University of Georgia, Athens, GA, USA
  • College of William & Mary, Williamsburg, VA, USA (PhD)


According to our database1, Wei Niu authored at least 52 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2024
SoD<sup>2</sup>: Statically Optimizing Dynamic Deep Neural Network.
CoRR, 2024

2023
Survey: Exploiting Data Redundancy for Optimization of Deep Learning.
ACM Comput. Surv., 2023

Decentralized Application-Level Adaptive Scheduling for Multi-Instance DNNs on Open Mobile Devices.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Real-Time Segmentation on the Edge.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.
ACM Trans. Embed. Comput. Syst., September, 2022

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration.
ACM Trans. Design Autom. Electr. Syst., 2022

GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices Based on Fine-Grained Structured Weight Sparsity.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Brief Industry Paper: Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card.
Proceedings of the 28th IEEE Real-Time and Embedded Technology and Applications Symposium, 2022

SparCL: Sparse Continual Learning on the Edge.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

GCD<sup>2</sup>: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPs.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

BLCR: Towards Real-time DNN Execution with Block-based Reweighted Pruning.
Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Real-Time Portrait Stylization on the Edge.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution.
Proceedings of the Computer Vision - ECCV 2022, 2022

SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning.
CoRR, 2021

Enabling Level-4 Autonomous Driving on a Single 1 Off-the-Shelf Card.
CoRR, 2021

Achieving Real-Time Object Detection on MobileDevices with Neural Pruning Search.
CoRR, 2021

CoCoPIE: enabling real-time AI on off-the-shelf mobile devices via compression-compilation co-design.
Commun. ACM, 2021

Brief Industry Paper: Towards Real-Time 3D Object Detection for Autonomous Vehicles with Pruning Search.
Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021

Work in Progress: Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.
Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021

DNNFusion: accelerating deep neural networks execution with advanced operator fusion.
Proceedings of the PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards Fast and Accurate Multi-Person Pose Estimation on Mobile Devices.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

A Compression-Compilation Framework for On-mobile Real-time BERT Applications.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

ClickTrain: efficient and accurate end-to-end deep learning training via fine-grained architecture-preserving pruning.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

HEALS: A Parallel eALS Recommendation System on CPU/GPU Heterogeneous Platforms.
Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

Neural Pruning Search for Real-Time Object Detection of Autonomous Vehicles.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

NPAS: A Compiler-Aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Real-Time Mobile Acceleration of DNNs: From Computer Vision to Medical Applications.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

A Compression-Compilation Co-Design Framework Towards Real-Time Object Detection on Mobile Devices.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device.
CoRR, 2020

6.7ms on Mobile with over 78% ImageNet Accuracy: Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
CoRR, 2020

An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning.
CoRR, 2020

Achieving Real-Time Execution of Transformer-based Large-scale Models on Mobile with Compiler-aware Neural Architecture Optimization.
CoRR, 2020

Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices.
CoRR, 2020

A Privacy-Preserving DNN Pruning and Mobile Acceleration Framework.
CoRR, 2020

RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition.
CoRR, 2020

BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted Regularization Method.
CoRR, 2020

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices.
CoRR, 2020

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

An Image Enhancing Pattern-Based Sparsity for Real-Time Inference on Mobile Devices.
Proceedings of the Computer Vision - ECCV 2020, 2020

RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-Time Execution on Mobile Devices.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
26ms Inference Time for ResNet-50: Towards Real-Time Execution of all DNNs on Smartphone.
CoRR, 2019

2017
User-aware partitioning algorithm for mobile cloud computing based on maximum graph cuts.
Comput. Networks, 2017


  Loading...