Ao Ren

Orcid: 0000-0002-2322-8038

According to our database, Ao Ren authored at least 58 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Trustworthy Self-Attention: Enabling the Network to Focus Only on the Most Relevant References.
CoRR, 2024

YOIO: You Only Iterate Once by mining and fusing multiple necessary global information in the optical flow estimation.
CoRR, 2024

2023
FedMDS: An Efficient Model Discrepancy-Aware Semi-Asynchronous Clustered Federated Learning Framework.
IEEE Trans. Parallel Distributed Syst., March, 2023

Optimizing the Incremental Update Mechanism by Inlaying File Indexes on Flash Storage.
Proceedings of the 12th Non-Volatile Memory Systems and Applications Symposium, 2023

RadarSSD: A Computational Storage for Radar Signal Processing.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Data-Quality-Driven Federated Learning for Optimizing Communication Costs.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Re-compact: Structured Pruning and SpMM Kernel Co-design for Accelerating DNNs on GPUs.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

An Efficient Scheduling Algorithm for Multi-mode Tasks on Near-Data Processing SSDs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2023

HBP: Hierarchically Balanced Pruning and Accelerator Co-Design for Efficient DNN Inference.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Optimizing the Performance of NDP Operations by Retrieving File Semantics in Storage.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

IFHE: Intermediate-Feature Heterogeneity Enhancement for Image Synthesis in Data-Free Knowledge Distillation.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

SCRA: Systolic-Friendly DNN Compression and Reconfigurable Accelerator Co-Design.
Proceedings of the IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2023

2022
Flexible Clustered Federated Learning for Client-Level Data Distribution Shift.
IEEE Trans. Parallel Distributed Syst., 2022

SENTunnel: Fast Path for Sensor Data Access on Automotive Embedded Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

FRL: Fast and Reconfigurable Accelerator for Distributed Sound Source Localization.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Federated learning with workload-aware client scheduling in heterogeneous systems.
Neural Networks, 2022

Measuring Data Reconstruction Defenses in Collaborative Inference Systems.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CADedup: High-performance Consistency-aware Deduplication Based on Persistent Memory.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

VEA: An FPGA-Based Voxel Encoding Accelerator for 3D Object Detection with LiDAR.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

3DS: An Efficient DPDK-based Data Distribution Service for Distributed Real-time Applications.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

2021
STICKER-T: An Energy-Efficient Neural Network Processor Using Block-Circulant Algorithm and Unified Frequency-Domain Acceleration.
IEEE J. Solid State Circuits, 2021

Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

CSAFL: A Clustered Semi-Asynchronous Federated Learning Framework.
Proceedings of the International Joint Conference on Neural Networks, 2021

FedSAE: A Novel Self-Adaptive Federated Learning Framework in Heterogeneous Systems.
Proceedings of the International Joint Conference on Neural Networks, 2021

2020
3D Capsule Networks for Object Classification With Weight Pruning.
IEEE Access, 2020

DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
HEIF: Highly Efficient Stochastic Computing-Based Inference Framework for Deep Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Normalization and dropout for stochastic computing-based deep convolutional neural networks.
Integr., 2019

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks.
CoRR, 2019

A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology.
CoRR, 2019

IDE Development, Logic Synthesis and Buffer/Splitter Insertion Framework for Adiabatic Quantum-Flux-Parametron Superconducting Circuits.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

A 65nm 0.39-to-140.3TOPS/W 1-to-12b Unified Neural Network Processor Using Block-Circulant-Enabled Transpose-Domain Acceleration with 8.1× Higher TOPS/mm² and 6T HBST-TRAM-Based 2D Data-Reuse Architecture.
Proceedings of the IEEE International Solid-State Circuits Conference, 2019

A stochastic-computing based deep learning framework using adiabatic quantum-flux-parametron superconducting technology.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

A Buffer and Splitter Insertion Framework for Adiabatic Quantum-Flux-Parametron Superconducting Circuits.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019

A Majority Logic Synthesis Framework for Adiabatic Quantum-Flux-Parametron Superconducting Circuits.
Proceedings of the 2019 on Great Lakes Symposium on VLSI, 2019

ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Methods of Multipliers.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
A low-computation-complexity, energy-efficient, and high-performance linear program solver based on primal-dual interior point method using memristor crossbars.
Nano Commun. Networks, 2018

ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Method of Multipliers.
CoRR, 2018

Towards Budget-Driven Hardware Optimization for Deep Convolutional Neural Networks Using Stochastic Computing.
Proceedings of the 2018 IEEE Computer Society Annual Symposium on VLSI, 2018

An area and energy efficient design of domain-wall memory-based deep convolutional neural networks using stochastic computing.
Proceedings of the 19th International Symposium on Quality Electronic Design, 2018

Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs.
Proceedings of the 2018 on Great Lakes Symposium on VLSI, 2018

VIBNN: Hardware Acceleration of Bayesian Neural Networks.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations.
CoRR, 2017

Memristor crossbar-based ultra-efficient next-generation baseband processors.
Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017

Hardware-driven nonlinear activation for stochastic computing based deep convolutional neural networks.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Hardware Acceleration of Bayesian Neural Networks Using RAM Based Linear Feedback Gaussian Random Number Generators.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Deep reinforcement learning: Framework, applications, and embedded implementations: Invited paper.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

Ultra-fast robust compressive sensing based on memristor crossbars.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Softmax Regression Design for Stochastic Computing Based Deep Convolutional Neural Networks.
Proceedings of the on Great Lakes Symposium on VLSI 2017, 2017

Structural design optimization for deep convolutional neural networks using stochastic computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Algorithm-hardware co-optimization of the memristor-based framework for solving SOCP and homogeneous QCQP problems.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

Towards acceleration of deep convolutional neural networks using stochastic computing.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
Design of high-speed low-power polar BP decoder using emerging technologies.
Proceedings of the 29th IEEE International System-on-Chip Conference, 2016

A low-computation-complexity, energy-efficient, and high-performance linear program solver using memristor crossbars.
Proceedings of the 29th IEEE International System-on-Chip Conference, 2016

Memristor-Based Discrete Fourier Transform for Improving Performance and Energy Efficiency.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

Designing reconfigurable large-scale deep learning systems using stochastic computing.
Proceedings of the IEEE International Conference on Rebooting Computing, 2016

DSCNN: Hardware-oriented optimization for Stochastic Computing based Deep Convolutional Neural Networks.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

