Yang Sui

Orcid: 0000-0003-3020-0612

Affiliations:
  • Rutgers University, Department of Electrical and Computer Engineering, Piscataway, NJ, USA (PhD 2024)


According to our database, Yang Sui authored at least 41 papers between 2021 and 2025.

Bibliography

2025
Pruning 3D Convolutional Neural Networks via Channel Independence.
J. Signal Process. Syst., December, 2025

LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition.
CoRR, September, 2025

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios.
CoRR, July, 2025

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models.
CoRR, May, 2025

HoliTom: Holistic Token Merging for Fast Video Large Language Models.
CoRR, May, 2025

Co-Exploring Structured Sparsification and Low-Rank Tensor Decomposition for Compact DNNs.
IEEE Trans. Neural Networks Learn. Syst., April, 2025

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float.
CoRR, April, 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models.
CoRR, March, 2025

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models.
CoRR, March, 2025

Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization.
CoRR, February, 2025

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models.
Trans. Mach. Learn. Res., 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models.
Trans. Mach. Learn. Res., 2025

TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

iTAP: An Incremental Task Graph Partitioner for Task-parallel Static Timing Analysis.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024
Corner-to-Center long-range context model for efficient learned image compression.
J. Vis. Commun. Image Represent., 2024

Understanding Artificial Neural Network's Behavior from Neuron Activation Perspective.
CoRR, 2024

MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition.
CoRR, 2024

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks.
CoRR, 2024

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MOPED: Efficient Motion Planning Engine with Flexible Dimension Support.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Clean and Compact: Efficient Data-Free Backdoor Defense with Model Compactness.
Proceedings of the Computer Vision - ECCV 2024, 2024

Reconstruction Distortion of Learned Image Compression with Imperceptible Perturbations.
Proceedings of the Data Compression Conference, 2024

Invited: Algorithm and Hardware Co-Design for Energy-Efficient Neural SLAM.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Transferable Learned Image Compression-Resistant Adversarial Perturbations.
Proceedings of the 35th British Machine Vision Conference, 2024

2023
In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar.
CoRR, 2023

Learning-based Homography Matrix Optimization for Dual-fisheye Video Stitching.
Proceedings of the 2023 Workshop on Emerging Multimedia Systems, 2023

ETTE: Efficient Tensor-Train-based Computing Engine for Deep Neural Networks.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

DynGMP: Graph Neural Network-Based Motion Planning in Unpredictable Dynamic Environments.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

Invited Paper: In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

DSPIMM: A Fully Digital SParse In-Memory Matrix Vector Multiplier for Communication Applications.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

CSTAR: Towards Compact and Structured Deep Neural Networks with Adversarial Robustness.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition.
CoRR, 2022

HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
CHIP: CHannel Independence-based Pruning for Compact Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

GoSPA: An Energy-efficient High-performance Globally Optimized SParse Convolutional Neural Network Accelerator.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Algorithm and Hardware Co-design for Deep Learning-powered Channel Decoder: A Case Study.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Towards Efficient Tensor Decomposition-Based DNN Model Compression With Optimization Framework.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

