Yang Sui

ORCID: 0000-0003-3020-0612

Affiliations:

Rutgers University, Department of Electrical and Computer Engineering, Piscataway, NJ, USA (PhD 2024)

According to our database¹, Yang Sui authored at least 46 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2026

TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload.

CoRR, May, 2026

ATA: Bridging Implicit Reasoning with Attention-Guided and Action-Guided Inference for Vision-Language Action Models.

CoRR, March, 2026

A Survey of Token Compression for Efficient Multimodal Large Language Models.

Trans. Mach. Learn. Res., 2026

2025

Pruning 3D Convolutional Neural Networks via Channel Independence.

J. Signal Process. Syst., December, 2025

EcoSpa: Efficient Transformer Training with Coupled Sparsity.

CoRR, November, 2025

LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition.

CoRR, September, 2025

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios.

CoRR, July, 2025

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models.

CoRR, May, 2025

Co-Exploring Structured Sparsification and Low-Rank Tensor Decomposition for Compact DNNs.

IEEE Trans. Neural Networks Learn. Syst., April, 2025

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float.

Anshumali Shrivastava

CoRR, April, 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models.

CoRR, March, 2025

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models.

CoRR, March, 2025

Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization.

CoRR, February, 2025

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models.

Trans. Mach. Learn. Res., 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models.

Trans. Mach. Learn. Res., 2025

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11).

Tianyi Zhang

Mohsen Hariri

Shaochen (Henry) Zhong

Vipin Chaudhary

Yang Sui

Xia Hu

Anshumali Shrivastava

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

HoliTom: Holistic Token Merging for Fast Video Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model.

Ponnuswamy Sadayappan

Xia Hu

Bo Yuan

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

iTAP: An Incremental Task Graph Partitioner for Task-parallel Static Timing Analysis.

Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

2024

Corner-to-Center long-range context model for efficient learned image compression.

J. Vis. Commun. Image Represent., 2024

Understanding Artificial Neural Network's Behavior from Neuron Activation Perspective.

Yizhou Zhang

Yang Sui

CoRR, 2024

MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition.

CoRR, 2024

ELRT: Efficient Low-Rank Training for Compact Convolutional Neural Networks.

CoRR, 2024

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MOPED: Efficient Motion Planning Engine with Flexible Dimension Support.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

MoE-I²: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Clean and Compact: Efficient Data-Free Backdoor Defense with Model Compactness.

Proceedings of the Computer Vision - ECCV 2024, 2024

Reconstruction Distortion of Learned Image Compression with Imperceptible Perturbations.

Proceedings of the Data Compression Conference, 2024

Invited: Algorithm and Hardware Co-Design for Energy-Efficient Neural SLAM.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Transferable Learned Image Compression-Resistant Adversarial Perturbations.

Proceedings of the 35th British Machine Vision Conference, 2024

2023

In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar.

CoRR, 2023

Learning-based Homography Matrix Optimization for Dual-fisheye Video Stitching.

Proceedings of the 2023 Workshop on Emerging Multimedia Systems, 2023

ETTE: Efficient Tensor-Train-based Computing Engine for Deep Neural Networks.

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

DynGMP: Graph Neural Network-Based Motion Planning in Unpredictable Dynamic Environments.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

Invited Paper: In-Sensor Radio Frequency Computing for Energy-Efficient Intelligent Radar.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

DSPIMM: A Fully Digital SParse In-Memory Matrix Vector Multiplier for Communication Applications.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

HALOC: Hardware-Aware Automatic Low-Rank Compression for Compact Neural Networks.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

CSTAR: Towards Compact and Structured Deep Neural Networks with Adversarial Robustness.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition.

CoRR, 2022

HODEC: Towards Efficient High-Order DEcomposed Convolutional Neural Networks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

CHIP: CHannel Independence-based Pruning for Compact Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

GoSPA: An Energy-efficient High-performance Globally Optimized SParse Convolutional Neural Network Accelerator.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Algorithm and Hardware Co-design for Deep Learning-powered Channel Decoder: A Case Study.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Towards Efficient Tensor Decomposition-Based DNN Model Compression With Optimization Framework.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
