Sucheng Ren

Orcid: 0000-0003-4730-8435

According to our database, Sucheng Ren authored at least 45 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
HResFormer: Hybrid Residual Transformer for Volumetric Medical Image Segmentation.
IEEE Trans. Neural Networks Learn. Syst., June, 2025

Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers.
CoRR, May, 2025

Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation.
CoRR, February, 2025

ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning.
Trans. Mach. Learn. Res., 2025

DeepMIM: Deep Supervision for Masked Image Modeling.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Autoregressive Pretraining with Mamba in Vision.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Mamba-Reg: Vision Mamba Also Needs Registers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Unifying Global-Local Representations in Salient Object Detection With Transformers.
IEEE Trans. Emerg. Top. Comput. Intell., August, 2024

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching.
CoRR, 2024

HResFormer: Hybrid Residual Transformer for Volumetric Medical Image Segmentation.
CoRR, 2024

M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation.
CoRR, 2024

Causal Image Modeling for Efficient Visual Understanding.
CoRR, 2024

What If We Recaption Billions of Web Images with LLaMA-3?
CoRR, 2024

Medical Vision Generalist: Unifying Medical Imaging Tasks in Context.
CoRR, 2024

Mamba-R: Vision Mamba ALSO Needs Registers.
CoRR, 2024

Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation.
CoRR, 2024

Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Rejuvenating image-GPT as Strong Visual Representation Learners.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Reducing Spatial Labeling Redundancy for Active Semi-Supervised Crowd Counting.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Edge Distraction-aware Salient Object Detection.
IEEE Multim., 2023

Compress & Align: Curating Image-Text Data with Human Knowledge.
CoRR, 2023

DeepMIM: Deep Supervision for Masked Image Modeling.
CoRR, 2023

NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-supervision through Random Segments with Autoregressive Coding (RandSAC).
Proceedings of the Eleventh International Conference on Learning Representations, 2023

SG-Former: Self-guided Transformer with Evolving Token Reallocation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
The Modality Focusing Hypothesis: On the Blink of Multimodal Knowledge Distillation.
CoRR, 2022

Training-Free Robust Multimodal Learning via Sample-Wise Jacobian Regularization.
CoRR, 2022

Self-supervision through Random Segments with Autoregressive Coding (RandSAC).
CoRR, 2022

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning from Multiple Annotator Noisy Labels via Sample-Wise Label Fusion.
Proceedings of the Computer Vision - ECCV 2022, 2022

Shunted Self-Attention via Multi-Scale Token Aggregation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A Simple Data Mixing Prior for Improving Self-Supervised Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Co-advise: Cross Inductive Bias Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting.
CoRR, 2021

Unifying Global-Local Representations in Salient Object Detection with Transformer.
CoRR, 2021

Multimodal Knowledge Expansion.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

On Feature Decorrelation in Self-Supervised Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Reciprocal Transformations for Unsupervised Video Object Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Learning From the Master: Distilling Cross-Modal Advanced Knowledge for Lip Reading.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Delving Deep Into Many-to-Many Attention for Few-Shot Video Object Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
TENet: Triple Excitation Network for Video Salient Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

