Shusheng Yang

According to our database, Shusheng Yang authored at least 31 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Cambrian-P: Pose-Grounded Video Understanding.
CoRR, May, 2026

Better early detector for high-performance detection transformer.
Image Vis. Comput., 2026

Multi-level reinforcement learning with agent-based simulation for dynamic concrete scheduling in high-speed railway construction.
Appl. Soft Comput., 2026

High-speed railway operational safety indicator assessment under earthquakes: a physics-informed hybrid prediction framework.
Adv. Eng. Informatics, 2026

2025
Cambrian-S: Towards Spatial Supersensing in Video.
CoRR, November, 2025

Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts.
CoRR, November, 2025

BLIP3o-NEXT: Next Frontier of Native Image Generation.
CoRR, October, 2025

VideoNSA: Native Sparse Attention Scales Video Understanding.
CoRR, October, 2025

Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity.
CoRR, March, 2025

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
ViTMatte: Boosting image matting with pre-trained plain vision transformers.
Inf. Fusion, March, 2024

Biobjective optimization for railway alignment fine-grained designs with parallel existing railways.
Comput. Aided Civ. Infrastructure Eng., February, 2024

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs.
CoRR, 2024

The static ridesharing routing problem with flexible locations: A Norwegian case study.
Comput. Oper. Res., 2024

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Rethinking Pragmatics in Large Language Models: Towards Open-Ended Evaluation and Preference Tuning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MobileInst: Video Instance Segmentation on the Mobile.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
A sequential exploration algorithm for the design optimization of horizontal road alignment.
Comput. Aided Civ. Infrastructure Eng., October, 2023

Qwen Technical Report.
CoRR, 2023

TouchStone: Evaluating Vision-Language Models by Language Models.
CoRR, 2023

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities.
CoRR, 2023

ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers.
CoRR, 2023

Masked Visual Reconstruction in Language Semantic Space.
CoRR, 2023

Masked Image Modeling with Denoising Contrast.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

RILS: Masked Visual Reconstruction in Language Semantic Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Relational Surrogate Loss Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Temporally Efficient Vision Transformer for Video Instance Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Tracking Instances as Queries.
CoRR, 2021

Crossover Learning for Fast Online Video Instance Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Instances as Queries.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

