Sibei Yang

Orcid: 0000-0002-8144-7351

According to our database, Sibei Yang authored at least 56 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Vision Function Layer in Multimodal LLMs.
CoRR, September, 2025

Sim-DETR: Unlock DETR for Temporal Sentence Grounding.
CoRR, September, 2025

Penalizing Boundary Activation for Object Completeness in Diffusion Models.
CoRR, September, 2025

No More Sibling Rivalry: Debiasing Human-Object Interaction Detection.
CoRR, September, 2025

TransXNet: Learning Both Global and Local Dynamics With a Dual Dynamic Token Mixer for Visual Recognition.
IEEE Trans. Neural Networks Learn. Syst., June, 2025

Beyond Jailbreaks: Revealing Stealthier and Broader LLM Security Risks Stemming from Alignment Failures.
CoRR, June, 2025

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction.
CoRR, March, 2025

DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing.
CoRR, February, 2025

Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models.
CoRR, February, 2025

Discovering Influential Neuron Path in Vision Transformers.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Rethinking Query-based Transformer for Continual Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Don't Say No: Jailbreaking LLM by Suppressing Refusal.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Plain-Det: A Plain Multi-Dataset Object Detector.
CoRR, 2024

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

The Devil is in the Object Boundary: Towards Annotation-free Instance Segmentation using Foundation Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Part2Object: Hierarchical Unsupervised 3D Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Plain-Det: A Plain Multi-dataset Object Detector.
Proceedings of the Computer Vision - ECCV 2024, 2024

WildRefer: 3D Object Localization in Large-Scale Dynamic Scenes with Multi-modal Visual Data and Natural Language.
Proceedings of the Computer Vision - ECCV 2024, 2024

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance.
ACM Trans. Graph., August, 2023

A Unified Visual Information Preservation Framework for Self-supervised Pre-Training in Medical Image Analysis.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers.
CoRR, 2023

TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition.
CoRR, 2023

WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language.
CoRR, 2023

PCRLv2: A Unified Visual Information Preservation Framework for Self-supervised Pre-training in Medical Image Analysis.
CoRR, 2023

DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Grounded Image Text Matching with Mismatched Relation Reasoning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Temporal Collection and Distribution for Referring Video Object Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Contrastive Grouping with Transformer for Referring Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CCQ: Cross-Class Query Network for Partially Labeled Organ Segmentation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Structured Attention Network for Referring Image Segmentation.
IEEE Trans. Multim., 2022

Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Relationship-Embedded Representation Learning for Grounding Referring Expressions.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

ConvNets vs. Transformers: Whose Visual Representations are More Transferable?
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Bottom-Up Shift and Reasoning for Referring Image Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Propagating Over Phrase Relations for One-Stage Visual Grounding.
Proceedings of the Computer Vision - ECCV 2020, 2020

Graph-Structured Referring Expression Reasoning in the Wild.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Dynamic Graph Attention for Referring Expression Comprehension.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Cross-Modal Relationship Inference for Grounding Referring Expressions.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Non-Local Context Encoder: Robust Biomedical Image Segmentation against Adversarial Attacks.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Multi-Evidence Filtering and Fusion for Multi-Label Classification, Object Detection and Semantic Segmentation Based on Weakly Supervised Learning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

