Sicong Leng

ORCID: 0000-0002-3084-5026

According to our database, Sicong Leng authored at least 17 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning.
CoRR, July, 2025

Two Is Better Than One: Rotations Scale LoRAs.
CoRR, May, 2025

Advancing Expert Specialization for Better MoE.
CoRR, May, 2025

Refining Positive and Toxic Samples for Dual Safety Self-Alignment of LLMs with Minimal Human Interventions.
CoRR, February, 2025

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding.
CoRR, January, 2025

Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss.
CoRR, 2024

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio.
CoRR, 2024

AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention.
CoRR, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs.
CoRR, 2024

BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Constrained Layout Generation with Factor Graphs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Tell2Design: A Dataset for Language-Guided Floor Plan Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2021
Interventional Video Grounding With Dual Contrastive Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

