Lin Song

Affiliations:

Tencent AILab, Shenzhen, China
Xi'an Jiaotong University, College of Artificial Intelligence, Xi'an, China (PhD)

According to our database, Lin Song authored at least 32 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective.

CoRR, September, 2025

HaploOmni: Unified Single Transformer for Multimodal Video Understanding and Generation.

CoRR, June, 2025

TensorAR: Refinement is All You Need in Autoregressive Image Generation.

CoRR, May, 2025

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO.

CoRR, May, 2025

HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding.

CoRR, March, 2025

LoRA-Gen: Specializing Large Language Model via Online LoRA Generation.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

YOLO-UniOW: Efficient Universal Open-World Object Detection.

CoRR, 2024

GrootVL: Tree Topology is All You Need in State Space Model.

CoRR, 2024

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.

CoRR, 2024

MambaTree: Tree Topology is All You Need in State Space Model.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

InstructDET: Diversifying Referring Object Detection with Generalized Instructions.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

YOLO-World: Real-Time Open-Vocabulary Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection.

CoRR, 2023

Sticker820K: Empowering Interactive Retrieval with Stickers.

CoRR, 2023

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

BoxSnake: Polygonal Instance Segmentation with Box Supervision.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection.

CoRR, 2022

2021

Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge.

CoRR, 2021

Dynamic Grained Encoder for Vision Transformers.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

End-to-End Object Detection With Fully Convolutional Network.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

GLNet: Global Local Network for Weakly Supervised Action Localization.

IEEE Trans. Multim., 2020

Fine-Grained Dynamic Head for Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Rethinking Learnable Tree Filter for Generic Feature Transform.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning Dynamic Routing for Semantic Segmentation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

NIPM-sWMF: Toward Efficient FPGA Design for High-Definition Large-Disparity Stereo Matching.

IEEE Trans. Circuits Syst. Video Technol., 2019

Learnable Tree Filter for Structure-preserving Feature Transform.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
