Bin Lin

Affiliations:
  • Peking University, Shenzhen Graduate School, Rabbitpre Intelligence, China


According to our database, Bin Lin authored at least 23 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
MagicTime: Time-Lapse Video Generation Models as Metamorphic Simulators.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

Look-Back: Implicit Visual Re-focusing in MLLM Reasoning.
CoRR, July, 2025

UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation.
CoRR, June, 2025

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation.
CoRR, May, 2025

ImgEdit: A Unified Image Editing Dataset and Benchmark.
CoRR, May, 2025

SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video.
CoRR, March, 2025

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation.
CoRR, March, 2025

TaxDiff: taxonomic-guided diffusion model for protein sequence generation.
Sci. China Inf. Sci., 2025

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Next Patch Prediction for Autoregressive Visual Generation.
CoRR, 2024

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses.
CoRR, 2024

Open-Sora Plan: Open-Source Large Video Generation Model.
CoRR, 2024

OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model.
CoRR, 2024

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions.
CoRR, 2024

UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark.
CoRR, 2024

TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation.
CoRR, 2024

LLMBind: A Unified Modality-Task Integration Framework.
CoRR, 2024

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models.
CoRR, 2024

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models.
CoRR, 2023

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.
CoRR, 2023

