Bin Lin

Orcid: 0009-0003-4805-9730

Affiliations:

Peking University, Shenzhen Graduate School, Rabbitpre Intelligence, China

According to our database¹, Bin Lin authored at least 35 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning.

CoRR, May, 2026

Manifold-Aware Exploration for Reinforcement Learning in Video Generation.

CoRR, March, 2026

iFSQ: Improving FSQ for Image Generation with 1 Line of Code.

CoRR, January, 2026

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models.

IEEE Trans. Multim., 2026

Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-Based Large Language Models.

Comput. Vis. Media, 2026

Look-Back: Implicit Visual Re-focusing in MLLM Reasoning.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Next Patch Prediction for AutoRegressive Visual Generation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

360Explorer: Exploring 4D Controllable World in Panoramic Videos.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization.

CoRR, November, 2025

Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward.

CoRR, November, 2025

Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback.

CoRR, October, 2025

MagicTime: Time-Lapse Video Generation Models as Metamorphic Simulators.

IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation.

CoRR, September, 2025

Unified Multimodal Model as Auto-Encoder.

CoRR, September, 2025

UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation.

CoRR, June, 2025

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation.

CoRR, May, 2025

SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video.

CoRR, March, 2025

WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation.

CoRR, March, 2025

TaxDiff: taxonomic-guided diffusion model for protein sequence generation.

Sci. China Inf. Sci., 2025

OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

ImgEdit: A Unified Image Editing Dataset and Benchmark.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Next Patch Prediction for Autoregressive Visual Generation.

CoRR, 2024

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses.

CoRR, 2024

Open-Sora Plan: Open-Source Large Video Generation Model.

CoRR, 2024

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions.

CoRR, 2024

UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark.

CoRR, 2024

TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation.

CoRR, 2024

LLMBind: A Unified Modality-Task Integration Framework.

CoRR, 2024

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models.

CoRR, 2024

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment.

CoRR, 2023
