Yiren Song

Orcid: 0000-0002-7028-3347

According to our database, Yiren Song authored at least 66 papers between 2022 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

AnySurf: Any Surface Generation with Directed Edge.

CoRR, May, 2026

SWEET: Sparse World Modeling with Image Editing for Embodied Task Execution.

CoRR, May, 2026

Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration.

CoRR, May, 2026

VISTA: Triplet-Supervised Video Style Transfer with Diffusion Transformers.

CoRR, May, 2026

StreamingEffect: Real-Time Human-Centric Video Effect Generation.

CoRR, May, 2026

OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation.

CoRR, May, 2026

EditTransfer++: Toward Faithful and Efficient Visual-Prompt-Guided Image Editing.

CoRR, May, 2026

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models.

CoRR, April, 2026

UENR-600K: A Large-Scale Physically Grounded Dataset for Nighttime Video Deraining.

CoRR, April, 2026

Unlocking the Latent Canvas: Eliciting and Benchmarking Symbolic Visual Expression in LLMs.

CoRR, March, 2026

SIGMA: Selective-Interleaved Generation with Multi-Attribute Tokens.

CoRR, February, 2026

MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Loom: Diffusion-Transformer for Interleaved Generation.

Mingcheng Ye

Jiaming Liu

Yiren Song

CoRR, December, 2025

Mitty: Diffusion-based Human-to-Robot Video Generation.

CoRR, December, 2025

IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning.

CoRR, December, 2025

H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos.

CoRR, December, 2025

OmniPSD: Layered PSD Generation with Diffusion Transformer.

CoRR, December, 2025

X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale.

CoRR, December, 2025

TokenPure: Watermark Removal through Tokenized Appearance and Structural Guidance.

CoRR, December, 2025

WorldWander: Bridging Egocentric and Exocentric Worlds in Video Generation.

CoRR, November, 2025

The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment.

CoRR, November, 2025

OmniRefiner: Reinforcement-Guided Local Diffusion Refinement.

CoRR, November, 2025

Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers.

Yiqing Shi

Yiren Song

Mike Zheng Shou

CoRR, November, 2025

Personalized Vision via Visual In-Context Learning.

CoRR, September, 2025

WordCon: Word-level Typography Control in Scene Text Rendering.

CoRR, June, 2025

MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks.

CoRR, June, 2025

Autoregressive Images Watermarking through Lexical Biasing: An Approach Resistant to Regeneration Attack.

CoRR, June, 2025

DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers.

CoRR, May, 2025

GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains.

CoRR, May, 2025

FocusedAD: Character-centric Movie Audio Description.

CoRR, April, 2025

TransAnimate: Taming Layer Diffusion to Generate RGBA Video.

Xuewei Chen

Zhimin Chen

Yiren Song

CoRR, March, 2025

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer.

CoRR, March, 2025

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data.

CoRR, February, 2025

MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation.

Yiren Song

Cheng Liu

Mike Zheng Shou

CoRR, February, 2025

Two-dimensional normalized knowledge distillation leveraging class relations.

J. Vis. Commun. Image Represent., 2025

StableMakeup: When Real-World Makeup Transfer Meets Diffusion Model.

Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data.

Yiren Song

Cheng Liu

Mike Zheng Shou

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

WMAdapter: Adding WaterMark Control to Latent Diffusion Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Image Watermarks are Removable using Controllable Regeneration from Clean Noise.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

DiffSim: Taming Diffusion Models for Evaluating Visual Similarity.

Yiren Song

Xiaokang Liu

Mike Zheng Shou

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer.

Yiren Song

Danze Chen

Mike Zheng Shou

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

FonTS: Text Rendering with Typography and Style Controls.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ArtEditor: Learning Customized Instructional Image Editor From Few-Shot Examples.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Any2anytryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Stable-Hair: Real-World Hair Transfer via Diffusion Model.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

GridShow: Omni Visual Generation.

CoRR, 2024

Anti-Reference: Universal and Immediate Defense Against Reference-Based Generation.

CoRR, 2024

Stable-Hair: Real-World Hair Transfer via Diffusion Model.

CoRR, 2024

Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?

CoRR, 2024

ProcessPainter: Learn Painting Process from Sequence Data.

CoRR, 2024

Fast Personalized Text-to-Image Syntheses With Attention Injection.

CoRR, 2024

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model.

CoRR, 2024

ProcessPainter: Learning to draw from sequence data.

Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

Can Simple Averaging Defeat Modern Watermarks?

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Fast Personalized Text to Image Synthesis with Attention Injection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-key Identification.

Proceedings of the Computer Vision - ECCV 2024, 2024

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Multi-sensor measurement fusion based on minimum mixture error entropy with non-Gaussian measurement noise.

Digit. Signal Process., 2022

CLIPTexture: Text-Driven Texture Synthesis.

Yiren Song

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

CLIPFont: Text Guided Vector WordArt Generation.

Yiren Song

Yuxuan Zhang

Proceedings of the 33rd British Machine Vision Conference 2022, 2022
