Yuren Cong

ORCID: 0000-0001-7505-8563

According to our database, Yuren Cong authored at least 23 papers between 2020 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

WavFlow: Audio Generation in Waveform Space.

CoRR, May, 2026

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation.

CoRR, April, 2026

SPAN: Learning Similarity Between Scene Graphs and Images With Transformers.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2026

2025

HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming.

Juan-Manuel Pérez-Rúa

CoRR, December, 2025

Scaling Zero-Shot Reference-to-Video Generation.

CoRR, December, 2025

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models.

Juan-Manuel Pérez-Rúa

CoRR, December, 2025

Mixture of States: Routing Token-Level Dynamics for Multimodal Generation.

Juan-Manuel Pérez-Rúa

CoRR, November, 2025

Attribute-Centric Compositional Text-to-Image Generation.

Int. J. Comput. Vis., July, 2025

FDSG: Forecasting Dynamic Scene Graphs.

CoRR, June, 2025

PanoSCU: A Simulation-Based Dataset for Panoramic Indoor Scene Understanding.

IEEE Access, 2025

Learning Flow Fields in Attention for Controllable Person Image Generation.

Juan-Manuel Pérez-Rúa

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Holistic scene understanding through image and video scene graphs.

Yuren Cong

PhD thesis, 2024

Indoor Scene Change Understanding (SCU): Segment, Describe, and Revert Any Change.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

Worldafford: Affordance Grounding Based on Natural Language Instructions.

Changmao Chen

Yuren Cong

Zhen Kan

Proceedings of the 36th IEEE International Conference on Tools with Artificial Intelligence, 2024

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing.

Juan-Manuel Pérez-Rúa

Bodo Rosenhahn

Tao Xiang

Sen He

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Segment Any Object Model (SAOM): Real-To-Simulation Fine-Tuning Strategy For Multi-Class Multi-Instance Segmentation.

Proceedings of the IEEE International Conference on Image Processing, 2024

GenTron: Diffusion Transformers for Image and Video Generation.

Juan-Manuel Pérez-Rúa

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

RelTR: Relation Transformer for Scene Graph Generation.

Yuren Cong

Michael Ying Yang

Bodo Rosenhahn

IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation.

Juan-Manuel Pérez-Rúa

CoRR, 2023

Learning Similarity between Scene Graphs and Images with Transformers.

CoRR, 2023

SSGVS: Semantic Scene Graph-to-Video Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2021

Spatial-Temporal Transformer for Dynamic Scene Graph Generation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

NODIS: Neural Ordinary Differential Scene Understanding.

Proceedings of the Computer Vision - ECCV 2020, 2020
