Manyuan Zhang
ORCID: 0009-0003-2148-1085
According to our database, Manyuan Zhang authored at least 46 papers between 2018 and 2026.
Bibliography
2026
CoRR, May, 2026
CoRR, May, 2026
CoRR, March, 2026
AutoWeather4D: Autonomous Driving Video Weather Conversion via G-Buffer Dual-Pass Editing.
CoRR, March, 2026
CoRR, March, 2026
RPiAE: A Representation-Pivoted Autoencoder Enhancing Both Image Generation and Editing.
CoRR, March, 2026
CoRR, March, 2026
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence.
CoRR, February, 2026
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention.
CoRR, February, 2026
Proceedings of the International Conference on 3D Vision, 2026
2025
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation.
CoRR, December, 2025
CoRR, December, 2025
AlignVid: Training-Free Attention Scaling for Semantic Fidelity in Text-Guided Image-to-Video Generation.
CoRR, December, 2025
CoRR, November, 2025
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation.
CoRR, November, 2025
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark.
CoRR, October, 2025
CoRR, October, 2025
CoRR, October, 2025
CoRR, October, 2025
ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping.
CoRR, October, 2025
LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding.
CoRR, September, 2025
CoRR, March, 2025
LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
2024
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024
Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Proceedings of the Computer Vision - ECCV 2022, 2022
2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
2020
Complementary Boundary Generator with Scale-Invariant Relation Modeling for Temporal Action Localization: Submission to ActivityNet Challenge 2020.
CoRR, 2020
CoRR, 2020
Proceedings of the Computer Vision - ECCV 2020, 2020
2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019
2018
Proceedings of the 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications, 2018
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018