Qinglin Lu

Orcid: 0000-0002-4584-0826

According to our database, Qinglin Lu authored at least 80 papers between 2015 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models.
CoRR, April, 2026

Meta-CoT: Enhancing Granularity and Generalization in Image Editing.
CoRR, April, 2026

SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models.
CoRR, April, 2026

OmniCamera: A Unified Framework for Multi-task Video Generation with Arbitrary Camera Control.
CoRR, April, 2026

Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling.
CoRR, April, 2026

VisionCreator-R1: A Reflection-Enhanced Native Visual-Generation Agentic Model.
CoRR, March, 2026

EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation.
CoRR, March, 2026

VisionCreator: A Native Visual-Generation Agentic Model with Understanding, Thinking, Planning and Creation.
CoRR, March, 2026

Generative Visual Chain-of-Thought for Image Editing.
CoRR, March, 2026

ChatUMM: Robust Context Tracking for Conversational Interleaved Generation.
CoRR, February, 2026

OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention.
CoRR, February, 2026

Euphonium: Steering Video Flow Matching via Process Reward Gradient Guided Stochastic Dynamics.
CoRR, February, 2026

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars.
CoRR, February, 2026

TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts.
CoRR, January, 2026

TAGRPO: Boosting GRPO on Image-to-Video Generation with Direct Trajectory Alignment.
CoRR, January, 2026

Rotate Your Character: Revisiting Video Diffusion Models for High-Quality 3D Character Generation.
CoRR, January, 2026

Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing.
CoRR, January, 2026

Phased One-Step Adversarial Equilibrium for Video Diffusion Models.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
SoliReward: Mitigating Susceptibility to Reward Hacking and Annotation Noise in Video Generation Reward Models.
CoRR, December, 2025

StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars.
CoRR, December, 2025

ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars.
CoRR, December, 2025

USV: Unified Sparsification for Accelerating Video Diffusion Models.
CoRR, December, 2025

Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model.
CoRR, November, 2025

JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization.
CoRR, November, 2025

Harmony: Harmonizing Audio and Video Generation through Cross-Task Synergy.
CoRR, November, 2025

Video Generation Models Are Good Latent Reward Models.
CoRR, November, 2025

UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions.
CoRR, November, 2025

UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation.
CoRR, October, 2025

Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs.
CoRR, October, 2025

Pack and Force Your Memory: Long-form and Consistent Video Generation.
CoRR, October, 2025

Arbitrary Generative Video Interpolation.
CoRR, October, 2025

HunyuanImage 3.0 Technical Report.
CoRR, September, 2025

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference.
CoRR, September, 2025

PromptEnhancer: A Simple Approach to Enhance Text-to-Image Models via Chain-of-Thought Prompt Rewriting.
CoRR, September, 2025

POSE: Phased One-Step Adversarial Equilibrium for Video Diffusion Models.
CoRR, August, 2025

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning.
CoRR, August, 2025

PersonaVlog: Personalized Multimodal Vlog Generation with Multi-Agent Collaboration and Iterative Self-Correction.
CoRR, August, 2025

X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again.
CoRR, July, 2025

Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition.
CoRR, June, 2025

HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation.
CoRR, June, 2025

PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement.
CoRR, June, 2025

OmniV2V: Versatile Video Generation and Editing via Dynamic Content Manipulation.
CoRR, June, 2025

HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation for Multiple Characters.
CoRR, May, 2025

Hunyuan-Game: Industrial-grade Intelligent Game Creation Model.
CoRR, May, 2025

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation.
CoRR, May, 2025

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning.
CoRR, May, 2025

InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework.
CoRR, April, 2025

LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models.
IEEE Trans. Image Process., 2025

HOMA: Towards Generic Human-Object Interaction in Multimodal Driven Human Animation with Weak Conditions.
Proceedings of the SIGGRAPH Asia 2025 Conference Papers, 2025

DialogGen: Multi-modal Interactive Dialogue System with Multi-turn Text-Image Generation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Audio-Visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Concept-Edge Fusion: Background Generation for Product Presentation Based on Text-to-Image Model.
Proceedings of the Computational Visual Media - 13th International Conference, 2025

Local Conditional Controlling for Text-to-Image Diffusion Models.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
HunyuanVideo: A Systematic Framework For Large Video Generative Models.
CoRR, 2024

Searching Priors Makes Text-to-Video Synthesis Better.
CoRR, 2024

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding.
CoRR, 2024

LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models.
CoRR, 2024

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation.
CoRR, 2024

2023
An Optimized Framework for Matrix Factorization on the New Sunway Many-core Platform.
ACM Trans. Archit. Code Optim., June, 2023

Publisher Correction: xMath2.0: a high-performance extended math library for SW26010-Pro many-core processor.
CCF Trans. High Perform. Comput., March, 2023

xMath2.0: a high-performance extended math library for SW26010-Pro many-core processor.
CCF Trans. High Perform. Comput., March, 2023

Local Conditional Controlling for Text-to-Image Diffusion Models.
CoRR, 2023

Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning.
CoRR, 2023

IRB-5-CA Net: A Lightweight, Deep Learning-Based Approach to Wheat Seed Identification.
IEEE Access, 2023

GFFT: a Task Graph Based Fast Fourier Transform Optimization Framework.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022
Tencent AVS: A Holistic Ads Video Dataset for Multi-Modal Scene Segmentation.
IEEE Access, 2022

2.5 Million-Atom Ab Initio Electronic-Structure Simulation of Complex Metallic Heterostructures with DGDFT.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward.
Proceedings of the Computer Vision - ACCV 2022, 2022

2021
Identification Method of Wheat Cultivars by Using a Convolutional Neural Network Combined with Images of Multiple Growth Periods of Wheat.
Symmetry, 2021

Overview of Tencent Multi-modal Ads Video Understanding Challenge.
CoRR, 2021

Better Learning Shot Boundary Detection via Multi-task.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Overview of Tencent Multi-modal Ads Video Understanding.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2019
Context-free grammars, generating functions and combinatorial arrays.
Eur. J. Comb., 2019

2015
Entire reflective object surface structure understanding based on reflection motion estimation.
Pattern Recognit. Lett., 2015

Manufactured object sub-segmentation based on reflection motion estimation.
Proceedings of the 14th IAPR International Conference on Machine Vision Applications, 2015

Local surface curvature analysis based on reflection estimation.
Proceedings of the Seventh International Conference on Digital Image Processing, 2015

Entire Reflective Object Surface Structure Understanding.
Proceedings of the British Machine Vision Conference 2015, 2015

