Dawei Leng

ORCID: 0009-0000-5461-1681

According to our database, Dawei Leng authored at least 28 papers between 2016 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
CTA-Flux: Integrating Chinese Cultural Semantics into High-Quality English Text-to-Image Communities.
CoRR, August, 2025

NanoControl: A Lightweight Framework for Precise and Efficient Control in Diffusion Transformer.
CoRR, August, 2025

FLUX-Makeup: High-Fidelity, Identity-Consistent, and Robust Makeup Transfer via Diffusion Transformer.
CoRR, August, 2025

LMM-Det: Make Large Multimodal Models Excel in Object Detection.
CoRR, July, 2025

FG-CLIP: Fine-Grained Visual and Textual Alignment.
CoRR, May, 2025

PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models.
CoRR, March, 2025

NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers.
CoRR, March, 2025

U-StyDiT: Ultra-high Quality Artistic Style Transfer Using Diffusion Transformers.
CoRR, March, 2025

WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation.
CoRR, March, 2025

RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers.
CoRR, February, 2025

PT-T2I/V: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Image/Video-Task.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Bridge Diffusion Model: Bridge Chinese Text-to-Image Diffusion Model with English Communities.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task.
CoRR, 2024

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities.
CoRR, 2024

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance.
CoRR, 2024

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities.
CoRR, 2023

What Makes Good Open-Vocabulary Detector: A Disassembling Perspective.
CoRR, 2023

CCMB: A Large-scale Chinese Cross-modal Benchmark.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Zero and R2D2: A Large-scale Chinese Cross-modal Benchmark and A Vision-Language Framework.
CoRR, 2022

2021
Sequence-based deep learning antibody design for in silico antibody affinity maturation.
CoRR, 2021

Real-time tracking of COVID-19 and coronavirus research updates through text mining.
CoRR, 2021

ParaVS: A Simple, Fast, Efficient and Flexible Graph Neural Network Framework for Structure-Based Virtual Screening.
CoRR, 2021

Enhance Information Propagation for Graph Neural Network by Heterogeneous Aggregations.
CoRR, 2021

Heterogeneous Graph based Deep Learning for Biomedical Network Link Prediction.
CoRR, 2021

2016
Random Projected Convolutional Feature for Scene Text Recognition.
Proceedings of the 15th International Conference on Frontiers in Handwriting Recognition, 2016

