Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, 2024

StrokeNUWA - Tokenizing Strokes for Vector Graphic Synthesis.

Zecheng Tang

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Using Left and Right Brains Together: Towards Vision and Language Planning.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Learning to Plan by Updating Natural Language.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

HORIZON: High-Resolution Semantically Controlled Panorama Synthesis.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

ORES: Open-Vocabulary Responsible Visual Synthesis.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation.

CoRR, 2023

GameEval: Evaluating LLMs on Conversational Games.

CoRR, 2023

DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory.

CoRR, 2023

Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining.

CoRR, 2023

Learning to Program with Natural Language.

CoRR, 2023

Low-code LLM: Visual Programming over LLMs.

CoRR, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.

CoRR, 2023

Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models.

CoRR, 2023

Learning 3D Photography Videos via Self-supervised Diffusion on Single Images.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

ReCo: Region-Controlled Text-to-Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NUWA-XL: Diffusion over Diffusion for eXtremely Long Video Generation.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning.

Anahita Bhiwandiwalla

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

BridgeTower: Building Bridges between Encoders in Vision-Language Representation Learning.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

HORIZON: A High-Resolution Panorama Synthesis Framework.

CoRR, 2022

Bridge-Tower: Building Bridges Between Encoders in Vision-Language Representation Learning.

CoRR, 2022

DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder.

CoRR, 2022

NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN.

CoRR, 2022

Learning Temporal Video Procedure Segmentation from an Automatically Collected Large Dataset.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Trace Controlled Text to Image Generation.

Proceedings of the Computer Vision - ECCV 2022, 2022

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion.

Proceedings of the Computer Vision - ECCV 2022, 2022

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions.

CoRR, 2021

GEM: A General Evaluation Benchmark for Multimodal Tasks.

Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2019

Deep Reason: A Strong Baseline for Real-World Visual Reasoning.

CoRR, 2019

Differential Networks for Visual Question Answering.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Chain of Reasoning for Visual Question Answering.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Object-Difference Attention: A Simple Relational Attention for Visual Question Answering.

Proceedings of the 2018 ACM Multimedia Conference, 2018

Sequential Visual Reasoning for Visual Question Answering.

Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems, 2018

Chenfei Wu
