Lu Jiang

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Switchable Novel Object Captioner.

Yu Wu

Nitesh Bharadwaj Gundavarapu

Yi Yang

IEEE Trans. Pattern Anal. Mach. Intell., 2023

Photorealistic Video Generation with Diffusion Models.

CoRR, 2023

Fine-grained Controllable Video Generation via Object Appearance and Context.

CoRR, 2023

Text-Driven Image Editing via Learnable Regions.

CoRR, 2023

VideoGLUE: Video General Understanding Evaluation of Foundation Models.

Liangzhe Yuan

CoRR, 2023

StyleDrop: Text-to-Image Generation in Any Style.

CoRR, 2023

Learning Disentangled Prompts for Compositional Image Synthesis.

CoRR, 2023

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

StyleDrop: Text-to-Image Synthesis of Any Style.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Muse: Text-To-Image Generation via Masked Generative Transformers.

Proceedings of the International Conference on Machine Learning, 2023

Discrete Predictor-Corrector Diffusion Models for Image Synthesis.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

MAGVIT: Masked Generative Video Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual Prompt Tuning for Generative Transfer Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Contrastive Adaptation Network for Single- and Multi-Source Domain Adaptation.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Discrete Representations Strengthen Vision Transformer Robustness.

Proceedings of the Tenth International Conference on Learning Representations, 2022

ViTGAN: Training GANs with Vision Transformers.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Improved Masked Image Generation with Token-Critic.

Proceedings of the Computer Vision - ECCV 2022, 2022

BLT: Bidirectional Layout Transformer for Controllable Layout Generation.

Proceedings of the Computer Vision - ECCV 2022, 2022

Pyramid Adversarial Training Improves ViT Performance.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MaskGIT: Masked Generative Image Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Confident Learning: Estimating Uncertainty in Dataset Labels.

Curtis G. Northcutt

Isaac L. Chuang

J. Artif. Intell. Res., 2021

Controllable and Progressive Image Extrapolation.

Yijun Li

Ming-Hsuan Yang

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Text as Neural Operator: Image Manipulation by Text Instruction.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Self-supervised and Supervised Joint Training for Resource-rich Machine Translation.

Proceedings of the 38th International Conference on Machine Learning, 2021

Faster Meta Update Strategy for Noise-Robust Deep Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Regularizing Generative Adversarial Networks Under Limited Data.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Revisiting EmbodiedQA: A Simple Baseline and Beyond.

Yu Wu

Yi Yang

IEEE Trans. Image Process., 2020

SimAug: Learning Robust Representations from 3D Simulation for Pedestrian Trajectory Prediction in Unseen Cameras.

CoRR, 2020

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels.

Proceedings of the 37th International Conference on Machine Learning, 2020

RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval.

Proceedings of the Computer Vision - ECCV 2020, 2020

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction.

Proceedings of the Computer Vision - ECCV 2020, 2020

Neural Design Network: Graphic Layout Generation with Constraints.

Proceedings of the Computer Vision - ECCV 2020, 2020

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

AdvAug: Robust Adversarial Augmentation for Neural Machine Translation.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Focal Visual-Text Attention for Memex Question Answering.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

Neural Design Network: Graphic Layout Generation with Constraints.

CoRR, 2019

Feature Partitioning for Efficient Multi-Task Architectures.

CoRR, 2019

Let's Transfer Transformations of Shared Semantic Representations.

Nam Vo

James Hays

CoRR, 2019

Eidetic 3D LSTM: A Model for Video Prediction and Beyond.

Proceedings of the 7th International Conference on Learning Representations, 2019

Composing Text and Image for Image Retrieval - an Empirical Odyssey.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos.

Juan Carlos Niebles

Li Fei-Fei

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Contrastive Adaptation Network for Unsupervised Domain Adaptation.

Guoliang Kang

Yi Yang

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Robust Neural Machine Translation with Doubly Adversarial Inputs.

Yong Cheng

Wolfgang Macherey

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

Decoupled Novel Object Captioner.

Proceedings of the 2018 ACM Multimedia Conference, 2018

MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels.

Proceedings of the 35th International Conference on Machine Learning, 2018

Graph Distillation for Action Detection with Privileged Modalities.

Proceedings of the Computer Vision - ECCV 2018, 2018

Focal Visual-Text Attention for Visual Question Answering.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Revealing Event Saliency in Unconstrained Video Collection.

IEEE Trans. Image Process., 2017

A theoretical understanding of self-paced learning.

Qian Zhao

Inf. Sci., 2017

MentorNet: Regularizing Very Deep Neural Networks on Corrupted Labels.

CoRR, 2017

Graph Distillation for Action Detection with Privileged Information.

CoRR, 2017

MemexQA: Visual Memex Question Answering.

CoRR, 2017

Video Representation Learning and Latent Concept Mining for Large-scale Multi-label Video Classification.

CoRR, 2017

Delving Deep into Personal Photo and Video Search.

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017

Video Search via Ranking Network with Very Few Query Exemplars.

Proceedings of the MultiMedia Modeling - 23rd International Conference, 2017

Leveraging Multi-modal Prior Knowledge for Large-scale Concept Learning in Noisy Web Data.

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Temporal localization of audio events for conflict monitoring in social media.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Webly-Supervised Learning of Multimodal Video Detectors.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

An Event Reconstruction Tool for Conflict Monitoring Using Social Media.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Visual Memory QA: Your Personal Photo and Video Search Agent.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Text-to-video: a semantic search engine for internet videos.

Int. J. Multim. Inf. Retr., 2016

Strategies for Searching Video Content with Text Queries or Video Examples.

Chuang Gan

Xingzhong Du

Xiaojun Chang

CoRR, 2016

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning.

CoRR, 2016

Web-scale Multimedia Search for Internet Video Content.

Proceedings of the 25th International Conference on World Wide Web, 2016

Informedia @ TRECVID 2016.

Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Learning to Detect Concepts from Webly-Labeled Video Data.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

2015

CMU Informedia@TRECVID 2015: MED/SIN/LNK/SED.

Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

Early Implementation Experience with Wearable Cognitive Assistance Applications.

Mahadev Satyanarayanan

Proceedings of the 2015 workshop on Wearable Systems and Applications, 2015

Fast and Accurate Content-based Semantic Search in 100M Internet Videos.

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Content-Based Video Search over 1 Million Videos with 1 Core in 1 Second.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Incremental Multimodal Query Construction for Video Search.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Bridging the Ultimate Semantic Gap: A Semantic Search Engine for Internet Videos.

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

A Self-Paced Multiple-Instance Learning Framework for Co-Saliency Detection.

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Self-Paced Learning for Matrix Factorization.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Self-Paced Curriculum Learning.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

E-LAMP: integration of innovative ideas for multimedia event detection.

Mach. Vis. Appl., 2014

Informedia @ TRECVID 2014.

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Improvements to speaker adaptive training of deep neural networks.

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Self-Paced Learning with Diversity.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Instructional Videos for Unsupervised Harvesting and Learning of Action Examples.

Shoou-I Yu

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Easy Samples First: Self-paced Reranking for Zero-Example Multimedia Search.

Teruko Mitamura

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Towards Efficient Learning of Optimal Spatial Bag-of-Words Representations.

Wei Tong

Proceedings of the International Conference on Multimedia Retrieval, 2014

Viral Video Style: A Closer Look at Viral Videos on YouTube.

Proceedings of the International Conference on Multimedia Retrieval, 2014

Zero-Example Event Search using MultiModal Pseudo Relevance Feedback.

Teruko Mitamura

Shoou-I Yu

Proceedings of the International Conference on Multimedia Retrieval, 2014

A Novel Group-Sparsity-Optimization-Based Feature Selection Model for Complex Interaction Recognition.

Proceedings of the Computer Vision - ACCV 2014, 2014

2013

Informedia@TRECVID 2013.

Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

2012

Informedia @TRECVID 2012.

Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Leveraging high-level and low-level features for multimedia event detection.

Guang Xiang

Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

2011

Informedia@TRECVID 2011: Surveillance Event Detection.
