Chuang Gan

According to our database, Chuang Gan authored at least 77 papers between 2013 and 2020.

Bibliography

2020
Relation Attention for Temporal Action Localization.
IEEE Trans. Multim., 2020

Generating Visually Aligned Sound From Videos.
IEEE Trans. Image Process., 2020

Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning.
CoRR, 2020

Noisy Agents: Self-supervised Exploration by Predicting Auditory Events.
CoRR, 2020

Tiny Transfer Learning: Towards Memory-Efficient On-Device Learning.
CoRR, 2020

Foley Music: Learning to Generate Music from Videos.
CoRR, 2020

MCUNet: Tiny Deep Learning on IoT Devices.
CoRR, 2020

ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation.
CoRR, 2020

Language Guided Networks for Cross-modal Moment Retrieval.
CoRR, 2020

A Real-time Action Representation with Temporal Encoding and Deep Compression.
CoRR, 2020

Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

HUMA'20: 1st International Workshop on Human-Centric Multimedia Analysis.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Deep Concept-wise Temporal Convolutional Networks for Action Localization.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Look, Listen, and Act: Towards Audio-Visual Embodied Navigation.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Deep Audio Priors Emerge From Harmonic Convolutional Networks.
Proceedings of the 8th International Conference on Learning Representations, 2020

CLEVRER: Collision Events for Video Representation and Reasoning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Once-for-All: Train One Network and Specialize it for Efficient Deployment.
Proceedings of the 8th International Conference on Learning Representations, 2020

Dense Regression Network for Video Grounding.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Music Gesture for Visual Sound Separation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Location-Aware Graph Convolutional Networks for Video Question Answering.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Breaking Winner-Takes-All: Iterative-Winners-Out Networks for Weakly Supervised Temporal Action Localization.
IEEE Trans. Image Process., 2019

Toward Efficient Action Recognition: Principal Backpropagation for Training Two-Stream Networks.
IEEE Trans. Image Process., 2019

TruNet: Short Videos Generation from Long Videos via Story-Preserving Truncation.
CoRR, 2019

Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos.
CoRR, 2019

Once for All: Train One Network and Specialize it for Efficient Deployment.
CoRR, 2019

Interpreting Adversarial Examples by Activation Promotion and Suppression.
CoRR, 2019

Cross-channel Communication Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Visual Concept-Metaconcept Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Facial Image-to-Video Translation by a Hidden Affine Transformation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Watch, Reason and Code: Learning to Represent Videos Using Program.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision.
Proceedings of the 7th International Conference on Learning Representations, 2019

Defensive Quantization: When Efficiency Meets Robustness.
Proceedings of the 7th International Conference on Learning Representations, 2019

The Sound of Motions.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Graph Convolutional Networks for Temporal Action Localization.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

TSM: Temporal Shift Module for Efficient Video Understanding.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Self-Supervised Moving Vehicle Tracking With Stereo Sound.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Self-supervised Audio-visual Co-segmentation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Self-Supervised Segmentation and Source Separation on Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

StNet: Local and Global Spatial-Temporal Modeling for Action Recognition.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Video Captioning with Multi-Faceted Attention.
Trans. Assoc. Comput. Linguistics, 2018

Temporal Shift Module for Efficient Video Understanding.
CoRR, 2018

Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Weakly Supervised Dense Event Captioning in Videos.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency.
Proceedings of the Computer Vision - ECCV 2018, 2018

The Sound of Pixels.
Proceedings of the Computer Vision - ECCV 2018, 2018

Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

End-to-End Learning of Motion Representation for Video Understanding.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Sparse, Smart Contours to Represent and Edit Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Multimodal Keyless Attention Fusion for Video Classification.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

T-C3D: Temporal Convolutional 3D Network for Real-Time Action Recognition.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
A Multisource Domain Generalization Approach to Visual Attribute Detection.
Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

Smart, Sparse Contours to Represent and Edit Images.
CoRR, 2017

Unsupervised Domain Adaptation for 3D Keypoint Prediction from a Single Depth Scan.
CoRR, 2017

Revisiting the Effectiveness of Off-the-shelf Temporal Modeling Approaches for Large-scale Video Classification.
CoRR, 2017

Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding.
CoRR, 2017

Recurrent Topic-Transition GAN for Visual Paragraph Generation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Semantic Compositional Networks for Visual Captioning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

StyleNet: Generating Attractive Visual Captions with Styles.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

DECK: Discovering Event Composition Knowledge from Web Images for Zero-Shot Event Detection and Recounting in Videos.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Recognizing an Action Using Its Name: A Knowledge-Based Approach.
Int. J. Comput. Vis., 2016

Strategies for Searching Video Content with Text Queries or Video Examples.
CoRR, 2016

Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames.
Proceedings of the Computer Vision - ECCV 2016, 2016

You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning Attributes Equals Multi-Source Domain Generalization.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Concepts Not Alone: Exploring Pairwise Relationships for Zero-Shot Video Activity Recognition.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Automatic Concept Discovery from Parallel Text and Visual Corpora.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

DevNet: A Deep Event Network for multimedia event detection and evidence recounting.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Exploring Semantic Inter-Class Relationships (SIR) for Zero-Shot Action Recognition.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

2013
Salient object detection in image sequences via spatial-temporal cue.
Proceedings of the 2013 Visual Communications and Image Processing, 2013

