Yale Song

According to our database, Yale Song authored at least 66 papers between 2011 and 2023.

Bibliography

2023
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives.
CoRR, 2023

Egocentric Video Task Translation @ Ego4D Challenge 2022.
CoRR, 2023

Scaling Novel Object Detection with Weakly Supervised Detection Transformers.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Egocentric Video Task Translation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Video Summarization Overview.
Found. Trends Comput. Graph. Vis., 2022

PatchBlender: A Motion Prior for Video Transformers.
CoRR, 2022

One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning.
CoRR, 2022

COMPASS: Contrastive Multimodal Pretraining for Autonomous Systems.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Visual Attention Emerges from Recurrent Sparse Reconstruction.
Proceedings of the International Conference on Machine Learning, 2022

Anomaly Detection in Time Series with Robust Variational Quasi-Recurrent Autoencoders.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Neural-Sim: Learning to Generate Training Data with NeRF.
Proceedings of the Computer Vision - ECCV 2022, 2022

Robust Contrastive Learning against Noisy Views.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning.
Proceedings of the 1st Conference on Causal Learning and Reasoning, 2022

DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
On the Virality of Animated GIFs on Tumblr.
CoRR, 2021

Contrastive Learning of Global and Local Audio-Visual Representations.
CoRR, 2021

Automatic Curation of Large-Scale Datasets for Audio-Visual Representation Learning.
CoRR, 2021

Contrastive Learning of Global and Local Video Representations.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Self-Supervised Learning of Compressed Video Representations.
Proceedings of the 9th International Conference on Learning Representations, 2021

Active Contrastive Learning of Audio-Visual Video Representations.
Proceedings of the 9th International Conference on Learning Representations, 2021

Parameter Efficient Multimodal Transformers for Video Representation Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
Learning to Transfer Visual Effects from Videos to Images.
CoRR, 2020

Learning Audio-Visual Representations with Active Contrastive Coding.
CoRR, 2020

Phans, Stans and Cishets: Self-Presentation Effects on Content Propagation in Tumblr.
Proceedings of the WebSci '20: 12th ACM Conference on Web Science, 2020

Image to Video Domain Adaptation Using Web Supervision.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency.
Proceedings of the Interspeech 2020, 2020

Attention-Based Deep Metric Learning for Near-Duplicate Video Retrieval.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019
Video Question Answering with Spatio-Temporal Reasoning.
Int. J. Comput. Vis., 2019

M3D-GAN: Multi-Modal Multi-Domain Translation with Universal Attention.
CoRR, 2019

Characterizing Bias in Classifiers using Generative Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural TTS Stylization with Adversarial and Collaborative Games.
Proceedings of the 7th International Conference on Learning Representations, 2019

Unpaired Image-to-Speech Synthesis With Multimodal Information Bottleneck.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Cross-Modal Retrieval with Implicit Concept Association.
CoRR, 2018

Image2GIF: Generating Cinemagraphs Using Recurrent Deep Q-Networks.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Video Prediction with Appearance and Motion Conditions.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Learning from Noisy Labels with Distillation.
CoRR, 2017

ElasticPlay: Interactive Video Summarization with Dynamic Time Budgets.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Learning from Noisy Labels with Distillation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Improving Pairwise Ranking for Multi-label Image Classification.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Real-Time Video Highlights for Yahoo Esports.
CoRR, 2016

Mouse Activity as an Indicator of Interestingness in Video.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Balancing Appearance and Context in Sketch Interpretation.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

TGIF: A New Dataset and Benchmark on Animated GIF Description.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Video2GIF: Automatic Generation of Animated GIFs from Video.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

To Click or Not To Click: Automatic Selection of Beautiful Thumbnails from Videos.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Fast, Cheap, and Good: Why Animated GIFs Engage Us.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

2015
Continuous Body and Hand Gesture Recognition for Natural Human-Computer Interaction: Extended Abstract.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Exploiting sparsity and co-occurrence structure for action unit recognition.
Proceedings of the 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2015

TVSum: Summarizing web videos using titles.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Video co-summarization: Video summarization by visual co-occurrence.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Structured video content analysis: learning spatio-temporal and multimodal structures.
PhD thesis, 2014

#FluxFlow: Visual Analysis of Anomalous Information Spreading on Social Media.
IEEE Trans. Vis. Comput. Graph., 2014

2013
One-Class Conditional Random Fields for Sequential Anomaly Detection.
Proceedings of the IJCAI 2013, 2013

Learning a sparse codebook of facial and body microexpressions for emotion recognition.
Proceedings of the 2013 International Conference on Multimodal Interaction, 2013

Distribution-sensitive learning for imbalanced datasets.
Proceedings of the 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, 2013

Action Recognition by Hierarchical Sequence Summarization.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
Continuous body and hand gesture recognition for natural human-computer interaction.
ACM Trans. Interact. Intell. Syst., 2012

Multimodal human behavior analysis: learning correlation and interaction across modalities.
Proceedings of the International Conference on Multimodal Interaction, 2012

Multi-view latent variable discriminative models for action recognition.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Tracking body and hands for gesture recognition: NATOPS aircraft handling signals database.
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

Multi-signal gesture recognition using temporal smoothing hidden conditional random fields.
Proceedings of the Ninth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 2011

