Boqing Gong

Orcid: 0000-0003-3915-5977

According to our database, Boqing Gong authored at least 148 papers between 2009 and 2026.

Collaborative distances:

Dijkstra number of three.
Erdős number of three.

Bibliography

2026

LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs.

[DOI]

,

Nikhil Parthasarathy

,

,

,

,

,

Ming-Hsuan Yang

,

CoRR, May, 2026

Moiré Video Authentication: A Physical Signature Against AI Video Generation.

[DOI]

,

,

,

,

CoRR, April, 2026

Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos.

[DOI]

,

,

,

,

Srinivas Sunkara

,

,

,

,

CoRR, March, 2026

EgoReasoner: Learning Egocentric 4D Reasoning via Task-Adaptive Structured Thinking.

[DOI]

,

,

,

,

,

,

,

,

,

,

,

CoRR, March, 2026

Lifting Data-Tracing Machine Unlearning to Knowledge-Tracing for Foundation Models.

[DOI]

,

Trans. Mach. Learn. Res., 2026

AdverIN: Monotonic adversarial intensity attack for domain generalization in medical image segmentation.

[DOI]

,

,

,

,

,

Matthew Antalek

,

,

Alpay Medetalibeyoglu

,

Concetto Spampinato

,

,

,

Medical Image Anal., 2026

2025

Image Diffusion Preview with Consistency Solver.

[DOI]

,

,

,

,

,

,

Ming-Hsuan Yang

,

,

,

,

CoRR, December, 2025

BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models.

[DOI]

CoRR, December, 2025

Culture in Action: Evaluating Text-to-Image Models through Social Activities.

[DOI]

,

,

Adriana Kovashka

CoRR, November, 2025

Beyond Token Pruning: Operation Pruning in Vision-Language Models.

[DOI]

,

,

,

Bryan A. Plummer

CoRR, July, 2025

Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck.

[DOI]

,

,

CoRR, May, 2025

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning.

[DOI]

,

,

,

Venkatesh Saligrama

,

CoRR, April, 2025

VideoAds for Fast-Paced Video Understanding: Where Opensource Foundation Models Beat GPT-4o & Gemini-1.5 Pro.

[DOI]

,

,

,

,

,

CoRR, April, 2025

Large-scale multi-center CT and MRI segmentation of pancreas with deep learning.

[DOI]

Medical Image Anal., 2025

Epsilon-VAE: Denoising as Visual Decoding.

[DOI]

,

,

,

,

,

,

,

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities.

[DOI]

,

,

,

,

,

,

,

,

,

Ming-Hsuan Yang

,

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise.

[DOI]

,

,

,

,

,

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Attention to Neural Plagiarism: Diffusion Models Can Plagiarize Your Copyrighted Images!

[DOI]

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

VideoAds for Fast-Paced Video Understanding.

[DOI]

,

,

,

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SITE: Towards Spatial Intelligence Thorough Evaluation.

[DOI]

,

,

,

,

,

,

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning.

[DOI]

Shengao Wang (Boston University)

,

,

,

Venkatesh Saligrama

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

HYPDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-Shot Image Generation.

[DOI]

,

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Scaling Up Temporal Domain Generalization via Temporal Experts Averaging.

[DOI]

,

,

Venkatesh Saligrama

,

,

,

,

Bryan A. Plummer

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024

Open Long-Tailed Recognition in a Dynamic World.

[DOI]

,

,

,

,

,

IEEE Trans. Pattern Anal. Mach. Intell., March, 2024

VideoGLUE: Video General Understanding Evaluation of Foundation Models.

[DOI]

,

Nitesh Bharadwaj Gundavarapu

,

,

,

,

,

,

,

,

,

Mikhail Sirotenko

,

,

Florian Schroff

,

,

Ming-Hsuan Yang

,

,

Trans. Mach. Learn. Res., 2024

DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments.

[DOI]

,

Pedro Sandoval Segura

,

Chengyuan Zhang

,

,

,

,

,

,

,

CoRR, 2024

Neptune: The Long Orbit to Benchmarking Long Video Understanding.

[DOI]

,

,

,

,

Nitesh Bharadwaj Gundavarapu

,

,

,

,

,

Cordelia Schmid

,

Mikhail Sirotenko

,

,

CoRR, 2024

Diffusion Autoencoders for Few-shot Image Generation in Hyperbolic Space.

[DOI]

,

,

,

CoRR, 2024

ε-VAE: Denoising as Visual Decoding.

[DOI]

,

,

,

,

,

,

,

,

CoRR, 2024

SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining.

[DOI]

,

,

,

,

,

CoRR, 2024

Automatic Jailbreaking of the Text-to-Image Generative AI Systems.

[DOI]

,

,

,

,

CoRR, 2024

Extending Video Masked Autoencoders to 128 frames.

[DOI]

Nitesh Bharadwaj Gundavarapu

,

,

,

,

Eirikur Agustsson

,

,

Mikhail Sirotenko

,

Ming-Hsuan Yang

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

On Discrete Prompt Optimization for Diffusion Models.

[DOI]

,

,

,

Proceedings of the Forty-first International Conference on Machine Learning, 2024

VideoPrism: A Foundational Visual Encoder for Video Understanding.

[DOI]

,

Nitesh Bharadwaj Gundavarapu

,

,

,

,

Jennifer J. Sun

,

,

,

,

,

,

Florian Schroff

,

Ming-Hsuan Yang

,

,

,

,

Mikhail Sirotenko

,

,

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Language Model Beats Diffusion - Tokenizer is key to visual generation.

[DOI]

,

,

Nitesh Bharadwaj Gundavarapu

,

,

,

,

,

,

,

Alexander G. Hauptmann

,

,

Ming-Hsuan Yang

,

,

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Structured Video-Language Modeling with Temporal Grouping and Spatial Grounding.

[DOI]

,

,

,

Ming-Hsuan Yang

,

Florian Schroff

,

,

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

[DOI]

,

,

,

,

,

Proceedings of the Computer Vision - ECCV 2024, 2024

Instruct-Imagen: Image Generation with Multi-modal Instruction.

[DOI]

,

Kelvin C. K. Chan

,

,

,

,

,

,

,

,

William W. Cohen

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Distilling Vision-Language Models on Millions of Videos.

[DOI]

,

,

,

,

,

,

Florian Schroff

,

,

,

,

Philipp Krähenbühl

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Towards A Unified Neural Architecture for Visual Recognition and Reasoning.

[DOI]

,

,

,

CoRR, 2023

Multi-modal Domain Adaptation for REG via Relation Transfer.

[DOI]

,

,

CoRR, 2023

Federated Learning of Shareable Bases for Personalization-Friendly Image Classification.

[DOI]

,

,

,

,

,

,

,

CoRR, 2023

Identity Encoder for Personalized Diffusion.

[DOI]

,

Kelvin C. K. Chan

,

,

,

,

,

,

CoRR, 2023

Domain Generalization with Adversarial Intensity Attack for Medical Image Segmentation.

[DOI]

,

,

,

,

,

Ismail Baris Turkbey

,

,

CoRR, 2023

Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models.

[DOI]

,

,

Kelvin C. K. Chan

,

,

,

,

,

,

CoRR, 2023

Spatiotemporally Discriminative Video-Language Pre-Training with Text Grounding.

[DOI]

,

,

,

Ming-Hsuan Yang

,

Florian Schroff

,

,

,

CoRR, 2023

Video Timeline Modeling For News Story Understanding.

[DOI]

,

,

,

,

Ming-Hsuan Yang

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Module-wise Adaptive Distillation for Multimodality Foundation Models.

[DOI]

,

,

Ming-Hsuan Yang

,

,

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Unified Visual Relationship Detection with Vision and Language Models.

[DOI]

,

,

,

,

Florian Schroff

,

Ming-Hsuan Yang

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

On Calibrating Semantic Segmentation Models: Analyses and An Algorithm.

[DOI]

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

2.5D visual relationship detection.

[DOI]

,

Soravit Changpinyo

,

,

Sathish Thoppay

,

,

,

,

,

,

Ming-Hsuan Yang

,

Comput. Vis. Image Underst., 2022

On Calibrating Semantic Segmentation Models: Analysis and An Algorithm.

[DOI]

,

,

CoRR, 2022

Federated Multi-Target Domain Adaptation.

[DOI]

,

,

,

,

,

Ming-Hsuan Yang

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

Surrogate Gap Minimization Improves Sharpness-Aware Training.

[DOI]

,

,

,

,

,

Nicha C. Dvornek

,

Sekhar Tatikonda

,

James S. Duncan

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations.

[DOI]

,

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks.

[DOI]

,

,

Proceedings of the Computer Vision - ECCV 2022, 2022

LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds.

[DOI]

,

,

,

,

,

Dragomir Anguelov

Proceedings of the Computer Vision - ECCV 2022, 2022

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision.

[DOI]

,

,

,

,

Florian Schroff

,

Ming-Hsuan Yang

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

medXGAN: Visual Explanations for Medical Classifiers through a Generative Latent Space.

[DOI]

,

Florian Schiffers

,

,

Aggelos K. Katsaggelos

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

On Temporal Granularity in Self-Supervised Video Representation Learning.

[DOI]

,

,

,

,

,

,

Serge J. Belongie

,

Ming-Hsuan Yang

,

,

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021

Towards a Unified Foundation Model: Jointly Pre-Training Transformers on Unpaired Images and Text.

[DOI]

,

,

,

,

,

Ming-Hsuan Yang

,

CoRR, 2021

Exploring Temporal Granularity in Self-Supervised Video Representation Learning.

[DOI]

,

,

,

,

,

,

Serge J. Belongie

,

Ming-Hsuan Yang

,

,

CoRR, 2021

Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Model Training.

[DOI]

,

,

CoRR, 2021

Bridging the Gap Between Object Detection and User Intent via Query-Modulation.

[DOI]

,

,

,

Kimberly Wilber

,

,

,

,

Andrew G. Howard

CoRR, 2021

When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations.

[DOI]

,

,

CoRR, 2021

A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.

[DOI]

,

,

,

,

,

Soravit Changpinyo

,

,

CoRR, 2021

Analyzing Deep Neural Network's Transferability via Fréchet Distance.

[DOI]

,

,

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation.

[DOI]

,

,

,

,

,

Soravit Changpinyo

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text.

[DOI]

,

,

,

Wei-Hong Chuang

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Large-Scale Meta-Learning with Continual Trajectory Shifting.

[DOI]

,

,

,

Proceedings of the 38th International Conference on Machine Learning, 2021

Contrastive Learning for Label Efficient Semantic Segmentation.

[DOI]

,

Raviteja Vemulapalli

,

Philip Andrew Mansfield

,

,

,

,

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.

[DOI]

,

,

,

,

,

Soravit Changpinyo

,

,

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Lazy Approach to Long-Horizon Gradient-Based Meta-Learning.

[DOI]

Muhammad Abdullah Jamal

,

,

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization.

[DOI]

,

Soravit Changpinyo

,

,

,

,

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds.

[DOI]

,

,

Thomas A. Funkhouser

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Spatiotemporal Contrastive Video Representation Learning.

[DOI]

,

,

,

Ming-Hsuan Yang

,

,

Serge J. Belongie

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Ranking Neural Checkpoints.

[DOI]

,

,

,

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

MoViNets: Mobile Video Networks for Efficient Video Recognition.

[DOI]

,

,

,

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Adversarially Adaptive Normalization for Single Domain Generalization.

[DOI]

,

,

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust and Accurate Object Detection via Adversarial Learning.

[DOI]

,

,

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Class-Balanced Distillation for Long-Tailed Visual Recognition.

[DOI]

,

,

,

Cordelia Schmid

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

A Curriculum Domain Adaptation Approach to the Semantic Segmentation of Urban Scenes.

[DOI]

,

,

,

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Classifier and Exemplar Synthesis for Zero-Shot Learning.

[DOI]

Soravit Changpinyo

,

,

,

Int. J. Comput. Vis., 2020

Smooth Adversarial Training.

[DOI]

,

,

,

,

CoRR, 2020

When Ensembling Smaller Models is More Efficient than Single Large Models.

[DOI]

,

,

,

CoRR, 2020

Look, Listen, and Act: Towards Audio-Visual Embodied Navigation.

[DOI]

,

,

,

,

Joshua B. Tenenbaum

Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius.

[DOI]

,

,

,

,

,

Pradeep Ravikumar

,

,

Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Object Detection with Selective Self-supervised Self-training.

[DOI]

,

,

,

,

Proceedings of the Computer Vision - ECCV 2020, 2020

Adversarial Examples Improve Image Recognition.

[DOI]

,

,

,

,

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model.

[DOI]

,

,

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective.

[DOI]

Muhammad Abdullah Jamal

,

,

Ming-Hsuan Yang

,

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation.

[DOI]

,

,

,

,

,

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Open Compound Domain Adaptation.

[DOI]

,

,

,

,

,

,

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Compound Domain Adaptation in an Open World.

[DOI]

,

,

,

,

,

,

CoRR, 2019

Defending Against Adversarial Attacks Using Random Forests.

[DOI]

,

,

,

,

,

CoRR, 2019

Synthesized Policies for Transfer and Adaptation across Tasks and Environments.

[DOI]

,

,

,

CoRR, 2019

Joint Modeling of Dense and Incomplete Trajectories for Citywide Traffic Volume Inference.

[DOI]

,

,

,

,

,

,

Proceedings of the World Wide Web Conference, 2019

End-to-End Video Captioning With Multitask Reinforcement Learning.

[DOI]

,

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy.

[DOI]

,

,

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Facial Image-to-Video Translation by a Hidden Affine Transformation.

[DOI]

,

,

,

,

,

,

Proceedings of the 27th ACM International Conference on Multimedia, 2019

NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks.

[DOI]

,

,

,

,

Proceedings of the 36th International Conference on Machine Learning, 2019

CAMOU: Learning Physical Vehicle Camouflages to Adversarially Attack Detectors in the Wild.

[DOI]

,

,

,

Proceedings of the 7th International Conference on Learning Representations, 2019

DHER: Hindsight Experience Replay for Dynamic Goals.

[DOI]

,

,

,

,

,

Proceedings of the 7th International Conference on Learning Representations, 2019

Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data.

[DOI]

,

,

,

Alberto L. Sangiovanni-Vincentelli

,

,

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Fast and Accurate One-Stage Approach to Visual Grounding.

[DOI]

,

,

,

,

,

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Constructing Self-Motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach.

[DOI]

,

,

,

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Defending Against Adversarial Attacks Using Random Forest.

[DOI]

,

,

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses.

[DOI]

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Large-Scale Long-Tailed Recognition in an Open World.

[DOI]

,

,

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

A Robust Zero-Sum Game Framework for Pool-based Active Learning.

[DOI]

,

,

,

,

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation.

[DOI]

,

,

,

,

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Exploring a SOT-MRAM Based In-Memory Computing for Data Processing.

[DOI]

,

,

,

,

IEEE Trans. Multi Scale Comput. Syst., 2018

Defend Deep Neural Networks Against Adversarial Examples via Fixed and Dynamic Quantized Activation Functions.

[DOI]

Adnan Siraj Rakin

,

,

,

CoRR, 2018

Blind Pre-Processing: A Robust Defense Method Against Adversarial Examples.

[DOI]

Adnan Siraj Rakin

,

,

,

CoRR, 2018

A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels.

[DOI]

,

,

,

Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

Synthesize Policies for Transfer and Adaptation across Tasks and Environments.

[DOI]

,

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect.

[DOI]

,

,

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

Improving Sequential Determinantal Point Processes for Supervised Video Summarization.

[DOI]

,

,

,

,

Proceedings of the Computer Vision - ECCV 2018, 2018

How Local Is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization.

[DOI]

,

,

,

Proceedings of the Computer Vision - ECCV 2018, 2018

Deep Face Detector Adaptation Without Negative Transfer or Catastrophic Forgetting.

[DOI]

Muhammad Abdullah Jamal

,

,

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning.

[DOI]

,

,

,

,

Leonidas J. Guibas

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

End-to-End Learning of Motion Representation for Video Understanding.

[DOI]

,

,

,

,

,

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Geodesic Flow Kernel and Landmarks: Kernel Methods for Unsupervised Domain Adaptation.

[DOI]

,

Kristen Grauman

,

Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

A Multisource Domain Generalization Approach to Visual Attribute Detection.

[DOI]

,

,

Proceedings of the Domain Adaptation in Computer Vision Applications, 2017

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes.

[DOI]

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2017

VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation.

[DOI]

,

,

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2017

Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach.

[DOI]

,

Jacob S. Laurel

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Improving Facial Attribute Prediction Using Semantic Segmentation.

[DOI]

Mahdi M. Kalayeh

,

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Weighted geodesic flow kernel for interpersonal mutual influence modeling and emotion recognition in dyadic interactions.

[DOI]

,

,

Shrikanth S. Narayanan

Proceedings of the Seventh International Conference on Affective Computing and Intelligent Interaction, 2017

2016

Infinite-Label Learning with Semantic Output Codes.

[DOI]

,

,

,

CoRR, 2016

Improved Dropout for Shallow and Deep Learning.

[DOI]

,

,

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Learning a Multi-concept Video Retrieval Model with Multiple Latent Variables.

[DOI]

,

,

Proceedings of the IEEE International Symposium on Multimedia, 2016

Query-Focused Extractive Video Summarization.

[DOI]

,

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames.

[DOI]

,

,

,

Proceedings of the Computer Vision - ECCV 2016, 2016

An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild.

[DOI]

,

Soravit Changpinyo

,

,

Proceedings of the Computer Vision - ECCV 2016, 2016

Fast Zero-Shot Image Tagging.

[DOI]

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning Attributes Equals Multi-Source Domain Generalization.

[DOI]

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Synthesized Classifiers for Zero-Shot Learning.

[DOI]

Soravit Changpinyo

,

,

,

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

Large-Margin Determinantal Point Processes.

[DOI]

,

,

Kristen Grauman

,

Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

2014

Learning Kernels for Unsupervised Domain Adaptation with Applications to Visual Object Recognition.

[DOI]

,

Kristen Grauman

,

Int. J. Comput. Vis., 2014

Diverse Sequential Subset Selection for Supervised Video Summarization.

[DOI]

,

,

Kristen Grauman

,

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013

Learning Semantic Signatures for 3D Object Retrieval.

[DOI]

,

,

,

IEEE Trans. Multim., 2013

Reshaping Visual Datasets for Domain Adaptation.

[DOI]

,

Kristen Grauman

,

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation.

[DOI]

,

Kristen Grauman

,

Proceedings of the 30th International Conference on Machine Learning, 2013

2012

Geodesic flow kernel for unsupervised domain adaptation.

[DOI]

,

,

,

Kristen Grauman

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

3D object retrieval with semantic attributes.

[DOI]

,

,

,

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

2009

Boosting 3D object retrieval by object flexibility.

[DOI]

,

,

,

Proceedings of the 17th International Conference on Multimedia 2009, 2009

Automatic facial expression recognition on a single 3D face by exploring shape deformation.

[DOI]

,

,

,

Proceedings of the 17th International Conference on Multimedia 2009, 2009
