Juan Carlos Niebles

Keshigeyan Chandrasegaran

Honglu Zhou

Olga Russakovsky

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

BLIP-3: A Family of Open Large Multimodal Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ActionStudio: A Lightweight Framework for Data and Training of Large Action Models.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Contra4: Evaluating Contrastive Cross-Modal Reasoning in Audio, Video, Image, and 3D.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

LATTE: Learning to Think with Vision Specialists.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Re-thinking Temporal Search for Long-Form Video Understanding.

Jinhui Ye

Zihan Wang

Haosen Sun

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

AdaVid: Adaptive Video-Language Pretraining.

Chaitanya Patel

Ehsan Adeli

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

ViUniT: Visual Unit Tests for More Robust Visual Programming.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models.

CoRR, 2024

TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action.

CoRR, 2024

SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs.

CoRR, 2024

PRACT: Optimizing Principled Reasoning and Acting of LLM Agent.

CoRR, 2024

xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs.

CoRR, 2024

xLAM: A Family of Large Action Models to Empower AI Agent Systems.

CoRR, 2024

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models.

CoRR, 2024

APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets.

CoRR, 2024

Artificial Intelligence Index Report 2024.

CoRR, 2024

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning.

Tulika Manoj Awalgaonkar

CoRR, 2024

Editing Arbitrary Propositions in LLMs without Subject Labels.

CoRR, 2024

On the Unlikelihood of D-Separation.

Proceedings of the International Conference on Probabilistic Graphical Models, 2024

APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Streaming Detection of Queried Event Start.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Hierarchical Point Attention for Indoor 3D Object Detection.

Manli Shu

Ning Yu

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer.

Proceedings of the Computer Vision - ECCV 2024, 2024

xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations.

Can Qin

Congying Xia

Krithika Ramakrishnan

Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-Modal Reasoning.

Proceedings of the Computer Vision - ECCV 2024, 2024

ULIP-2: Towards Scalable Multimodal Pre-Training for 3D Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Causal Layering via Conditional Entropy.

Proceedings of the Conference on Causal Learning and Reasoning, 2024

2023

X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning.

CoRR, 2023

Artificial Intelligence Index Report 2023.

CoRR, 2023

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents.

CoRR, 2023

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization.

CoRR, 2023

REX: Rapid Exploration and eXploitation for AI Agents.

CoRR, 2023

HomE: Homography-Equivariant Video Representation Learning.

CoRR, 2023

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding.

CoRR, 2023

On the Unlikelihood of D-Separation.

CoRR, 2023

Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data.

CoRR, 2023

Model-Agnostic Hierarchical Attention for 3D Object Detection.

Manli Shu

Ning Yu

Caiming Xiong

Ran Xu

CoRR, 2023

PreViTS: Contrastive Pretraining with Video Tracking Supervision.

Brian Chen

Ramprasaath R. Selvaraju

Shih-Fu Chang

Nikhil Naik

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Temporally Disentangled Representation Learning under Unknown Nonstationarity.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Procedure-Aware Pretraining for Instructional Video Understanding.

Honglu Zhou

Mubbasir Kapadia

Silvio Savarese

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding.

Mingfei Gao

Chen Xing

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Mask-Free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

ULIP: Learning Unified Representation of Language, Image and Point Cloud for 3D Understanding.

Mingfei Gao

Chen Xing

CoRR, 2022

The AI Index 2022 Annual Report.

CoRR, 2022

MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens.

Proceedings of the Computer Vision - ECCV 2022, 2022

Open Vocabulary Object Detection with Pseudo Bounding-Box Labels.

Proceedings of the Computer Vision - ECCV 2022, 2022

Align and Prompt: Video-and-Language Pre-training with Entity Prompts.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Revisiting the "Video" in Video-Language Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Quantifying Parkinson's disease motor severity under uncertainty using MDS-UPDRS videos.

Leila Montaser Kouhsari

Medical Image Anal., 2021

Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes.

CoRR, 2021

The AI Index 2021 Annual Report.

CoRR, 2021

Representation Learning with Statistical Independence to Mitigate Bias.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

MOMA: Multi-Object Multi-Actor Activity Parsing.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Detecting Human-Object Relationships in Videos.

Jingwei Ji

Rishi Desai

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Privacy-preserving Optics for Human Pose Estimation.

Carlos Hinojosa

Henry Arguello

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Home Action Genome: Cooperative Compositional Action Understanding.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

CoCon: Cooperative-Contrastive Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Metadata Normalization.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

TNT: Text-Conditioned Network with Transductive Inference for Few-Shot Video Classification.

Andrés Villa

Juan-Manuel Pérez-Rúa

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction.

IEEE Robotics Autom. Lett., 2020

Segmenting the Future.

Hsu-Kuang Chiu

Ehsan Adeli

IEEE Robotics Autom. Lett., 2020

Socially and Contextually Aware Human Motion and Pose Forecasting.

IEEE Robotics Autom. Lett., 2020

Explaining VQA predictions using visual grounding and a knowledge base.

Image Vis. Comput., 2020

Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Vision-Based Estimation of MDS-UPDRS Gait Scores for Assessing Parkinson's Disease Motor Severity.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Motion Reasoning for Goal-Based Imitation Learning.

Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition.

Proceedings of the Computer Vision - ECCV 2020, 2020

Procedure Planning in Instructional Videos.

Proceedings of the Computer Vision - ECCV 2020, 2020

Spatio-Temporal Graph for Video Captioning With Knowledge Distillation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Few-Shot Video Classification via Temporal Alignment.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Adversarial Cross-Domain Action Recognition with Co-Attention.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs.

CoRR, 2019

Bias-Resilient Neural Network.

CoRR, 2019

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation.

CoRR, 2019

Interpretable Visual Question Answering by Visual Grounding From Attention Supervision Mining.

Yundong Zhang

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Action-Agnostic Human Pose Forecasting.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning.

Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Imitation Learning for Human Pose Prediction.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning Temporal Action Proposals With Fewer Labels.

Jingwei Ji

Kaidi Cao

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos.

Junwei Liang

Lu Jiang

Alexander G. Hauptmann

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary.

CoRR, 2018

Learning to Decompose and Disentangle Representations for Video Prediction.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

A Deep Learning Based Behavioral Approach to Indoor Autonomous Navigation.

Gabriel Sepulveda

Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

Behavioral Indoor Navigation With Natural Language Directions.

Proceedings of the Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, 2018

Translating Navigation Instructions in Natural Language to a High-Level Plan for Behavioral Robot Navigation.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Liquid Pouring Monitoring via Rich Sensory Inputs.

Proceedings of the Computer Vision - ECCV 2018, 2018

Graph Distillation for Action Detection with Privileged Modalities.

Proceedings of the Computer Vision - ECCV 2018, 2018

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos.

Proceedings of the Computer Vision - ECCV 2018, 2018

End-to-End Joint Semantic Segmentation of Actors and Actions in Video.

Proceedings of the Computer Vision - ECCV 2018, 2018

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Corrigendum to "Sparse Composition of Body Poses and Atomic Actions for Human Activity Recognition in RGB-D Videos" [Image Vis. Comput. 59 (2017) 63-75].

Image Vis. Comput., 2017

Sparse composition of body poses and atomic actions for human activity recognition in RGB-D videos.

Image Vis. Comput., 2017

Graph Distillation for Action Detection with Privileged Information.

CoRR, 2017

ActivityNet Challenge 2017 Summary.

CoRR, 2017

Risky Region Localization with Point Supervision.

Kazuki Kozuka

Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Visual Forecasting by Imitating Dynamics in Natural Sequences.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Dense-Captioning Events in Videos.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

SST: Single-Stream Temporal Action Proposals.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos.

Proceedings of the British Machine Vision Conference 2017, 2017

Leveraging Video Descriptions to Learn Video Question Answering.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Title Generation for User Generated Videos.

Proceedings of the Computer Vision - ECCV 2016, 2016

Connectionist Temporal Modeling for Weakly Supervised Action Labeling.

De-An Huang

Proceedings of the Computer Vision - ECCV 2016, 2016

DAPs: Deep Action Proposals for Action Understanding.

Proceedings of the Computer Vision - ECCV 2016, 2016

A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos.

Fabian Caba Heilbron

Bernard Ghanem

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015

ActivityNet: A large-scale video benchmark for human activity understanding.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Robust Manhattan Frame estimation from a single RGB-D image.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

On the relationship between visual attributes and convolutional networks.

Victor Escorcia

Bernard Ghanem

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014

Collecting and Annotating Human Activities in Web Videos.

Fabian Caba Heilbron

Proceedings of the International Conference on Multimedia Retrieval, 2014

Discriminative Hierarchical Modeling of Spatio-temporally Composable Human Activities.

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Camera Motion and Surrounding Scene Appearance as Context for Action Recognition.

Proceedings of the Computer Vision - ACCV 2014, 2014

2013

Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers.

Mani Golparvar Fard

Arsalan Heydarian

Adv. Eng. Informatics, 2013

Spatio-temporal Human-Object Interactions for Action Recognition in Videos.

Victor Escorcia

Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

2010

Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification.

Chih-Wei Chen

Proceedings of the Computer Vision - ECCV 2010, 2010

Efficient extraction of human motion volumes by tracking.

Bohyung Han

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009

Mining discriminative adjectives and prepositions for natural scene recognition.

Bangpeng Yao

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009

2008

Extracting Moving People from Internet Videos.

Proceedings of the Computer Vision - ECCV 2008, 2008

2007

A Hierarchical Model of Shape and Appearance for Human Action Classification.

Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

OPTIMOL: A Framework for Online Picture Collection via Incremental Model Learning.

Li-Jia Li

Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006

Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words.

Hongcheng Wang