David A. Ross

CoRR, May, 2026

2025

MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation.

CoRR, February, 2025

A density version of a theorem of Banach.

Nitesh Bharadwaj Gundavarapu

J. Log. Anal., 2025

Language-Guided Image Tokenization for Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code.

CoRR, 2024

VideoPoet: A Large Language Model for Zero-Shot Video Generation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

VideoPrism: A Foundational Visual Encoder for Video Understanding.

Long Zhao

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Language Model Beats Diffusion - Tokenizer is key to visual generation.

Lijun Yu

José Lezama

Nitesh Bharadwaj Gundavarapu

Alexander G. Hauptmann

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Distribution Aware Metrics for Conditional Natural Language Generation.

Yiming Ni

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023

AVIS: Autonomous Visual Information Seeking with Large Language Models.

CoRR, 2023

IC3: Image Captioning by Committee Consensus.

CoRR, 2023

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs.

Alexander G. Hauptmann

Lu Jiang

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

AVIS: Autonomous Visual Information Seeking with Large Language Model Agent.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UnLoc: A Unified Framework for Video Localization Tasks.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

IC3: Image Captioning by Committee Consensus.

David Chan

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features.

Vivek Rathod

Bryan Seybold

CoRR, 2022

im2nerf: Image to Neural Radiance Field in the Wild.

CoRR, 2022

What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics.

Bryan Seybold

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021

Optical Mouse: 3D Mouse Pose From Single-View Video.

CoRR, 2021

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation.

CoRR, 2021

AI Choreographer: Music Conditioned 3D Dance Generation with AIST++.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

Learning Video Representations from Textual Web Supervision.

CoRR, 2020

The AVA-Kinetics Localized Human Actions Video Dataset.

CoRR, 2020

D3D: Distilled 3D Networks for Video Action Recognition.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Pillar-Based Object Detection for Autonomous Driving.

Proceedings of the Computer Vision - ECCV 2020, 2020

Virtual Multi-view Fusion for 3D Semantic Segmentation.

Proceedings of the Computer Vision - ECCV 2020, 2020

An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds.

Proceedings of the Computer Vision - ECCV 2020, 2020

DOPS: Learning to Detect 3D Objects and Predict Their 3D Shapes.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Speech2Action: Cross-Modal Supervision for Action Recognition.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Active Learning for Video Description with Cluster-Regularized Ensemble Ranking.

Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

2018

High-resolution Functional Magnetic Resonance Imaging Reveals Configural Processing of Cars in Right Anterior Fusiform Face Area of Car Experts.

Benjamin J. Tamber-Rosenau

J. Cogn. Neurosci., 2018

AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Rethinking the Faster R-CNN Architecture for Temporal Action Localization.

Yu-Wei Chao

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions.

Chunhui Gu

Chen Sun

CoRR, 2017

2014

Building the Next Generation of Quantitative Biologists.

Proceedings of the Biocomputing 2014: Proceedings of the Pacific Symposium, 2014

2013

Extensions and applications of the S-measure construction.

J. Symb. Log., 2013

2012

On Using Nearly-Independent Feature Families for High Precision and Confidence.

Omid Madani

Manfred Georg

Proceedings of the 4th Asian Conference on Machine Learning, 2012

The Intervalgram: An Audio Feature for Large-Scale Cover-Song Recognition.

Thomas C. Walters

Richard F. Lyon

Proceedings of the From Sounds to Music and Emotions - 9th International Symposium, 2012

2011

Survey and Evaluation of Audio Fingerprinting Schemes for Mobile Query-by-Example Applications.

Vijay Chandrasekhar

Matt Sharifi

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

The power of comparative reasoning.

Proceedings of the IEEE International Conference on Computer Vision, 2011

Automatic Language Identification in music videos with low level audio and visual features.

Vijay Chandrasekhar

Mehmet Emre Sargin

Proceedings of the IEEE International Conference on Acoustics, 2011

Helping Hands versus ERSP Vision: Comparing Object Recognition Technologies for the Visually Impaired.

Proceedings of the HCI International 2011 - Posters' Extended Abstracts, 2011

2010

Learning Articulated Structure and Motion.

Daniel Tarlow

Int. J. Comput. Vis., 2010

SPEC hashing: Similarity preserving algorithm for entropy-based coding.

Ruei-Sung Lin

Jay Yagnik

Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009

Learning Probabilistic Models for Visual Motion.

PhD thesis, 2009

Process engineering: A necessary step to a better public health system.

Inf. Knowl. Syst. Manag., 2009

2008

Incremental Learning for Robust Visual Tracking.

Int. J. Comput. Vis., 2008

Distributed online anomaly detection in high-content screening.

Mahadev Satyanarayanan

Proceedings of the 2008 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2008

Unsupervised Learning of Skeletons from Motion.

Daniel Tarlow

Proceedings of the Computer Vision, 2008

Learning stick-figure models using nonparametric Bayesian priors over trees.

Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

The Blind Leading the Blind: Toward Collaborative Online Route Information Management by Individuals with Visual Impairments.

Proceedings of the Social Information Processing, 2008

2006

A nonstandard proof of a lemma from constructive measure theory.

Math. Log. Q., 2006

Learning Parts-Based Representations of Data.

J. Mach. Learn. Res., 2006

Combining discriminative features to infer complex trajectories.

Simon Osindero

Proceedings of the Machine Learning, 2006

2005

An Elementary Proof of Lyapunov's Theorem.

Am. Math. Mon., 2005

Talking braille: a wireless ubiquitous computing network for orientation and wayfinding.

Alexander Lightman

Proceedings of the ACM SIGACCESS Conference on Computers and Accessibility, 2005

2004

Cyber Crumbs for Successful Aging with Vision Loss.

IEEE Pervasive Comput., 2004

Adaptive Discriminative Generative Model and Its Applications.

Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Incremental Learning for Visual Tracking.

Proceedings of the Advances in Neural Information Processing Systems 17, 2004

Adaptive Probabilistic Visual Tracking with Incremental Subspace Update.

Jongwoo Lim

Ming-Hsuan Yang

Proceedings of the Computer Vision, 2004

2003

Universal Design: Lessons for Wearable Computing.

Maribeth Gandy

Thad E. Starner

IEEE Pervasive Comput., 2003

2002

Development of a Wearable Computer Orientation System.

Bruce B. Blasch

Pers. Ubiquitous Comput., 2002

Multiple Cause Vector Quantization.

Proceedings of the Advances in Neural Information Processing Systems 15, 2002

2001

Implementing Assistive Technology on Wearable Computers.

IEEE Intell. Syst., 2001

2000

Evaluation of Orientation Interfaces for Wearable Computers.

Bruce B. Blasch

Proceedings of the Fourth International Symposium on Wearable Computers (ISWC 2000), 2000

Wearable interfaces for orientation and wayfinding.

Bruce B. Blasch

Proceedings of the ACM Conference on Assistive Technologies, 2000

1998

Wearable computers as a virtual environment interface for people with visual impairment.

Virtual Real., 1998

1997

The Wearable Computer as a Remote Interface for People with Disabilities.

Jon A. Sanford

Proceedings of the First International Symposium on Wearable Computers (ISWC 1997), 1997

1994

Toward functional magnetic stimulation (FMS) theory and experiment.

Kent Davey

Lanbo Luo