Alexander Toshev

ORCID: 0000-0003-0925-638X

According to our database, Alexander Toshev authored at least 57 papers between 2006 and 2024.

Bibliography

2024
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training.
CoRR, 2024

Scalable Pre-training of Large Autoregressive Image Models.
CoRR, 2024

2023
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation.
CoRR, 2023

Large Language Models as Generalizable Policies for Embodied Tasks.
CoRR, 2023

Data Filtering Networks.
CoRR, 2023

Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts.
CoRR, 2023

Principles and Guidelines for Evaluating Social Robot Navigation Algorithms.
CoRR, 2023

Value function estimation using conditional diffusion models for control.
CoRR, 2023

On Robustness in Multimodal Learning.
CoRR, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.
CoRR, 2023

Robustness in Multimodal Learning under Train-Test Modality Mismatch.
Proceedings of the International Conference on Machine Learning, 2023

Perceptual Grouping in Contrastive Vision-Language Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

STAIR: Learning Sparse Text and Image Representation in Grounded Tokens.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Socially CompliAnt Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation.
IEEE Robotics Autom. Lett., 2022

Perceptual Grouping in Vision-Language Models.
CoRR, 2022

Retrospectives on the Embodied AI Workshop.
CoRR, 2022

Gesture2Path: Imitation Learning for Gesture-aware Navigation.
CoRR, 2022

A Protocol for Validating Social Navigation Policies.
CoRR, 2022

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances.
CoRR, 2022

GAUDI: A Neural Architect for Immersive 3D Scene Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning.
Proceedings of the Tenth International Conference on Learning Representations, 2022


2021
ReLMoGen: Integrating Motion Generation in Reinforcement Learning for Mobile Manipulation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

2020
Interactive Gibson Benchmark: A Benchmark for Interactive Navigation in Cluttered Environments.
IEEE Robotics Autom. Lett., 2020

ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation.
CoRR, 2020

ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects.
CoRR, 2020

Adversarial Generative Grammars for Human Activity Prediction.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learning Object-conditioned Exploration using Distributed Soft Actor Critic.
Proceedings of the 4th Conference on Robot Learning, 2020

Modeling Long-horizon Tasks as Sequential Interaction Landscapes.
Proceedings of the 4th Conference on Robot Learning, 2020

2019
Interactive Gibson: A Benchmark for Interactive Navigation in Cluttered Environments.
CoRR, 2019

Long Range Neural Navigation Policies for the Real World.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Visual Representations for Semantic Target Driven Navigation.
Proceedings of the International Conference on Robotics and Automation, 2019

Evolving Space-Time Neural Architectures for Videos.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Self-supervisory Signals for Object Discovery and Detection.
CoRR, 2018

Visual Representations for Semantic Target Driven Navigation.
CoRR, 2018

Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Sim2Real View Invariant Visual Servoing by Recurrent Control.
CoRR, 2017

No Fuss Distance Metric Learning Using Proxies.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Towards Accurate Multi-person Pose Estimation in the Wild.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Chained Predictions Using Convolutional Neural Networks.
Proceedings of the Computer Vision - ECCV 2016, 2016

Generation and Comprehension of Unambiguous Object Descriptions.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Pose Embeddings: A Deep Architecture for Learning to Match Human Poses.
CoRR, 2015

Show and tell: A neural image caption generator.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Deep Convolutional Ranking for Multilabel Image Annotation.
Proceedings of the 2nd International Conference on Learning Representations, 2014

DeepPose: Human Pose Estimation via Deep Neural Networks.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Scalable Object Detection Using Deep Neural Networks.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Deep Neural Networks for Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

2012
Shape-Based Object Detection via Boundary Structure Segmentation.
Int. J. Comput. Vis., 2012

2010
Cascaded Models for Articulated Pose Estimation.
Proceedings of the Computer Vision - ECCV 2010, 2010

Object detection via boundary structure segmentation.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

Detecting and parsing architecture at city scale from range data.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Shape-based object recognition in videos using 3D synthetic object models.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2007
Image Matching via Saliency Region Correspondences.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
An APRIORI-based Method for Frequent Composite Event Discovery in Videos.
Proceedings of the 2006 IEEE International Conference on Computer Vision Systems, 2006

