Amir Bar

ORCID: 0000-0001-8066-0495

According to our database, Amir Bar authored at least 47 papers between 2010 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

PGT: Procedurally Generated Tasks for improving visual grounding in MLLMs.

Rim Assouel

Amir Bar

Michal Drozdzal

Adriana Romero-Soriano

CoRR, May, 2026

Lifting Embodied World Models for Planning and Control.

CoRR, April, 2026

Hierarchical Planning with Latent World Models.

CoRR, April, 2026

V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning.

CoRR, March, 2026

Beyond Language Modeling: An Exploration of Multimodal Pretraining.

CoRR, March, 2026

A Lightweight Library for Energy-Based Joint-Embedding Predictive Architectures.

CoRR, February, 2026

Grounding Generated Videos in Feasible Plans via World Models.

Christos Ziakas

Amir Bar

Alessandra Russo

CoRR, February, 2026

Parallel Stochastic Gradient-Based Planning for World Models.

Michael Psenka

Michael Rabbat

Aditi S. Krishnapriyan

Yann LeCun

Amir Bar

CoRR, February, 2026

DeFM: Learning Foundation Representations from Depth for Robotics.

CoRR, January, 2026

2025

World Models Can Leverage Human Videos for Dexterous Manipulation.

Raktim Gautam Goswami

Prashanth Krishnamurthy

Michael Rabbat

Farshad Khorrami

Yann LeCun

CoRR, December, 2025

TV2TV: A Unified Framework for Interleaved Language and Video Generation.

CoRR, December, 2025

From Generated Human Videos to Physically Plausible Robot Trajectories.

CoRR, December, 2025

OpenApps: Simulating Environment Variations to Measure UI-Agent Reliability.

CoRR, November, 2025

Forgotten Polygons: Multimodal Large Language Models are Shape-Blind.

CoRR, February, 2025

Whole-Body Conditioned Egocentric Video Prediction.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Vision-Language Models Create Cross-Modal Task Representations.

Grace Luo

Trevor Darrell

Amir Bar

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Scaling Language-Free Visual Representation Learning.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Navigation World Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Forgotten Polygons: Multimodal Large Language Models are Shape-Blind.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks.

Trans. Mach. Learn. Res., 2024

Task Vectors are Cross-Modal.

Grace Luo

Trevor Darrell

Amir Bar

CoRR, 2024

Stochastic positional embeddings improve masked image modeling.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Finding Visual Task Vectors.

Proceedings of the Computer Vision - ECCV 2024, 2024

EgoPet: Egomotion and Interaction Data from an Animal's Perspective.

Jathushan Rajasegaran

Yann LeCun

Amir Globerson

Trevor Darrell

Proceedings of the Computer Vision - ECCV 2024, 2024

Sequential Modeling Enables Scalable Learning for Large Vision Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Object-based (yet Class-agnostic) Video Domain Adaptation.

CoRR, 2023

Predicting masked tokens in stochastic locations improves masked image modeling.

CoRR, 2023

A Cookbook of Self-Supervised Learning.

CoRR, 2023

2022

Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022.

CoRR, 2022

Using Machine Learning to Identify Intravenous Contrast Phases on Computed Tomography.

Raouf Muhamedrahimov

Amir Bar

Jonathan Laserson

Ayelet Akselrod-Ballin

Eldad Elnekave

Comput. Methods Programs Biomed., 2022

Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Visual Prompting via Image Inpainting.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Object-Region Video Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DETReg: Unsupervised Pretraining with Region Priors for Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Learning Interclass Relations for Intravenous Contrast Phase Classification in CT.

Raouf Muhamedrahimov

Amir Bar

Ayelet Akselrod-Ballin

Proceedings of the Medical Imaging with Deep Learning, 7-9 July 2021, Lübeck, Germany., 2021

Compositional Video Synthesis with Action Graphs.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Compositional Video Synthesis with Action Graphs.

CoRR, 2020

Learning Interclass Relations for Image Classification.

Raouf Muhamedrahimov

Amir Bar

Ayelet Akselrod-Ballin

CoRR, 2020

3D Convolutional Sequence to Sequence Model for Vertebral Compression Fractures Identification in CT.

Ayelet Akselrod-Ballin

Amir Bar

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Learning Canonical Representations for Scene Graph to Image Generation.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

PHT-bot: a deep learning based system for automatic risk stratification of COPD patients based upon signs of pulmonary hypertension.

Proceedings of the Medical Imaging 2019: Computer-Aided Diagnosis, San Diego, 2019

Improved ICH Classification Using Task-Dependent Learning.

Proceedings of the 16th IEEE International Symposium on Biomedical Imaging, 2019

Learning Individual Styles of Conversational Gesture.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2017

Language Generation with Recurrent Generative Adversarial Networks without Pre-training.

CoRR, 2017

Compression fractures detection on CT.

Proceedings of the Medical Imaging 2017: Computer-Aided Diagnosis, 2017

2010

Widespread Compensatory Evolution Conserves DNA-Encoded Nucleosome Organization in Yeast.

PLoS Comput. Biol., 2010
