Samuel Albanie

CoRR, January, 2026

2025

A Rosetta Stone for AI Benchmarks.

Anson Ho

Jean-Stanislas Denain

David Atanasov

Rohin Shah

CoRR, December, 2025

A Good CREPE needs more than just Sugar: Investigating Biases in Compositional Vision-Language Benchmarks.

CoRR, June, 2025

Control Tax: The Price of Keeping AI in Check.

CoRR, June, 2025

A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility.

CoRR, April, 2025

An Approach to Technical AGI Safety and Security.

CoRR, April, 2025

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models.

CoRR, February, 2025

TeachText: CrossModal text-video retrieval through generalized distillation.

Artif. Intell., 2025

DeepMIM: Deep Supervision for Masked Image Modeling.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Inverse Constitutional AI: Compressing Preferences into Principles.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

GRAB: A Challenging Graph Analysis Benchmark for Large Multimodal Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Active Data Curation Effectively Distills Large-Scale Multimodal Models.

Vishaal Udandarao

Nikhil Parthasarathy

Muhammad Ferjad Naeem

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

How to Merge Your Multimodal Models Over Time?

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

GAMEBoT: Transparent Assessment of LLM Reasoning in Games.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models.

Trans. Mach. Learn. Res., 2024

Iterate Averaging in the Quest for Best Test Error.

Diego Granziol

Nicholas P. Baskerville

Xingchen Wan

Stephen Roberts

J. Mach. Learn. Res., 2024

Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games.

CoRR, 2024

A Practitioner's Guide to Continual Multimodal Pretraining.

CoRR, 2024

HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits.

Tim Franzmeyer

Aleksandar Shtedritski

CoRR, 2024

A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision.

CoRR, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models.

CoRR, 2024

Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress.

CoRR, 2024

A Practitioner's Guide to Real-World Continual Multimodal Pretraining.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Efficient Lifelong Model Evaluation in an Era of Rapid Progress.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

On scalable oversight with weak LLMs judging strong LLMs.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Visual Data-Type Understanding does not emerge from scaling Vision-Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

A Sound Approach: Using Large Language Models to Generate Audio Descriptions for Egocentric Text-Audio Retrieval.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

InstructVideo: Instructing Video Diffusion Models with Human Feedback.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits.

Tim Franzmeyer

Aleksandar Shtedritski

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Audio Retrieval With Natural Language Queries: A Benchmark Study.

A. Sophia Koepke

Zeynep Akata

IEEE Trans. Multim., 2023

arXiVeri: Automatic table verification with GPT.

CoRR, 2023

GPT4GEO: How a Language Model Sees the World's Geography.

CoRR, 2023

SATIN: A Multi-Task Metadataset for Classifying Satellite Imagery using Vision-Language Models.

CoRR, 2023

Can GPT-4 Perform Neural Architecture Search?

CoRR, 2023

Large Language Models are Few-shot Publication Scoopers.

CoRR, 2023

DeepMIM: Deep Supervision for Masked Image Modeling.

CoRR, 2023

RLIPv2: Fast Scaling of Relational Language-Image Pre-training.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SuS-X: Training-Free Name-Only Transfer of Vision-Language Models.

Vishaal Udandarao

Ankush Gupta

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Simple Baselines for Interactive Video Retrieval with Questions and Answers.

Kaiqu Liang

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Moment Detection in Long Tutorial Videos.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

NamedMask: Distilling Segmenters from Complementary Foundation Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Zero-shot Unsupervised Transfer Instance Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Crosslingual Generalization through Multitask Finetuning.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Scaling Up Sign Spotting Through Sign Language Dictionaries.

Int. J. Comput. Vis., 2022

A 23 MW data centre is all you need.

Dylan Campbell

CoRR, 2022

RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ReCo: Retrieve and Co-segment for Zero-shot Transfer.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Automatic Dense Annotation of Large-Vocabulary Sign Language Videos.

Proceedings of the Computer Vision - ECCV 2022, 2022

Unsupervised Salient Object Detection with Spectral Cluster Voting.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Sign Language Video Retrieval with Free-Form Textual Queries.

Amanda Cardoso Duarte

Xavier Giró-i-Nieto

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross Modal Retrieval with Querybank Normalisation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Weakly-supervised Fingerspelling Recognition in British Sign Language Videos.

Proceedings of the 33rd British Machine Vision Conference 2022, 2022

2021

BBC-Oxford British Sign Language Dataset.

CoRR, 2021

On the Origin of Species of Self-Supervised Learning.

Erika Lu

CoRR, 2021

Quantum Self-Supervised Learning.

CoRR, 2021

Preface.

Luca Bertinetto

Alex Hernández-García

Hazel Doughty

Proceedings of the NeurIPS 2021 Workshop on Pre-Registration in Machine Learning, 2021

Audio Retrieval with Natural Language Queries.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

All you need are a few pixels: semantic segmentation with PixelPick.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

TeachText: CrossModal Generalized Distillation for Text-Video Retrieval.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Aligning Subtitles in Sign Language Videos.

Hannah Bull

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Sign Language Segmentation with Temporal Convolutional Networks.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

QUERYD: A Video Dataset with High-Quality Text and Audio Narrations.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

SeeHear: Signer Diarisation and a New Dataset.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Read and Attend: Temporal Localisation in Sign Language Videos.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Sign Segmentation With Changepoint-Modulated Pseudo-Labelling.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval.

Yang Liu

Qingchao Chen

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Mind-the-Gap! Unsupervised Domain Adaptation for Text-Video Retrieval.

Qingchao Chen

Yang Liu

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Squeeze-and-Excitation Networks.

IEEE Trans. Pattern Anal. Mach. Intell., 2020

QuerYD: A video dataset with high-quality textual and audio narrations.

CoRR, 2020

Explaining the Adaptive Generalisation Gap.

CoRR, 2020

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020).

CoRR, 2020

State-of-Art-Reviewing: A Radical Proposal to Improve Scientific Publication.

CoRR, 2020

Preface.

Proceedings of the NeurIPS 2020 Workshop on Pre-registration in Machine Learning, 2020

Disentangled Speech Embeddings Using Cross-Modal Self-Supervision.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

SLRTP 2020: The Sign Language Recognition, Translation & Production Workshop.

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

BSL-1K: Scaling Up Co-articulated Sign Language Recognition Using Mouthing Cues.

Joon Son Chung

Neil Fox

Proceedings of the Computer Vision - ECCV 2020, 2020

Seeing wake words: Audio-visual Keyword Spotting.

Themos Stafylakis

Proceedings of the 31st British Machine Vision Conference 2020, 2020

Watch, Read and Lookup: Learning to Spot Signs from Multiple Supervisors.

Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

2019

Deep Industrial Espionage.

CoRR, 2019

Unsupervised Learning of Landmarks by Descriptor Vector Exchange.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Small Steps and Giant Leaps: Minimal Newton Solvers for Deep Learning.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Use What You Have: Video retrieval using representations from collaborative experts.

Proceedings of the 30th British Machine Vision Conference 2019, 2019

2018

Substitute Teacher Networks: Learning with Almost No Supervision.

James Thewlis

CoRR, 2018

Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Emotion Recognition in Speech using Cross-Modal Transfer in the Wild.

Proceedings of the 2018 ACM Multimedia Conference, 2018

Semi-convolutional Operators for Instance Segmentation.

Proceedings of the Computer Vision - ECCV 2018, 2018

Learnable PINs: Cross-modal Embeddings for Person Identity.

Arsha Nagrani

Proceedings of the Computer Vision - ECCV 2018, 2018

Self-Supervised Learning of Geometrically Stable Features Through Probabilistic Introspection.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Seeing Voices and Hearing Faces: Cross-Modal Biometric Matching.

Arsha Nagrani

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Unknowable Manipulators: Social Network Curator Algorithms.

Hillary Shakespeare

Tom Gunter

CoRR, 2017

Stopping GAN Violence: Generative Unadversarial Networks.

Sébastien Ehrhardt

CoRR, 2017

2016

Learning Grimaces by Watching TV.
