Thomas Mensink

Orcid: 0000-0002-5730-713X

According to our database, Thomas Mensink authored at least 77 papers between 2007 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Dual-Rate Diffusion: Accelerating diffusion models with an interleaved heavy-light network.

[DOI]

Grigory Bartosh

,

,

Emiel Hoogeboom

,

,

,

CoRR, May, 2026

Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD.

[DOI]

Emiel Hoogeboom

,

,

,

,

CoRR, March, 2026

Unified Latents (UL): How to train your latents.

[DOI]

,

Emiel Hoogeboom

,

,

CoRR, February, 2026

2025

Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion.

[DOI]

Emiel Hoogeboom

,

,

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion.

[DOI]

Emiel Hoogeboom

,

,

,

,

,

CoRR, 2024

HAMMR: HierArchical MultiModal React agents for generic VQA.

[DOI]

Lluís Castrejón

,

,

,

Vittorio Ferrari

,

,

Jasper R. R. Uijlings

CoRR, 2024

Multistep Distillation of Diffusion Models via Moment Matching.

[DOI]

,

,

,

Emiel Hoogeboom

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

Scaling Vision Transformers to 22 Billion Parameters.

[DOI]

Proceedings of the International Conference on Machine Learning, 2023

Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories.

[DOI]

,

Jasper R. R. Uijlings

,

Lluís Castrejón

,

,

,

,

,

,

Vittorio Ferrari

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

How (not) to ensemble LVLMs for VQA.

[DOI]

,

Lluís Castrejón

,

Mostafa Dehghani

,

,

Jasper R. R. Uijlings

,

Proceedings on "I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models" at NeurIPS 2023 Workshops, 2023

Infinite Class Mixup.

[DOI]

,

Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022

Factors of Influence for Transfer Learning Across Diverse Appearance Domains and Task Types.

[DOI]

,

Jasper R. R. Uijlings

,

Alina Kuznetsova

,

,

Vittorio Ferrari

IEEE Trans. Pattern Anal. Mach. Intell., 2022

The Missing Link: Finding Label Relations Across Datasets.

[DOI]

Jasper R. R. Uijlings

,

,

Vittorio Ferrari

Proceedings of the Computer Vision - ECCV 2022, 2022

How Stable Are Transferability Metrics Evaluations?

[DOI]

Andrea Agostinelli

,

,

Jasper R. R. Uijlings

,

,

Vittorio Ferrari

Proceedings of the Computer Vision - ECCV 2022, 2022

Transferability Estimation using Bhattacharyya Class Separability.

[DOI]

,

Andrea Agostinelli

,

Jasper R. R. Uijlings

,

Vittorio Ferrari

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Transferability Metrics for Selecting Source Model Ensembles.

[DOI]

Andrea Agostinelli

,

Jasper R. R. Uijlings

,

,

Vittorio Ferrari

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Automatic generation of dense non-rigid optical flow.

[DOI]

,

Tushar Nimbhorkar

,

,

Anil S. Baslamisli

,

,

Comput. Vis. Image Underst., 2021

EDEN: Multimodal Synthetic Dataset of Enclosed GarDEN Scenes.

[DOI]

,

,

,

,

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Multi-Loss Weighting with Coefficient of Variations.

[DOI]

Rick Groenendijk

,

,

,

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Neural Feature Matching in Implicit 3D Representations.

[DOI]

,

Basura Fernando

,

,

,

Efstratios Gavves

Proceedings of the 38th International Conference on Machine Learning, 2021

Calibration of Neural Networks using Splines.

[DOI]

,

,

Thalaiyasingam Ajanthan

,

,

Cristian Sminchisescu

,

Richard Hartley

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

On the benefit of adversarial training for monocular depth estimation.

[DOI]

Rick Groenendijk

,

,

,

Comput. Vis. Image Underst., 2020

Post-hoc Calibration of Neural Networks.

[DOI]

,

,

Thalaiyasingam Ajanthan

,

,

Cristian Sminchisescu

,

Richard Hartley

CoRR, 2020

PointMixup: Augmentation for Point Clouds.

[DOI]

,

,

Efstratios Gavves

,

,

,

,

Cees G. M. Snoek

Proceedings of the Computer Vision - ECCV 2020, 2020

Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection.

[DOI]

,

,

,

Dragomir Anguelov

,

Cristian Sminchisescu

Proceedings of the 4th Conference on Robot Learning, 2020

Novel View Synthesis from Single Images via Point Cloud Transformation.

[DOI]

,

,

,

Proceedings of the 31st British Machine Vision Conference 2020, 2020

2019

New Modality: Emoji Challenges in Prediction, Anticipation, and Retrieval.

[DOI]

Spencer Cappallo

,

Stacey Svetlichnaya

,

Pierre Garrigues

,

,

Cees G. M. Snoek

IEEE Trans. Multim., 2019

IterGANs: Iterative GANs to learn and control 3D object transformation.

[DOI]

,

Comput. Vis. Image Underst., 2019

Interactive Exploration of Journalistic Video Footage through Multimodal Semantic Matching.

[DOI]

,

,

,

,

,

,

Maurits van der Goes

,

,

Emiel van Miltenburg

,

,

,

,

Proceedings of the 27th ACM International Conference on Multimedia, 2019

3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation.

[DOI]

,

,

Efstratios Gavves

Proceedings of the 2019 International Conference on 3D Vision, 2019

2018

Guest Editorial.

[DOI]

Lamberto Ballan

,

,

,

,

,

Rahul Sukthankar

Comput. Vis. Image Underst., 2018

Unsupervised Generation of Optical Flow Datasets from Videos in the Wild.

[DOI]

,

Tushar Nimbhorkar

,

,

Anil S. Baslamisli

,

,

CoRR, 2018

DeepNCM: Deep Nearest Class Mean Classifiers.

[DOI]

Samantha Guerriero

,

,

Proceedings of the 6th International Conference on Learning Representations, 2018

Iterative GANs for Rotating Visual Objects.

[DOI]

,

Proceedings of the 6th International Conference on Learning Representations, 2018

Three for one and one for three: Flow, Segmentation, and Surface Normals.

[DOI]

,

Anil S. Baslamisli

,

,

Proceedings of the British Machine Vision Conference 2018, 2018

2017

Video2vec Embeddings Recognize Events When Examples Are Scarce.

[DOI]

AmirHossein Habibian

,

,

Cees G. M. Snoek

IEEE Trans. Pattern Anal. Mach. Intell., 2017

Music-Guided Video Summarization using Quadratic Assignments.

[DOI]

,

Thomas Jongstra

,

,

Cees G. M. Snoek

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

Spotting Audio-Visual Inconsistencies (SAVI) in Manipulated Video.

[DOI]

Robert C. Bolles

,

,

Martin Graciarena

,

,

,

Mitchell McLaren

,

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

2016

Online Open World Recognition.

[DOI]

,

,

CoRR, 2016

Learning to Reuse Visual Knowledge.

[DOI]

Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, 2016

Pooling Objects for Recognizing Scenes without Examples.

[DOI]

Svetlana Kordumova

,

,

Cees G. M. Snoek

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Video Stream Retrieval of Unseen Queries using Semantic Memory.

[DOI]

Spencer Cappallo

,

,

Proceedings of the British Machine Vision Conference 2016, 2016

2015

VideoStory Embeddings Recognize Events when Examples are Scarce.

[DOI]

AmirHossein Habibian

,

,

Cees G. M. Snoek

CoRR, 2015

Image2Emoji: Zero-shot Emoji Prediction for Visual Media.

[DOI]

Spencer Cappallo

,

,

Cees G. M. Snoek

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Query-by-Emoji Video Search.

[DOI]

Spencer Cappallo

,

,

Cees G. M. Snoek

Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Bag-of-Fragments: Selecting and Encoding Video Fragments for Event Detection and Recounting.

[DOI]

,

Jan C. van Gemert

,

Spencer Cappallo

,

,

Cees G. M. Snoek

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Discovering Semantic Vocabularies for Cross-Media Retrieval.

[DOI]

AmirHossein Habibian

,

,

Cees G. M. Snoek

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Latent Factors of Visual Popularity Prediction.

[DOI]

Spencer Cappallo

,

,

Cees G. M. Snoek

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Objects2action: Classifying and Localizing Actions without Any Video Example.

[DOI]

,

Jan C. van Gemert

,

,

Cees G. M. Snoek

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Active Transfer Learning with Zero-Shot Priors: Reusing Past Datasets for Future Tasks.

[DOI]

Efstratios Gavves

,

,

Tatiana Tommasi

,

Cees G. M. Snoek

,

Tinne Tuytelaars

Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Event Fisher Vectors: Robust Encoding Visual Diversity of Visual Streams.

[DOI]

,

,

Cees G. M. Snoek

Proceedings of the British Machine Vision Conference 2015, 2015

2014

Robustifying Descriptor Instability Using Fisher Vectors.

[DOI]

,

Jan C. van Gemert

,

,

IEEE Trans. Image Process., 2014

MediaMill at TRECVID 2014: Searching Concepts, Objects, Instances and Events in Video.

[DOI]

Cees G. M. Snoek

,

Koen E. A. van de Sande

,

Daniel Fontijne

,

Spencer Cappallo

,

,

AmirHossein Habibian

,

,

,

,

Dennis C. Koelma

,

Arnold W. M. Smeulders

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

The 2014 SESAME Multimedia Event Detection and Recounting System.

[DOI]

Robert C. Bolles

,

,

James A. Herson

,

Gregory K. Myers

,

Julien van Hout

,

,

,

,

AmirHossein Habibian

,

Dennis C. Koelma

,

,

Arnold W. M. Smeulders

,

Cees G. M. Snoek

,

,

,

,

,

,

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

VideoStory: A New Multimedia Embedding for Few-Example Recognition and Translation of Events.

[DOI]

AmirHossein Habibian

,

,

Cees G. M. Snoek

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

The Rijksmuseum Challenge: Museum-Centered Visual Recognition.

[DOI]

,

Jan C. van Gemert

Proceedings of the International Conference on Multimedia Retrieval, 2014

Composite Concept Discovery for Zero-Shot Video Event Detection.

[DOI]

AmirHossein Habibian

,

,

Cees G. M. Snoek

Proceedings of the International Conference on Multimedia Retrieval, 2014

Attributes Make Sense on Segmented Objects.

[DOI]

,

Efstratios Gavves

,

,

Cees G. M. Snoek

Proceedings of the Computer Vision - ECCV 2014, 2014

COSTA: Co-Occurrence Statistics for Zero-Shot Classification.

[DOI]

,

Efstratios Gavves

,

Cees G. M. Snoek

Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013

Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets.

[DOI]

,

,

Florent Perronnin

Proceedings of the Advanced Topics in Computer Vision, 2013

Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost.

[DOI]

,

,

Florent Perronnin

,

Gabriela Csurka

IEEE Trans. Pattern Anal. Mach. Intell., 2013

Tree-Structured CRF Models for Interactive Image Labeling.

[DOI]

,

,

Gabriela Csurka

IEEE Trans. Pattern Anal. Mach. Intell., 2013

Image Classification with the Fisher Vector: Theory and Practice.

[DOI]

,

Florent Perronnin

,

,

Int. J. Comput. Vis., 2013

2012

Learning Image Classification and Retrieval Models. (Apprentissage de Modèles pour la Classification et la Recherche d'Images).

[DOI]

PhD thesis, 2012

Face Recognition from Caption-Based Supervision.

[DOI]

Matthieu Guillaumin

,

,

,

Cordelia Schmid

Int. J. Comput. Vis., 2012

Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost.

[DOI]

,

,

Florent Perronnin

,

Gabriela Csurka

Proceedings of the Computer Vision - ECCV 2012, 2012

2011

Learning structured prediction models for interactive image labeling.

[DOI]

,

,

Gabriela Csurka

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010

Image annotation with tagprop on the MIRFLICKR set.

[DOI]

,

Matthieu Guillaumin

,

,

Cordelia Schmid

Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

Improving the Fisher Kernel for Large-Scale Image Classification.

[DOI]

Florent Perronnin

,

,

Proceedings of the Computer Vision - ECCV 2010, 2010

EP for Efficient Stochastic Control with Obstacles.

[DOI]

,

,

Proceedings of the ECAI 2010, 2010

LEAR and XRCE's Participation to Visual Concept Detection Task - ImageCLEF 2010.

[DOI]

,

Gabriela Csurka

,

Florent Perronnin

,

,

Proceedings of the CLEF 2010 LABs and Workshops, 2010

Trans Media Relevance Feedback for Image Autoannotation.

[DOI]

,

,

Gabriela Csurka

Proceedings of the British Machine Vision Conference, 2010

2009

TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation.

[DOI]

Matthieu Guillaumin

,

,

,

Cordelia Schmid

Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

INRIA-LEAR's Participation in ImageCLEF 2009.

[DOI]

,

Matthieu Guillaumin

,

,

Cordelia Schmid

,

Proceedings of the Working Notes for CLEF 2009 Workshop co-located with the 13th European Conference on Digital Libraries (ECDL 2009), Corfù, Greece, September 30, 2009

2008

Improving People Search Using Query Expansions.

[DOI]

,

Proceedings of the Computer Vision - ECCV 2008, 2008

Automatic face naming with caption-based supervision.

[DOI]

Matthieu Guillaumin

,

,

,

Cordelia Schmid

Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2007

Distributed EM Learning for Appearance Based Multi-Camera Tracking.

[DOI]

,

Wojciech Zajdel

,

Ben J. A. Kröse

Proceedings of the 2007 First ACM/IEEE International Conference on Distributed Smart Cameras, 2007
