Bin Zhu

ORCID: 0000-0002-9213-2611

Affiliations:

Singapore Management University, School of Computing and Information Systems, Singapore
University of Bristol, UK (former)
City University of Hong Kong, Department of Computer Science, Kowloon Tong, Hong Kong (PhD 2021)

According to our database, Bin Zhu authored at least 38 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

CVLP-NaVD: Contrastive Visual-language Pre-training Models for Non-annotated Visual Description.

ACM Trans. Multim. Comput. Commun. Appl., November, 2025

Efficient Test-Time Retrieval Augmented Generation.

CoRR, November, 2025

Exploring Object Status Recognition for Recipe Progress Tracking in Non-Visual Cooking.

CoRR, July, 2025

Reasoning Models Are More Easily Gaslighted Than You Think.

CoRR, June, 2025

Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs.

CoRR, April, 2025

Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation.

CoRR, January, 2025

FoodLMM: A Versatile Food Assistant Using Large Multi-Modal Model.

IEEE Trans. Multim., 2025

From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios.

IEEE Trans. Multim., 2025

Retrieval Augmented Recipe Generation.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion.

Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

Efficient Prompt Tuning for Hierarchical Ingredient Recognition.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

HD-EPIC: A Highly-Detailed Egocentric Video Dataset.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

OSCAR: Object Status and Contextual Awareness for Recipes to Support Non-Visual Cooking.

Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2025

RAGG: Retrieval-Augmented Grasp Generation Model.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Efficient Unsupervised Video Hashing With Contextual Modeling and Structural Controlling.

IEEE Trans. Multim., 2024

Visual Cue Enhancement and Dual Low-Rank Adaptation for Efficient Visual Instruction Fine-Tuning.

CoRR, 2024

RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models.

CoRR, 2024

Model Inversion Attacks Through Target-Specific Conditional Diffusion Models.

CoRR, 2024

Active Object Segmentation: A New Modality for Egocentric Action Recognition.

Proceedings of the 6th ACM International Conference on Multimedia in Asia, 2024

Navigating Weight Prediction with Diet Diary.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Video Editing for Video Retrieval.

Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

CAR: Consolidation, Augmentation and Regulation for Recipe Retrieval.

CoRR, 2023

Cross-domain Food Image-to-Recipe Retrieval by Weighted Adversarial Learning.

CoRR, 2023

CgT-GAN: CLIP-guided Text GAN for Image Captioning.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022

Learning From Web Recipe-Image Pairs for Food Recognition: Problem, Baselines and Performance.

Bin Zhu, Chong-Wah Ngo, Wing Kwong Chan

IEEE Trans. Multim., 2022

Text-driven Video Prediction.

CoRR, 2022

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations.

Richard E. L. Higgins, Sanja Fidler, David Fouhey, Dima Damen

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Mix-DANN and Dynamic-Modal-Distillation for Video Domain Adaptation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Unsupervised Video Hashing with Multi-granularity Contextualization and Multi-structure Preservation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Cross-lingual Adaptation for Recipe Retrieval with Mixup.

Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

2021

Learning to Match Anchor-Target Video Pairs With Dual Attentional Holographic Networks.

Yanbin Hao, Chong-Wah Ngo, Bin Zhu

IEEE Trans. Image Process., 2021

A Study of Multi-Task and Region-Wise Deep Learning for Food Ingredient Recognition.

IEEE Trans. Image Process., 2021

Pyramid Fusion Dark Channel Prior for Single Image Dehazing.

Qiyuan Liang, Bin Zhu, Chong-Wah Ngo

CoRR, 2021

2020

Cross-domain Cross-modal Food Transfer.

Bin Zhu, Chong-Wah Ngo, Jingjing Chen

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Person-level Action Recognition in Complex Events via TSD-TSM Networks.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

CookGAN: Causality Based Text-to-Image Synthesis.

Bin Zhu, Chong-Wah Ngo

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
