Bin Zhu

ORCID: 0000-0002-9213-2611

Affiliations:
  • Singapore Management University, School of Computing and Information Systems, Singapore
  • University of Bristol, UK (former)
  • City University of Hong Kong, Department of Computer Science, Kowloon Tong, Hong Kong (PhD 2021)


According to our database, Bin Zhu authored at least 36 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Exploring Object Status Recognition for Recipe Progress Tracking in Non-Visual Cooking.
CoRR, July, 2025

Reasoning Models Are More Easily Gaslighted Than You Think.
CoRR, June, 2025

Efficient Prompt Tuning for Hierarchical Ingredient Recognition.
CoRR, April, 2025

Don't Deceive Me: Mitigating Gaslighting through Attention Reallocation in LMMs.
CoRR, April, 2025

Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation.
CoRR, January, 2025

From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios.
IEEE Trans. Multim., 2025

Retrieval Augmented Recipe Generation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

HD-EPIC: A Highly-Detailed Egocentric Video Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

OSCAR: Object Status and Contextual Awareness for Recipes to Support Non-Visual Cooking.
Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2025

RAGG: Retrieval-Augmented Grasp Generation Model.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Efficient Unsupervised Video Hashing With Contextual Modeling and Structural Controlling.
IEEE Trans. Multim., 2024

Visual Cue Enhancement and Dual Low-Rank Adaptation for Efficient Visual Instruction Fine-Tuning.
CoRR, 2024

RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models.
CoRR, 2024

Model Inversion Attacks Through Target-Specific Conditional Diffusion Models.
CoRR, 2024

Active Object Segmentation: A New Modality for Egocentric Action Recognition.
Proceedings of the 6th ACM International Conference on Multimedia in Asia, 2024

Navigating Weight Prediction with Diet Diary.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024

Video Editing for Video Retrieval.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
FoodLMM: A Versatile Food Assistant using Large Multi-modal Model.
CoRR, 2023

CAR: Consolidation, Augmentation and Regulation for Recipe Retrieval.
CoRR, 2023

Cross-domain Food Image-to-Recipe Retrieval by Weighted Adversarial Learning.
CoRR, 2023

CgT-GAN: CLIP-guided Text GAN for Image Captioning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Learning From Web Recipe-Image Pairs for Food Recognition: Problem, Baselines and Performance.
IEEE Trans. Multim., 2022

Text-driven Video Prediction.
CoRR, 2022

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Mix-DANN and Dynamic-Modal-Distillation for Video Domain Adaptation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Unsupervised Video Hashing with Multi-granularity Contextualization and Multi-structure Preservation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Cross-lingual Adaptation for Recipe Retrieval with Mixup.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

2021
Learning to Match Anchor-Target Video Pairs With Dual Attentional Holographic Networks.
IEEE Trans. Image Process., 2021

A Study of Multi-Task and Region-Wise Deep Learning for Food Ingredient Recognition.
IEEE Trans. Image Process., 2021

Pyramid Fusion Dark Channel Prior for Single Image Dehazing.
CoRR, 2021

2020
Cross-domain Cross-modal Food Transfer.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Person-level Action Recognition in Complex Events via TSD-TSM Networks.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

CookGAN: Causality Based Text-to-Image Synthesis.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

