Rao Muhammad Anwer

CoRR, April, 2026

CoVR-R: Reason-Aware Composed Video Retrieval.

Bogireddy Sai Prasanna Teja

Dmitry Demidov

Vaishnav Potlapalli

Viswanatha Reddy Gajjala

Alaa Mostafa Lasheen

CoRR, March, 2026

MediX-R1: Open Ended Medical Reinforcement Learning.

CoRR, February, 2026

Audit After Segmentation: Reference-Free Mask Quality Assessment for Language-Referred Audio-Visual Segmentation.

CoRR, February, 2026

MedROV: Towards Real-Time Open-Vocabulary Detection Across Diverse Medical Imaging Modalities.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2026

DuwatBench: Bridging Language and Visual Heritage through an Arabic Calligraphy Benchmark for Multimodal Understanding.

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Bridge the Intra-Class Gap: K-Shot Multi-Scale Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2025

Thinking Beyond Labels: Vocabulary-Free Fine-Grained Recognition using Reasoning-Augmented LMMs.

CoRR, December, 2025

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos.

Jaseel Muhammad Kaithakkodan

Jinxing Zhou

CoRR, December, 2025

EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards.

Shravan Venkatraman

Ritesh Thawkar

CoRR, November, 2025

Advanced Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection.

IEEE Trans. Pattern Anal. Mach. Intell., September, 2025

How Good are Foundation Models in Step-by-Step Embodied Reasoning?

CoRR, September, 2025

AI in Agriculture: A Survey of Deep Learning Techniques for Crops, Fisheries and Livestock.

Umair Nawaz

Muhammad Zaigham Zaheer

CoRR, July, 2025

TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation.

Muhammad Sohail Danish

Muhammad Akhtar Munir

Syed Roshaan Ali Shah

CoRR, June, 2025

NTRENet++: Unleashing the Power of Non-Target Knowledge for Few-Shot Semantic Segmentation.

IEEE Trans. Circuits Syst. Video Technol., May, 2025

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks.

CoRR, May, 2025

ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark.

CoRR, May, 2025

OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning.

CoRR, May, 2025

Foundation Models Defining a New Era in Vision: A Survey and Outlook.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2025

Adapting In-Domain Few-Shot Segmentation to New Domains without Retraining.

CoRR, April, 2025

Tracking Meets Large Multimodal Models for Driving Scenario Understanding.

CoRR, March, 2025

LLM Post-Training: A Deep Dive into Reasoning Large Language Models.

CoRR, February, 2025

CLIMB-3D: Continual Learning for Imbalanced 3D Instance Segmentation.

CoRR, February, 2025

AIN: The Arabic INclusive Large Multimodal Model.

CoRR, February, 2025

Video Instance Segmentation in an Open-World.

Int. J. Comput. Vis., January, 2025

Video Instance Segmentation Without Using Mask and Identity Supervision.

IEEE Trans. Multim., 2025

Palo: A Polyglot Large Multimodal Model for 5B People.

Ali Husain Salem Abdulla Alharthi

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning.

Muhammad Awais

Amandeep Kumar

Hisham Cholakkal

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

DEFT: Decompositional Efficient Fine-Tuning for Text-to-Image Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Ali Husain Salem Abdulla Alharthi

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MIRA: A Novel Framework for Fusing Modalities in Medical RAG.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

SPARTA: Spectral Prompt Agnostic Adversarial Attack on Medical Vision-Language Models.

Proceedings of the Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, 2025

IHA-YOLO: Inter-Head Attention for Real-Time Cell Detection.

Proceedings of the 22nd IEEE International Symposium on Biomedical Imaging, 2025

BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases.

Muhammad Awais

Mehaboobathunnisa Sahul Hameed

Proceedings of the 22nd IEEE International Symposium on Biomedical Imaging, 2025

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking.

Ayesha Ishaq

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

RAGNet: Large-Scale Reasoning-Based Affordance Segmentation Benchmark Towards General Grasping.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

TAViS: Text-bridged Audio-Visual Segmentation with Foundation Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

All in One: Visual-Description-Guided Unified Point Cloud Segmentation.

Zongyan Han

Jiahua Dong

Jinhong Wang

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Adapting In-Domain Few-Shot Segmentation to New Domains Without Source Domain Retraining.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

MAviS: A Multimodal Conversational Assistant For Avian Species.

Yevheniia Kryklyvets

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Mihail Minkov Mihaylov

Chao Qin

Feno Heriniaina Rabevohitra

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Azril Hafizi Amirudin

Muhammad Ridzuan

Daniya Najiha Abdul Kareem

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

Nathan Augusto Zacarias Xavier

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM.

Sambal Shikhar

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Understanding Whitening Loss in Self-Supervised Learning.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

BEVRefiner: Improving 3D Object Detection in Bird's-Eye-View via Dual Refinement.

IEEE Trans. Intell. Transp. Syst., October, 2024

Robust Perception and Precise Segmentation for Scribble-Supervised RGB-D Saliency Detection.

IEEE Trans. Pattern Anal. Mach. Intell., January, 2024

Remote Sensing Change Detection With Transformers Trained From Scratch.

IEEE Trans. Geosci. Remote. Sens., 2024

Multi-query and multi-level enhanced network for semantic segmentation.

Pattern Recognit., 2024

Guided-attention and gated-aggregation network for medical image segmentation.

Pattern Recognit., 2024

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages.

Henok Biadglign Ademtew

Feno Heriniaina Rabevohitra

Mike Zhang

Mahardika Krisna Ihsani

Fadillah Adamsyah Maani

Amirpouya Ghasemaghaei

Johan S. Obando-Ceron

CoRR, 2024

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark.

Sara Ghaboura

Ahmed Heakl

Ali Husain Salem Abdulla Alharthi

CoRR, 2024

CDChat: A Large Multimodal Model for Remote Sensing Change Description.

CoRR, 2024

BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases.

Muhammad Awais

Mehaboobathunnisa Sahul Hameed

Bidisha Bhattacharya

Orly Reiner

CoRR, 2024

Multi-modal Generation via Cross-Modal In-Context Learning.

CoRR, 2024

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT.

CoRR, 2024

DB-SAM: Delving into High Quality Universal Medical Image Segmentation.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

BAPLe: Backdoor Attacks on Medical Foundational Models Using Prompt Learning.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024

Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Bidirectional Reciprocative Information Communication for Few-Shot Semantic Segmentation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Modulate Your Spectrum in Self-Supervised Learning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

BiMediX: Bilingual Medical Mixture of Experts LLM.

Sara Pieri

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

CONDA: Condensed Deep Association Learning for Co-salient Object Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning.

Proceedings of the Computer Vision - ECCV 2024, 2024

Continual Learning and Unknown Object Discovery in 3D Scenes via Self-distillation.

Proceedings of the Computer Vision - ECCV 2024, 2024

Composed Video Retrieval via Enriched Context and Discriminative Embeddings.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GLaMM: Pixel Grounding Large Multimodal Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models.

Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, 2024

Semi-supervised Open-World Object Detection.

Abhishek Singh Gehlot

Hisham Cholakkal

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Transformers in Remote Sensing: A Survey.

Abdulaziz Amer Aleissaee

Remote. Sens., April, 2023

SipMaskv2: Enhanced Fast Image and Video Instance Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Foundational Models Defining a New Era in Vision: A Survey and Outlook.

CoRR, 2023

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

CoRR, 2023

DFormer: Diffusion-guided Transformer for Universal Image Segmentation.

CoRR, 2023

LEAPS: End-to-End One-Step Person Search With Learnable Proposals.

CoRR, 2023

SAT: Scale-Augmented Transformer for Person Search.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Surface-Biased Multi-Level Context 3D Object Detection.

Sultan Abu Ghazal

Jean Lahoud

Proceedings of the 18th International Joint Conference on Computer Vision, 2023

3D Indoor Instance Segmentation in an Open-World.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

A Spatial-Temporal Deformable Attention Based Framework for Breast Lesion Detection in Videos.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Cross-Modulated Few-Shot Image Generation for Colorectal Tissue Classification.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Generative Multiplane Neural Radiance for 3D-Aware Image Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Arabic Mini-ClimateGPT: A Climate Change and Sustainability Tailored Arabic LLM.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Person Image Synthesis via Denoising Diffusion Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PSM-PS: Part-Based Signal Modulation for Person Search.

Reem Abdalla Sharif

Mustansar Fiaz

Proceedings of the Computer Analysis of Images and Patterns, 2023

SA2-Net: Scale-aware Attention Network for Microscopic Image Segmentation.

Proceedings of the 34th British Machine Vision Conference 2023, 2023

2022

Multi-scale Feature Aggregation for Crowd Counting.

CoRR, 2022

3D Vision with Transformers: A Survey.

CoRR, 2022

An Investigation into Whitening Loss for Self-supervised Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection.

Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

On the Robustness of 3D Object Detectors.

Proceedings of the 4th ACM International Conference on Multimedia in Asia, 2022

Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Class-Agnostic Object Detection with Multi-modal Transformer.

Vineeth N. Balasubramanian

Proceedings of the Computer Vision - ECCV 2022, 2022

DoodleFormer: Creative Sketch Drawing with Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

Spatio-temporal Relation Modeling for Few-shot Action Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Energy-based Latent Aligner for Incremental Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

PSTR: End-to-End One-Step Person Search With Transformers.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

PS-ARM: An End-to-End Attention-Aware Relation Mixer Network for Person Search.

Proceedings of the Computer Vision - ACCV 2022, 2022

2021

Mask-Guided Attention Network and Occlusion-Sensitive Hard Example Mining for Occluded Pedestrian Detection.

IEEE Trans. Image Process., 2021

Compact Deep Color Features for Remote Sensing Scene Classification.

Jorma Laaksonen

Neural Process. Lett., 2021

Multi-modal Transformers Excel at Class-agnostic Object Detection.

Vineeth N. Balasubramanian

CoRR, 2021

PSC-Net: learning part spatial co-occurrence for occluded pedestrian detection.

Sci. China Inf. Sci., 2021

Handwriting Transformers.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains.

Ling Shao

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020

PSC-Net: Learning Part Spatial Co-occurence for Occluded Pedestrian Detection.

CoRR, 2020

Count- and Similarity-Aware R-CNN for Pedestrian Detection.

Proceedings of the Computer Vision - ECCV 2020, 2020

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation.

Proceedings of the Computer Vision - ECCV 2020, 2020

D2Det: Towards High Quality Object Detection and Instance Segmentation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Deep Contextual Attention for Human-Object Interaction Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learning Rich Features at High-Speed for Single-Shot Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Mask-Guided Attention Network for Occluded Pedestrian Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Enriched Feature Guided Refinement Network for Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Efficient Featurized Image Pyramid Network for Single Shot Detector.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multi-stream Convolutional Networks for Indoor Scene Recognition.

Proceedings of the Computer Analysis of Images and Patterns, 2019

2018

Scale coding bag of deep features for human attribute and action recognition.

Mach. Vis. Appl., 2018

Bottom-Up Attention Guidance for Recurrent Image Recognition.

Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Two-Stream Part-Based Deep Representation for Human Attribute Recognition.

Jorma Laaksonen

Proceedings of the 2018 International Conference on Biometrics, 2018

2017

Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification.

CoRR, 2017

Top-Down Deep Appearance Attention for Action Recognition.

Proceedings of the Image Analysis - 20th Scandinavian Conference, 2017

TEX-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition.

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, 2017

2016

Combining Holistic and Part-based Deep Representations for Computational Painting Categorization.

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

2015

Recognizing Actions Through Action-Specific Person Detection.

IEEE Trans. Image Process., 2015

Compact color-texture description for texture classification.

Pattern Recognit. Lett., 2015

PicSOM Experiments in TRECVID 2015.

Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

Deep Semantic Pyramids for Human Attributes and Action Recognition.

Proceedings of the Image Analysis - 19th Scandinavian Conference, 2015

2014

Semantic Pyramids for Gender and Action Recognition.

IEEE Trans. Image Process., 2014

PicSOM Experiments in TRECVID 2014.

Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

2013

Color for Object Detection and Action Recognition.

PhD thesis, 2013

Coloring Action Recognition in Still Images.

Int. J. Comput. Vis., 2013

2012

Color attributes for object detection.

Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011

Opponent Colors for Human Detection.

David Vázquez

Antonio M. López

Proceedings of the Pattern Recognition and Image Analysis - 5th Iberian Conference, 2011

Color Contribution to Part-Based Person Detection in Different Types of Scenarios.
