Xiaoshan Yang

Orcid: 0000-0001-5453-9755

According to our database, Xiaoshan Yang authored at least 85 papers between 2011 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Cross-Modal Federated Human Activity Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., August, 2024

CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding.

IEEE Trans. Multim., 2024

Recovering Generalization via Pre-Training-Like Knowledge Distillation for Out-of-Distribution Visual Question Answering.

IEEE Trans. Multim., 2024

SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models for Few-Shot Image Classification.

IEEE Trans. Multim., 2024

ICT-empowered rural e-commerce development in China: an adaptive structuration perspective.

Int. J. Technol. Manag., 2024

Self-supervised spatial-temporal feature enhancement for one-shot video object detection.

Neurocomputing, 2024

An open chest X-ray dataset with benchmarks for automatic radiology report generation in French.

Neurocomputing, 2024

A Comprehensive Review of Few-shot Action Recognition.

CoRR, 2024

HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.

CoRR, 2024

Part-Aware Prompt Tuning for Weakly Supervised Referring Expression Grounding.

Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

Libra: Building Decoupled Vision System on Large Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Postpartum pelvic organ prolapse assessment via adversarial feature complementation in heterogeneous data.

Neural Comput. Appl., July, 2023

Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation.

ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Dual Scene Graph Convolutional Network for Motivation Prediction.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Multi-Source Knowledge Reasoning Graph Network for Multi-Modal Commonsense Inference.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Counterfactual Scenario-relevant Knowledge-enriched Multi-modal Emotion Reasoning.

ACM Trans. Multim. Comput. Commun. Appl., 2023

Many Hands Make Light Work: Transferring Knowledge From Auxiliary Tasks for Video-Text Retrieval.

IEEE Trans. Multim., 2023

Zero-Shot Predicate Prediction for Scene Graph Parsing.

IEEE Trans. Multim., 2023

Category Knowledge-Guided Parameter Calibration for Few-Shot Object Detection.

IEEE Trans. Image Process., 2023

Towards a multimodal human activity dataset for healthcare.

Multim. Syst., 2023

Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection.

CoRR, 2023

CLIP-VG: Self-paced Curriculum Adapting of CLIP via Exploiting Pseudo-Language Labels for Visual Grounding.

CoRR, 2023

Multi-modal Queried Object Detection in the Wild.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Health-Oriented Multimodal Food Question Answering.

Proceedings of the MultiMedia Modeling - 29th International Conference, 2023

mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Client-Adaptive Cross-Model Reconstruction Network for Modality-Incomplete Multimodal Federated Learning.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Iterative Learning with Extra and Inner Knowledge for Long-tail Dynamic Scene Graph Generation.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Fine-grained Primitive Representation Learning for Compositional Zero-shot Classification.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Learning Hierarchical Video Graph Networks for One-Stop Video Delivery.

ACM Trans. Multim. Comput. Commun. Appl., 2022

The Model May Fit You: User-Generalized Cross-Modal Retrieval.

IEEE Trans. Multim., 2022

Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition.

IEEE Trans. Multim., 2022

A unified framework for multi-modal federated learning.

Neurocomputing, 2022

SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification.

CoRR, 2022

Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Dynamic Scene Graph Generation via Anticipatory Pre-training.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross-Modal Federated Human Activity Recognition via Modality-Agnostic and Modality-Specific Representation Learning.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Health Status Prediction with Local-Global Heterogeneous Behavior Graph.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Knowledge-driven Egocentric Multimodal Activity Recognition.

ACM Trans. Multim. Comput. Commun. Appl., 2021

Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval.

IEEE Trans. Multim., 2021

Emotion Knowledge Driven Video Highlight Detection.

IEEE Trans. Multim., 2021

Unsupervised Video Summarization via Relation-Aware Assignment Learning.

IEEE Trans. Multim., 2021

Dynamic Hypergraph Convolutional Networks for Skeleton-Based Action Recognition.

CoRR, 2021

Few-shot Egocentric Multimodal Activity Recognition.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Zero-shot Video Emotion Recognition via Multimodal Protagonist-aware Transformer Network.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multimodal Global Relation Knowledge Distillation for Egocentric Action Anticipation.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Few-shot Learning for Multi-Modality Tasks.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-Shot Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Cross-domain personalized image captioning.

Multim. Tools Appl., 2020

Asymmetric multi-stage CNNs for small-scale pedestrian detection.

Neurocomputing, 2020

Discriminative multimodal embedding for event classification.

Neurocomputing, 2020

Data-driven Image Restoration with Option-driven Learning for Big and Small Astronomical Image Datasets.

CoRR, 2020

Multi-hop Interactive Cross-Modal Retrieval.

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Structured Neural Motifs: Scene Graph Parsing via Enhanced Context.

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Image Captioning by Asking Questions.

ACM Trans. Multim. Comput. Commun. Appl., 2019

Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data.

Proceedings of the PRICAI 2019: Trends in Artificial Intelligence, 2019

Multimodal Attribute and Feature Embedding for Activity Recognition.

Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019

Biomedia ACM MM Grand Challenge 2019: Using Data Enhancement to Solve Sample Unbalance.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Exploring Feature Representation and Training Strategies in Temporal Action Localization.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

2018

A New Incentive Policy for Improving Data Service in P2P Networks.

Wirel. Pers. Commun., 2018

Text2Video: An End-to-end Learning Framework for Expressing Text With Videos.

IEEE Trans. Multim., 2018

Deep-Structured Event Modeling for User-Generated Photos.

IEEE Trans. Multim., 2018

Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection.

IEEE Trans. Multim., 2018

P2T: Part-to-Target Tracking via Deep Regression Learning.

IEEE Trans. Image Process., 2018

A Unified Framework for Multimodal Domain Adaptation.

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

A Standalone Demo for Quiz Game "Describe and Guess".

Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

Attribute-Assisted Domain Transfer from Image to Sketch.

Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

2017

Deep Relative Tracking.

IEEE Trans. Image Process., 2017

Video Highlight Detection via Deep Ranking Modeling.

Proceedings of the Image and Video Technology - 8th Pacific-Rim Symposium, 2017

Research on Evaluation Method of Big Data Storage Utilization.

Proceedings of the 4th Intl Conf on Applied Computing and Information Technology/3rd Intl Conf on Computational Science/Intelligence and Applied Informatics/1st Intl Conf on Big Data, 2017

Research on endurance evaluation for NAND flash-based solid state drive.

Proceedings of the 16th IEEE/ACIS International Conference on Computer and Information Science, 2017

2016

Semantic Feature Mining for Video Event Understanding.

ACM Trans. Multim. Comput. Commun. Appl., 2016

Deep Relative Attributes.

M. Shamim Hossain

IEEE Trans. Multim., 2016

Abnormal Event Discovery in User Generated Photos.

Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

2015

Boosted Multifeature Learning for Cross-Domain Transfer.

Ming-Hsuan Yang

ACM Trans. Multim. Comput. Commun. Appl., 2015

Automatic Visual Concept Learning for Social Event Understanding.

M. Shamim Hossain

IEEE Trans. Multim., 2015

Cross-Domain Feature Learning in Multimedia.

IEEE Trans. Multim., 2015

A new discriminative coding method for image classification.

Multim. Syst., 2015

2013

Intrinsic Image Decomposition Using Optimization and User Scribbles.

IEEE Trans. Cybern., 2013

Graph-Guided Fusion Penalty Based Sparse Coding for Image Classification.

Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

Locality discriminative coding for image classification.

Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

2011

Intrinsic images using optimization.

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011
