Xiaoshan Yang

Orcid: 0000-0001-5453-9755

According to our database, Xiaoshan Yang authored at least 80 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CLIP-VG: Self-Paced Curriculum Adapting of CLIP for Visual Grounding.
IEEE Trans. Multim., 2024

Recovering Generalization via Pre-Training-Like Knowledge Distillation for Out-of-Distribution Visual Question Answering.
IEEE Trans. Multim., 2024

SgVA-CLIP: Semantic-Guided Visual Adapting of Vision-Language Models for Few-Shot Image Classification.
IEEE Trans. Multim., 2024

ICT-empowered rural e-commerce development in China: an adaptive structuration perspective.
Int. J. Technol. Manag., 2024

Part-Aware Prompt Tuning for Weakly Supervised Referring Expression Grounding.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

2023
Postpartum pelvic organ prolapse assessment via adversarial feature complementation in heterogeneous data.
Neural Comput. Appl., July, 2023

Self-supervised Calorie-aware Heterogeneous Graph Networks for Food Recommendation.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

Dual Scene Graph Convolutional Network for Motivation Prediction.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Multi-Source Knowledge Reasoning Graph Network for Multi-Modal Commonsense Inference.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Counterfactual Scenario-relevant Knowledge-enriched Multi-modal Emotion Reasoning.
ACM Trans. Multim. Comput. Commun. Appl., 2023

Many Hands Make Light Work: Transferring Knowledge From Auxiliary Tasks for Video-Text Retrieval.
IEEE Trans. Multim., 2023

Zero-Shot Predicate Prediction for Scene Graph Parsing.
IEEE Trans. Multim., 2023

Category Knowledge-Guided Parameter Calibration for Few-Shot Object Detection.
IEEE Trans. Image Process., 2023

Towards a multimodal human activity dataset for healthcare.
Multim. Syst., 2023

Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection.
CoRR, 2023

Multi-modal Queried Object Detection in the Wild.
CoRR, 2023

CLIP-VG: Self-paced Curriculum Adapting of CLIP via Exploiting Pseudo-Language Labels for Visual Grounding.
CoRR, 2023

Multi-modal Queried Object Detection in the Wild.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Health-Oriented Multimodal Food Question Answering.
Proceedings of the MultiMedia Modeling - 29th International Conference, 2023

mPLUG-Octopus: The Versatile Assistant Empowered by A Modularized End-to-End Multimodal LLM.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Client-Adaptive Cross-Model Reconstruction Network for Modality-Incomplete Multimodal Federated Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Iterative Learning with Extra and Inner Knowledge for Long-tail Dynamic Scene Graph Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Fine-grained Primitive Representation Learning for Compositional Zero-shot Classification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Learning Hierarchical Video Graph Networks for One-Stop Video Delivery.
ACM Trans. Multim. Comput. Commun. Appl., 2022

The Model May Fit You: User-Generalized Cross-Modal Retrieval.
IEEE Trans. Multim., 2022

Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition.
IEEE Trans. Multim., 2022

A unified framework for multi-modal federated learning.
Neurocomputing, 2022

SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification.
CoRR, 2022

Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Dynamic Scene Graph Generation via Anticipatory Pre-training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross-Modal Federated Human Activity Recognition via Modality-Agnostic and Modality-Specific Representation Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Health Status Prediction with Local-Global Heterogeneous Behavior Graph.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Knowledge-driven Egocentric Multimodal Activity Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Learning Coarse-to-Fine Graph Neural Networks for Video-Text Retrieval.
IEEE Trans. Multim., 2021

Emotion Knowledge Driven Video Highlight Detection.
IEEE Trans. Multim., 2021

Unsupervised Video Summarization via Relation-Aware Assignment Learning.
IEEE Trans. Multim., 2021

Dynamic Hypergraph Convolutional Networks for Skeleton-Based Action Recognition.
CoRR, 2021

Few-shot Egocentric Multimodal Activity Recognition.
Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

Zero-shot Video Emotion Recognition via Multimodal Protagonist-aware Transformer Network.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multimodal Global Relation Knowledge Distillation for Egocentric Action Anticipation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Few-shot Learning for Multi-Modality Tasks.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-Shot Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Cross-domain personalized image captioning.
Multim. Tools Appl., 2020

Asymmetric multi-stage CNNs for small-scale pedestrian detection.
Neurocomputing, 2020

Discriminative multimodal embedding for event classification.
Neurocomputing, 2020

Data-driven Image Restoration with Option-driven Learning for Big and Small Astronomical Image Datasets.
CoRR, 2020

Multi-hop Interactive Cross-Modal Retrieval.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Structured Neural Motifs: Scene Graph Parsing via Enhanced Context.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Find Objects and Focus on Highlights: Mining Object Semantics for Video Highlight Detection via Graph Neural Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Image Captioning by Asking Questions.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data.
Proceedings of the PRICAI 2019: Trends in Artificial Intelligence, 2019

Multimodal Attribute and Feature Embedding for Activity Recognition.
Proceedings of the MMAsia '19: ACM Multimedia Asia, Beijing, China, December 16-18, 2019

Biomedia ACM MM Grand Challenge 2019: Using Data Enhancement to Solve Sample Unbalance.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Exploring Feature Representation and Training Strategies in Temporal Action Localization.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

2018
A New Incentive Policy for Improving Data Service in P2P Networks.
Wirel. Pers. Commun., 2018

Text2Video: An End-to-end Learning Framework for Expressing Text With Videos.
IEEE Trans. Multim., 2018

Deep-Structured Event Modeling for User-Generated Photos.
IEEE Trans. Multim., 2018

Three-Dimensional Attention-Based Deep Ranking Model for Video Highlight Detection.
IEEE Trans. Multim., 2018

P2T: Part-to-Target Tracking via Deep Regression Learning.
IEEE Trans. Image Process., 2018

A Unified Framework for Multimodal Domain Adaptation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

A Standalone Demo for Quiz Game "Describe and Guess".
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

Attribute-Assisted Domain Transfer from Image to Sketch.
Proceedings of the IEEE 1st Conference on Multimedia Information Processing and Retrieval, 2018

2017
Deep Relative Tracking.
IEEE Trans. Image Process., 2017

Video Highlight Detection via Deep Ranking Modeling.
Proceedings of the Image and Video Technology - 8th Pacific-Rim Symposium, 2017

Research on Evaluation Method of Big Data Storage Utilization.
Proceedings of the 4th Intl Conf on Applied Computing and Information Technology/3rd Intl Conf on Computational Science/Intelligence and Applied Informatics/1st Intl Conf on Big Data, 2017

Research on endurance evaluation for NAND flash-based solid state drive.
Proceedings of the 16th IEEE/ACIS International Conference on Computer and Information Science, 2017

2016
Semantic Feature Mining for Video Event Understanding.
ACM Trans. Multim. Comput. Commun. Appl., 2016

Deep Relative Attributes.
IEEE Trans. Multim., 2016

Abnormal Event Discovery in User Generated Photos.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

2015
Boosted Multifeature Learning for Cross-Domain Transfer.
ACM Trans. Multim. Comput. Commun. Appl., 2015

Automatic Visual Concept Learning for Social Event Understanding.
IEEE Trans. Multim., 2015

Cross-Domain Feature Learning in Multimedia.
IEEE Trans. Multim., 2015

A new discriminative coding method for image classification.
Multim. Syst., 2015

2013
Intrinsic Image Decomposition Using Optimization and User Scribbles.
IEEE Trans. Cybern., 2013

Graph-Guided Fusion Penalty Based Sparse Coding for Image Classification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

Locality discriminative coding for image classification.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

2011
Intrinsic images using optimization.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

