Xiaoyong Wei

Orcid: 0000-0002-5706-5177

According to our database, Xiaoyong Wei authored at least 62 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Towards Bridged Vision and Language: Learning Cross-Modal Knowledge Representation for Relation Extraction.
IEEE Trans. Circuits Syst. Video Technol., January, 2024

Design and Compensation Control of Modular Variable Stiffness Continuum Manipulator for Nasal Surgery.
IEEE Trans. Instrum. Meas., 2024

A Survey on Personalized Content Synthesis with Diffusion Models.
CoRR, 2024

Generative Active Learning for Image Synthesis Personalization.
CoRR, 2024

A Picture Is Worth a Graph: Blueprint Debate on Graph for Multimodal Reasoning.
CoRR, 2024

Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue.
CoRR, 2024

FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients.
Proceedings of the 22nd Annual International Conference on Mobile Systems, 2024

Compositional Inversion for Stable Diffusion Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Entity-Graph Enhanced Cross-Modal Pretraining for Instance-Level Product Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey.
Mach. Intell. Res., August, 2023

DRAKE: Deep Pair-Wise Relation Alignment for Knowledge-Enhanced Multimodal Scene Graph Generation in Social Media Posts.
IEEE Trans. Circuits Syst. Video Technol., July, 2023

Region Attentive Action Unit Intensity Estimation With Uncertainty Weighted Multi-Task Learning.
IEEE Trans. Affect. Comput., 2023

Untargeted Black-box Attacks for Social Recommendations.
CoRR, 2023

Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective.
CoRR, 2023

UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning.
CoRR, 2023

Open-Scenario Domain Adaptive Object Detection in Autonomous Driving.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Rethinking Multimodal Entity and Relation Extraction from a Translation Point of View.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

A multi-scale multi-attention network for dynamic facial expression recognition.
Multim. Syst., 2022

Deep learning-based person re-identification methods: A survey and outlook of recent works.
Image Vis. Comput., 2022

Indicative Image Retrieval: Turning Blackbox Learning into Grey.
CoRR, 2022

Conceptor Learning for Class Activation Mapping.
CoRR, 2022

Identifying the kind behind SMILES - anatomical therapeutic chemical classification using structure-only representations.
Briefings Bioinform., 2022

M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Deep Collocative Learning for Immunofixation Electrophoresis Image Analysis.
IEEE Trans. Medical Imaging, 2021

Analysis of a novel manipulator with low melting point alloy initiated stiffness variation and shape detection for minimally invasive surgery.
Ind. Robot, 2021

Multi-continuum manipulators shape reconstruction using inertial navigation sensors and cameras.
Ind. Robot, 2021

Deep learning-based person re-identification methods: A survey and outlook of recent works.
CoRR, 2021

Global-Local Dynamic Feature Alignment Network for Person Re-Identification.
CoRR, 2021

M5Product: A Multi-modal Pretraining Benchmark for E-commercial Product Downstream Tasks.
CoRR, 2021

MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph.
Briefings Bioinform., 2021

A Piezoelectric Tactile Sensor for Tissue Stiffness Detection with Arbitrary Contact Angle.
Sensors, 2020

Editorial for the ICMR 2019 special issue.
Int. J. Multim. Inf. Retr., 2020

Exploring Entity-Level Spatial Relationships for Image-Text Matching.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Multi-View Weighted Feature Fusion Using CNN for Pneumonia Detection on Chest X-Rays.
Proceedings of the 22nd IEEE International Conference on E-health Networking, 2020

Palpation-Based Multi-Tumor Detection Method Considering Moving Distance for Robot-assisted Minimally Invasive Surgery.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

Development of a variable-stiffness and shape-detection manipulator based on low-melting-point-alloy for minimally invasive surgery.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

ParNet: Position-aware Aggregated Relation Network for Image-Text matching.
CoRR, 2019

A Piezoelectric Tactile Sensor and Human-inspired Tactile Exploration Strategy for Lump Palpation in Tele-operative Robotic Minimally Invasive Surgery.
Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics, 2019

Healthism@MediaEval 2019 - Insights for Wellbeing Task: Factors Related to Subjective and Objective Health.
Proceedings of the Working Notes Proceedings of the MediaEval 2019 Workshop, 2019

Attention on Attention for Image Captioning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Contextual Noise Reduction for Domain Adaptive Near-Duplicate Retrieval on Merchandize Images.
IEEE Trans. Image Process., 2017

Visual Typo Correction by Collocative Optimization: A Case Study on Merchandize Images.
IEEE Trans. Image Process., 2014

Collaborative error reduction for hierarchical classification.
Comput. Vis. Image Underst., 2014

Coaching the Exploration and Exploitation in Active Learning for Interactive Video Retrieval.
IEEE Trans. Image Process., 2013

Free-gram phrase identification for modeling Chinese text.
Inf. Process. Lett., 2013

Error recovered hierarchical classification.
Proceedings of the ACM Multimedia Conference, 2013

Mining in-class social networks for large-scale pedagogical analysis.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Concept-Driven Multi-Modality Fusion for Video Search.
IEEE Trans. Circuits Syst. Video Technol., 2011

Coached active learning for interactive video search.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

VIREO at TRECVID 2010: Semantic Indexing, Known-Item Search, and Content-Based Copy Detection.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

VIREO/DVMM at TRECVID 2009: High-Level Feature Extraction, Automatic Video Search, and Content-Based Copy Detection.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

Exploring inter-concept relationship with context space for semantic video indexing.
Proceedings of the 8th ACM International Conference on Image and Video Retrieval, 2009

Selection of Concept Detectors for Video Search by Ontology-Enriched Semantic Spaces.
IEEE Trans. Multim., 2008

Beyond Semantic Search: What You Observe May Not Be What You Think.
Proceedings of the TRECVID 2008 workshop participants notebook papers, 2008

Fusing semantics, observability, reliability and diversity of concept detectors for video search.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

Experimenting VIREO-374: Bag-of-Visual-Words and Visual-Based Ontology for Semantic Video Indexing and search.
Proceedings of the TRECVID 2007 workshop participants notebook papers, 2007

Ontology-enriched semantic space for video search.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Modeling Local Interest Points for Semantic Detection and Video Search at TRECVID 2006.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

Hierarchical Hidden Markov Model for Rushes Structuring and Indexing.
Proceedings of the Image and Video Retrieval, 5th International Conference, 2006

Motion Driven Approaches to Shot Boundary Detection, Low-Level Feature Extraction and BBC Rushes Characterization at TRECVID 2005.
Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

Authorization Based on Palmprint.
Proceedings of the Advances in Intelligent Computing, 2005

Multibiometrics Based on Palmprint and Handgeometry.
Proceedings of the 4th Annual ACIS International Conference on Computer and Information Science (ICIS 2005), 2005