Xiangyu Zhao

This is a disambiguation page: it lists papers by multiple people with the same or a similar name.

Bibliography

2026
MG-LLaVA: Toward Multi-Granularity Visual Instruction Tuning.
IEEE Trans. Circuits Syst. Video Technol., April, 2026

From Local Indices to Global Identifiers: Generative Reranking for Recommender Systems via Global Action Space.
CoRR, April, 2026

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing.
CoRR, March, 2026

Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation.
CoRR, March, 2026

EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation.
CoRR, March, 2026

RC-NF: Robot-Conditioned Normalizing Flow for Real-Time Anomaly Detection in Robotic Manipulation.
CoRR, March, 2026

Fairness Begins with State: Purifying Latent Preferences for Hierarchical Reinforcement Learning in Interactive Recommendation.
CoRR, March, 2026

When Safety Collides: Resolving Multi-Category Harmful Conflicts in Text-to-Image Diffusion via Adaptive Safety Guidance.
CoRR, February, 2026

RISE-Video: Can Video Generators Decode Implicit World Rules?
CoRR, February, 2026

PsychEthicsBench: Evaluating Large Language Models Against Australian Mental Health Ethics.
CoRR, January, 2026

Multi-level functional network-based PD identification via graph deep learning.
PeerJ Comput. Sci., 2026

scDIAGRAM: detecting chromatin compartments from individual single-cell Hi-C matrix without imputation or reference features.
Briefings Bioinform., 2026

CHiRPE: A Step Towards Real-World Clinical NLP with Clinician-Oriented Model Explanations.
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

2025
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning.
CoRR, December, 2025

It Hears, It Sees too: Multi-Modal LLM for Depression Detection By Integrating Visual Understanding into Audio Language Models.
CoRR, November, 2025

Efficient Reasoning via Reward Model.
CoRR, November, 2025

ReAL: Improving Image-Text Retrieval with Authentic Negative Repository Learning.
ACM Trans. Multim. Comput. Commun. Appl., October, 2025

MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites.
CoRR, October, 2025

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization.
CoRR, October, 2025

Autonomous collaborative rescue of drone swarms in cluttered environments.
Clust. Comput., October, 2025

GenExam: A Multidisciplinary Text-to-Image Exam.
CoRR, September, 2025

GLEAM: Learning to Match and Explain in Cross-View Geo-Localization.
CoRR, September, 2025

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers.
CoRR, August, 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency.
CoRR, August, 2025

MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents.
CoRR, July, 2025

GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning.
CoRR, June, 2025

Elicit and Enhance: Advancing Multimodal Reasoning in Medical Scenarios.
CoRR, May, 2025

MSEarth: A Benchmark for Multimodal Scientific Comprehension of Earth Science.
CoRR, May, 2025

EarthSE: A Benchmark for Evaluating Earth Scientific Exploration Capability of LLMs.
CoRR, May, 2025

Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions.
CoRR, May, 2025

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing.
CoRR, April, 2025

MetaLoRA: Tensor-Enhanced Adaptive Low-Rank Fine-tuning.
CoRR, April, 2025

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM.
CoRR, March, 2025

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning.
CoRR, March, 2025

SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair.
CoRR, March, 2025

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference.
CoRR, February, 2025

FSGformer: Frequency Separation and Guidance Transformer for Pansharpening.
IEEE Trans. Geosci. Remote. Sens., 2025

Large multimodal models evaluation: a survey.
Sci. China Inf. Sci., 2025

Two-Dimensional Coupling-Enhanced Cubic Hyperchaotic Map with Exponential Parameters: Construction, Analysis, and Application in Hierarchical Significance-Aware Multi-Image Encryption.
Axioms, 2025

Research and Implementation of Secure Dual Algorithm-Based Transmission System for Sensitive Data of Large Models in an Open Environment.
IEEE Access, 2025

Discrete Event Simulation-Driven Method Solving Permutation Flowshop Scheduling Problem in Digital Twins.
IEEE Access, 2025

A Multi-Step-Release Power Management Circuit with Small-Volume Inductors for Triboelectric-Powered IoT Nodes.
Proceedings of the 11th IEEE World Forum on Internet of Things, 2025

A High-Accuracy Wireless Peak Detection Module for Triboelectric Nanogenerators Power Management.
Proceedings of the 11th IEEE World Forum on Internet of Things, 2025

GOBench: Benchmarking Geometric Optics Generation and Understanding of MLLMs.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

GEMeX-RMCoT: An Enhanced Med-VQA Dataset for Region-Aware Multimodal Chain-of-Thought Reasoning.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

MeloDance: Dance Generation Guided by Music Structure and Emotion.
Proceedings of the 2025 International Conference on Multimedia Retrieval, 2025

Twice the Gradient, Twice the Privacy Risk in Federated Learning? A Case Study of Federated Recommendation Systems.
Proceedings of the International Joint Conference on Neural Networks, 2025

GEST: Dual Structured Exploration with Graph ODE for Spatio-Temporal Dynamic System Modeling.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

GenieBlue: Integrating Both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MM-IFEngine: Towards Multimodal Instruction Following.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

DiffSR: Learning Radar Reflectivity Synthesis via Diffusion Model from Satellite Observations.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Redundancy Principles for MLLMs Benchmarks.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Effective Backdoor Attack on Graph Neural Networks in Spectral Domain.
IEEE Internet Things J., April, 2024

Atrial Fibrillation Detection with Single-Lead Electrocardiogram Based on Temporal Convolutional Network-ResNet.
Sensors, January, 2024

Regression-enhanced Entrotaxis as an autonomous search algorithm for seeking an unknown gas leakage source.
Expert Syst. Appl., 2024

Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting.
CoRR, 2024

MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning.
CoRR, 2024

Enhancing Real-World Complex Network Representations with Hyperedge Augmentation.
CoRR, 2024

An Open and Comprehensive Pipeline for Unified Object Grounding and Detection.
CoRR, 2024

Transferable Watermarking to Self-supervised Pre-trained Graph Encoders by Trigger Embeddings.
Proceedings of the IEEE International Workshop on Information Forensics and Security, 2024

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Enhancing Node Representations for Real-World Complex Networks with Topological Augmentation.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

MGVul: a Multi-Granularity Detection Framework for Software Vulnerability.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

ET-SSM: Linear-Time Encrypted Traffic Classification Method Based On Structured State Space Model.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Excess Emission Sources Identification via Sparse Monitoring Networks.
IEEE Trans. Ind. Informatics, October, 2023

Multilayer Feature Boosting Framework for Pipeline Inspection Using an Intelligent Pig System.
IEEE Trans. Ind. Informatics, July, 2023

The repertoire of copy number alteration signatures in human cancer.
Briefings Bioinform., March, 2023

Making Multimodal Generation Easier: When Diffusion Models Meet LLMs.
CoRR, 2023

Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs.
CoRR, 2023

Task-Agnostic Graph Neural Network Evaluation via Adversarial Collaboration.
CoRR, 2023

Matching-to-Detecting: Establishing Dense and Reliable Correspondences Between Images.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Will More Expressive Graph Neural Networks Do Better on Generative Tasks?
Proceedings of the Learning on Graphs Conference, 27-30 November 2023, Virtual Event, 2023

2022
Wavelet-Attention CNN for image classification.
Multim. Syst., 2022

"Are you okay, honey?": Recognizing Emotions among Couples Managing Diabetes in Daily Life using Multimodal Real-World Smartwatch Data.
CoRR, 2022

Building a 3-Player Mahjong AI using Deep Reinforcement Learning.
CoRR, 2022

Wavelet-Attention CNN for Image Classification.
CoRR, 2022

Intelligent UAV Equipped with Infrared Equipment Applied in Measuring Zero Insulatorat in Different Temperatures.
Proceedings of the 7th International Conference on Cyber Security and Information Engineering, 2022

Improving Dialogue Generation via Proactively Querying Grounded Knowledge.
Proceedings of the IEEE International Conference on Acoustics, 2022

Towards a Competitive 3-Player Mahjong AI using Deep Reinforcement Learning.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022

2021
Sampling and Comparator Speed-Enhancement Techniques for Near-Threshold SAR ADCs.
IEEE Open J. Circuits Syst., 2021

Structural Watermarking to Deep Neural Networks via Network Channel Pruning.
Proceedings of the IEEE International Workshop on Information Forensics and Security, 2021

A Difficulty-Aware Framework for Churn Prediction and Intervention in Games.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Watermarking Graph Neural Networks by Random Graphs.
Proceedings of the 9th International Symposium on Digital Forensics and Security, 2021

2020
Pulmonary nodule detection on chest radiographs using balanced convolutional neural network and classic candidate detection.
Artif. Intell. Medicine, 2020

Multiple Knowledge Syncretic Transformer for Natural Dialogue Generation.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020

Hierarchical Interactive Matching Network for Multi-turn Response Selection in Retrieval-Based Chatbots.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

2019
A Gaussian process mixture model-based hard-cut iterative learning algorithm for air quality prediction.
Appl. Soft Comput., 2019

Static-Dynamically Attentive Variational Network for Dialogue Generation.
Aust. J. Intell. Inf. Process. Syst., 2019

A Loss-Compensated 5-Bit Ka-Band Digital Phase Shifter with Low RMS Phase/Gain Error Over Wide Temperature Ranges.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

A Semi-Supervised Stable Variational Network for Promoting Replier-Consistency in Dialogue Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Research and Design of Home Care System of Internet of Things Based on Wireless Network.
Proceedings of the Communications, Signal Processing, and Systems, 2019

2018
Multiobjective Optimization of a Hollow Plunger Type Solenoid for High Speed On/Off Valve.
IEEE Trans. Ind. Electron., 2018

Adaptive CPG-Based Impedance Control for Assistive Lower Limb Exoskeleton.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2018

Coarse Label Refined Knowledge Reasoning for Fine-Grained Visual Categorization.
Proceedings of the Intelligence Science and Big Data Engineering, 2018

2017
Multi-Objective Optimal Design of a Toroidally Wound Radial-Flux Halbach Permanent Magnet Array Limited Angle Torque Motor.
IEEE Trans. Ind. Electron., 2017

DeepGait: A Learning Deep Convolutional Representation for Gait Recognition.
Proceedings of the Biometric Recognition - 12th Chinese Conference, 2017

Study on super-resolution reconstruction algorithm based on sparse representation and dictionary learning for remote sensing image.
Proceedings of the 10th International Congress on Image and Signal Processing, 2017

2016
A Coupled User Clustering Algorithm for Web-based Learning Systems.
Proceedings of the 9th International Conference on Educational Data Mining, 2016

2015
Acoustic Source Localization with Distributed Smartphone Arrays.
Proceedings of the 2015 IEEE Global Communications Conference, 2015

2014
Identifying effective multiple spreaders by coloring complex networks.
CoRR, 2014

Considering Rating as Probability Distribution of Attitude in Recommender System.
Proceedings of the Web-Age Information Management, 2014

A Personalized User Evaluation Model for Web-Based Learning Systems.
Proceedings of the 26th IEEE International Conference on Tools with Artificial Intelligence, 2014

2009
A Method of Gear Fault Detection Based on Wavelet Transform.
Proceedings of the Business Intelligence: Artificial Intelligence in Business, 2009

2005
Linguistic Model for the Controlled Object.
Proceedings of the Fuzzy Systems and Knowledge Discovery, Second International Conference, 2005

