Yu Wang

Orcid: 0000-0001-9500-081X

Affiliations:
  • Shanghai Jiao Tong University, Cooperative Medianet Innovation Center, China
  • University of Cambridge, Department of Engineering, UK
  • Imperial College London, Speech and Audio Processing Group, UK (PhD 2015)


According to our database, Yu Wang authored at least 83 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
HeteroRAG: A Heterogeneous Retrieval-Augmented Generation Framework for Medical Vision Language Tasks.
CoRR, August, 2025

Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning.
CoRR, May, 2025

RARE: Retrieval-Augmented Reasoning Modeling.
CoRR, March, 2025

DSVD: Dynamic Self-Verify Decoding for Faithful Generation in Large Language Models.
CoRR, March, 2025

Strategic priorities for transformative progress in advancing biology with proteomics and artificial intelligence.
CoRR, February, 2025

MedS³: Towards Medical Small Language Models with Self-Evolved Slow Thinking.
CoRR, January, 2025

Redundancy-Adaptive Multimodal Learning for imperfect data.
Neural Networks, 2025

Fine-tuning with Reserved Majority for Noise Reduction.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

AuscMLLM: Bridging Classification and Reasoning in Heart Sound Analysis with a Multimodal Large Language Model.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

EvolveBench: A Comprehensive Benchmark for Assessing Temporal Awareness in LLMs on Evolving Knowledge.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Towards Omni-RAG: Comprehensive Retrieval-Augmented Generation for Large Language Models in Medical Applications.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Leveraging Diverse Modeling Contexts With Collaborating Learning for Neural Machine Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

DialogMCF: Multimodal Context Flow for Audio Visual Scene-Aware Dialog.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal.
CoRR, 2024

AuscultaBase: A Foundational Step Towards AI-Powered Body Sound Diagnostics.
CoRR, 2024

Med-PMC: Medical Personalized Multi-modal Consultation with a Proactive Ask-First-Observe-Next Paradigm.
CoRR, 2024

HSDreport: Heart Sound Diagnosis with Echocardiography Reports.
CoRR, 2024

Decoding Linguistic Representations of Human Brain.
CoRR, 2024

Reconstruct the Pruned Model without Any Retraining.
CoRR, 2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation.
CoRR, 2024

MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts.
CoRR, 2024

M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
CoRR, 2024

Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator.
CoRR, 2024

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview.
CoRR, 2024

M2K-VDG: Model-Adaptive Multimodal Knowledge Anchor Enhanced Video-grounded Dialogue Generation.
CoRR, 2024

Self-supervised learning on millions of primary RNA sequences from 72 vertebrates improves sequence-based RNA splicing prediction.
Briefings Bioinform., 2024

Annotation-free Audio-Visual Segmentation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

TAIA: Large Language Models are Out-of-Distribution Data Learners.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MSG-BART: Multi-Granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-Grounded Dialogue Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

RA2FD: Distilling Faithfulness into Efficient Dialogue Systems.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

CE-VDG: Counterfactual Entropy-based Bias Reduction for Video-grounded Dialogue Generation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

GP-nano: a geometric graph network for nanobody polyreactivity prediction.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2024

SDA: Semantic Discrepancy Alignment for Text-conditioned Image Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

DictLLM: Harnessing Key-Value Data Structures with Large Language Models for Enhanced Medical Diagnostics.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Self-Supervised Masking for Unsupervised Anomaly Detection and Localization.
IEEE Trans. Multim., 2023

Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning.
CoRR, 2023

An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models.
CoRR, 2023

LibriSQA: Advancing Free-form and Open-ended Spoken Question Answering with a Novel Dataset and Framework.
CoRR, 2023

Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation.
CoRR, 2023

SelfEvolve: A Code Evolution Framework via Large Language Models.
CoRR, 2023

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery.
CoRR, 2023

Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition.
CoRR, 2023

Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Contrastive Learning Based ASR Robust Knowledge Selection For Spoken Dialogue System.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Self-Improvement of Non-autoregressive Model via Sequence-Level Distillation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Enhanced Multimodal Representation Learning with Cross-modal KD.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Alignment-free metal ion-binding site prediction from protein sequence through pretrained language model and multi-task learning.
Briefings Bioinform., 2022

Unsupervised Ensemble Distillation for Multi-Organ Segmentation.
Proceedings of the 19th IEEE International Symposium on Biomedical Imaging, 2022

Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

LAR-SR: A Local Autoregressive Model for Image Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Efficient Use of End-to-End Data in Spoken Language Processing.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Spoken Language 'Grammatical Error Correction'.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Non-Native Children's Automatic Speech Recognition: The INTERSPEECH 2020 Shared Task ALTA Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019
General Sequence Teacher-Student Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Non-native Speaker Verification for Spoken Language Assessment.
CoRR, 2019

Disfluency Detection for Spoken Learner English.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Impact of ASR Performance on Spoken Grammatical Error Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Non-Intrusive POLQA Estimation of Speech Quality using Recurrent Neural Networks.
Proceedings of the 27th European Signal Processing Conference, 2019

Learning Between Different Teacher and Student Models in ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Model-Based Speech Enhancement in the Modulation Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Towards automatic assessment of spontaneous spoken English.
Speech Commun., 2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks.
CoRR, 2018

Sequence Teacher-Student Training of Acoustic Models for Automatic Free Speaking Language Assessment.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Impact of ASR Performance on Free Speaking Language Assessment.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Future Word Contexts in Neural Network Language Models.
CoRR, 2017

An attention based model for off-topic spontaneous spoken response detection: An Initial Study.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Use of Graphemic Lexicons for Spoken Language Assessment.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
A data-driven non-intrusive measure of speech quality and intelligibility.
Speech Commun., 2016

Speech enhancement using an MMSE spectral amplitude estimator based on a modulation domain Kalman filter with a Gamma prior.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Off-topic Response Detection for Spontaneous Spoken English Assessment.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2014
Speech enhancement using a modulation domain Kalman filter post-processor with a Gaussian Mixture noise model.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Speech enhancement using a robust Kalman filter post-processor in the modulation domain.
Proceedings of the IEEE International Conference on Acoustics, 2013

A subspace method for speech enhancement in the modulation domain.
Proceedings of the 21st European Signal Processing Conference, 2013

2008
A Resource Management Mechanism and Its Implementation for Virtual Machines.
Proceedings of the Systems and Virtualization Management. Standards and New Technologies, 2008

