Yu Wang

Orcid: 0000-0001-9500-081X

Affiliations:
  • Shanghai Jiao Tong University, Cooperative Medianet Innovation Center, China
  • University of Cambridge, Department of Engineering, UK
  • Imperial College London, Speech and Audio Processing Group, UK (PhD 2015)


According to our database, Yu Wang authored at least 41 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
DialogMCF: Multimodal Context Flow for Audio Visual Scene-Aware Dialog.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Automatic Interactive Evaluation for Large Language Models with State Aware Patient Simulator.
CoRR, 2024

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview.
CoRR, 2024

2023
Self-Supervised Masking for Unsupervised Anomaly Detection and Localization.
IEEE Trans. Multim., 2023

An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models.
CoRR, 2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.
CoRR, 2023

LibriSQA: Advancing Free-form and Open-ended Spoken Question Answering with a Novel Dataset and Framework.
CoRR, 2023

Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation.
CoRR, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.
CoRR, 2023

Annotation-free Audio-Visual Segmentation.
CoRR, 2023

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery.
CoRR, 2023

Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition.
CoRR, 2023

Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

2022
Unsupervised Ensemble Distillation for Multi-Organ Segmentation.
Proceedings of the 19th IEEE International Symposium on Biomedical Imaging, 2022

Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition.
Proceedings of the Interspeech 2022, 2022

LAR-SR: A Local Autoregressive Model for Image Super-Resolution.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Efficient Use of End-to-End Data in Spoken Language Processing.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Spoken Language 'Grammatical Error Correction'.
Proceedings of the Interspeech 2020, 2020

Non-Native Children's Automatic Speech Recognition: The INTERSPEECH 2020 Shared Task ALTA Systems.
Proceedings of the Interspeech 2020, 2020

2019
General Sequence Teacher-Student Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Non-native Speaker Verification for Spoken Language Assessment.
CoRR, 2019

Disfluency Detection for Spoken Learner English.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Impact of ASR Performance on Spoken Grammatical Error Detection.
Proceedings of the Interspeech 2019, 2019

Non-Intrusive POLQA Estimation of Speech Quality using Recurrent Neural Networks.
Proceedings of the 27th European Signal Processing Conference, 2019

Learning Between Different Teacher and Student Models in ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Model-Based Speech Enhancement in the Modulation Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Towards automatic assessment of spontaneous spoken English.
Speech Commun., 2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks.
CoRR, 2018

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems.
Proceedings of the Interspeech 2018, 2018

Impact of ASR Performance on Free Speaking Language Assessment.
Proceedings of the Interspeech 2018, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Future Word Contexts in Neural Network Language Models.
CoRR, 2017

An attention based model for off-topic spontaneous spoken response detection: An Initial Study.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Use of Graphemic Lexicons for Spoken Language Assessment.
Proceedings of the Interspeech 2017, 2017

2016
A data-driven non-intrusive measure of speech quality and intelligibility.
Speech Commun., 2016

Speech enhancement using an MMSE spectral amplitude estimator based on a modulation domain Kalman filter with a Gamma prior.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Off-topic Response Detection for Spontaneous Spoken English Assessment.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2014
Speech enhancement using a modulation domain Kalman filter post-processor with a Gaussian Mixture noise model.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Speech enhancement using a robust Kalman filter post-processor in the modulation domain.
Proceedings of the IEEE International Conference on Acoustics, 2013

A subspace method for speech enhancement in the modulation domain.
Proceedings of the 21st European Signal Processing Conference, 2013

