Yu Wu

ORCID: 0000-0002-5715-3011

Affiliations:
  • Microsoft Research Asia, Beijing, China
  • Beihang University, State Key Lab of Software Development Environment, Beijing, China


According to our database, Yu Wu authored at least 84 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Advanced Long-Content Speech Recognition With Factorized Neural Transducer.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

2023
WavMark: Watermarking for Audio Generation.
CoRR, 2023

On decoder-only architecture for speech-to-text and large language model integration.
CoRR, 2023

VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation.
CoRR, 2023

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling.
CoRR, 2023

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers.
CoRR, 2023

LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer.
Proceedings of the IEEE International Conference on Acoustics, 2023

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.
IEEE J. Sel. Top. Signal Process., 2022

BEATs: Audio Pre-Training with Acoustic Tokenizers.
CoRR, 2022

Speech separation with large-scale self-supervised learning.
CoRR, 2022

LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers.
CoRR, 2022

Exploring WavLM on Speech Enhancement.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Two-Stream Network for Sign Language Recognition and Translation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training.
Proceedings of the Interspeech 2022, 2022

Speech Pre-training with Acoustic Piece.
Proceedings of the Interspeech 2022, 2022

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Streaming Multi-Talker ASR with Token-Level Serialized Output Training.
Proceedings of the Interspeech 2022, 2022

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings.
Proceedings of the Interspeech 2022, 2022

Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Proceedings of the Interspeech 2022, 2022

Improving Self-Supervised Learning for Speech Recognition with Intermediate Layer Supervision.
Proceedings of the IEEE International Conference on Acoustics, 2022

Wav2vec-Switch: Contrastive Learning from Original-Noisy Speech Pairs for Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Unispeech-Sat: Universal Speech Representation Learning With Speaker Aware Pre-Training.
Proceedings of the IEEE International Conference on Acoustics, 2022

Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2022

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Self-Supervised Learning for speech recognition with Intermediate layer supervision.
CoRR, 2021

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing.
CoRR, 2021

SpeechT5: Unified-Modal Encoder-Decoder Pre-training for Spoken Language Processing.
CoRR, 2021

Investigation of Practical Aspects of Single Channel Speech Separation for ASR.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Improving Multilingual Transformer Transducer Models by Reducing Language Confusions.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Ultra Fast Speech Separation Model with Teacher Student Learning.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data.
Proceedings of the 38th International Conference on Machine Learning, 2021

Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020.
Proceedings of the IEEE International Conference on Acoustics, 2021

Continuous Speech Separation with Conformer.
Proceedings of the IEEE International Conference on Acoustics, 2021

Developing Real-Time Streaming Transformer Transducer for Speech Recognition on Large-Scale Dataset.
Proceedings of the IEEE International Conference on Acoustics, 2021

Don't Shoot Butterfly with Rifles: Multi-Channel Continuous Speech Separation with Early Exit Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2021

Knowledge Enhanced Fine-Tuning for Better Handling Unseen Entities in Dialogue Generation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Template-Based Named Entity Recognition Using BART.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

On Commonsense Cues in BERT for Solving Commonsense Tasks.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer.
CoRR, 2020

Continuous Speech Separation with Conformer.
CoRR, 2020

Does BERT Solve Commonsense Task via Commonsense Knowledge?
CoRR, 2020

On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Low Latency End-to-End Streaming Speech Recognition with a Scout Network.
Proceedings of the Interspeech 2020, 2020

Semantic Mask for Transformer Based End-to-End Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Formality Style Transfer with Shared Latent Space.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Curriculum Pre-training for End-to-End Speech Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

A Retrieve-and-Rewrite Initialization Method for Unsupervised Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

MuTual: A Dataset for Multi-Turn Dialogue Reasoning.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

A Dataset for Low-Resource Stylized Sequence-to-Sequence Generation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

RobuTrans: A Robust Transformer-Based Text-to-Speech Model.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
A Sequential Matching Framework for Multi-Turn Response Selection in Retrieval-Based Chatbots.
Comput. Linguistics, 2019

Neural Melody Composition from Lyrics.
Proceedings of the Natural Language Processing and Chinese Computing, 2019

Unsupervised Context Rewriting for Open Domain Conversation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Explicit Cross-lingual Pre-training for Unsupervised Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Dictionary-Guided Editing Networks for Paraphrase Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Response Generation by Context-Aware Prototype Editing.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Response selection with topic clues for retrieval-based chatbots.
Neurocomputing, 2018

Text Morphing.
CoRR, 2018

Towards Explainable and Controllable Open Domain Dialogue Generation with Dialogue Acts.
CoRR, 2018

Dictionary-Guided Editing Networks for Paraphrase Generation.
CoRR, 2018

Response Generation by Context-aware Prototype Editing.
CoRR, 2018

Keyphrase Generation with Correlation Constraints.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Hierarchical Recurrent Attention Network for Response Generation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Neural Response Generation With Dynamic Vocabularies.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Knowledge Enhanced Hybrid Neural Network for Text Matching.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Neural Response Generation with Dynamic Vocabularies.
CoRR, 2017

Hierarchical Recurrent Attention Network for Response Generation.
CoRR, 2017

Beihang-MSRA at SemEval-2017 Task 3: A Ranking System with Neural Matching Features for Community Question Answering.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Beihang at the NTCIR-13 STC-2 Task.
Proceedings of the 13th NTCIR Conference, 2017

Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Topic Aware Neural Response Generation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Topic Augmented Neural Response Generation with a Joint Attention Mechanism.
CoRR, 2016

Sequential Match Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots.
CoRR, 2016

Knowledge Enhanced Hybrid Neural Network for Text Matching.
CoRR, 2016

Topic Augmented Neural Network for Short Text Conversation.
CoRR, 2016

Detecting Context Dependent Messages in a Conversational Environment.
Proceedings of the COLING 2016, 2016

Improving Recommendation of Tail Tags for Questions in Community Question Answering.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Mining Query Subtopics from Questions in Community Question Answering.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

