Yonghui Wu

This is a disambiguation page: it lists papers by multiple people with the same or a similar name.

Bibliography

2025
Seedream 4.0: Toward Next-generation Multimodal Image Generation.
CoRR, September, 2025

Agentic AutoSurvey: Let LLMs Survey LLMs.
CoRR, September, 2025

Balanced Actor Initialization: Stable RLHF Training of Distillation-Based Reasoning Models.
CoRR, September, 2025

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference.
CoRR, August, 2025

Reliable Indoor Localization in Multibuilding Environments: Leveraging Environment-Invariant and Position-Related Features.
IEEE Internet Things J., July, 2025

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving.
CoRR, July, 2025

Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice.
CoRR, July, 2025

Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters.
CoRR, July, 2025

Truncated Proximal Policy Optimization.
CoRR, June, 2025

Seed-Coder: Let the Code Model Curate Data for Itself.
CoRR, June, 2025

EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving.
CoRR, June, 2025

Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning.
CoRR, May, 2025

Model Merging in Pre-training of Large Language Models.
CoRR, May, 2025

Natural Language Generation in Healthcare: A Review of Methods and Applications.
CoRR, May, 2025

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks.
CoRR, April, 2025

A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization.
CoRR, April, 2025

DAPO: An Open-Source LLM Reinforcement Learning System at Scale.
CoRR, March, 2025

FSLNet: Filter sensitivity-based lightweight network for rice leaf disease recognition.
Comput. Electron. Agric., 2025

Optimization of motion strategy for a micro multi-functional chassis based on RBF neural network in intercropping mode.
Comput. Electron. Agric., 2025

2024
Automatic Summarization of Doctor-Patient Encounter Dialogues Using Large Language Model through Prompt Tuning.
CoRR, 2024

Llama-TCR: Generate De Novo TCR with Large Language Model.
Proceedings of the IEEE Conference on Artificial Intelligence, 2024

TTCR: Accurate TCR-Epitope Binding Affinity Prediction Using Transformers.
Proceedings of the IEEE Conference on Artificial Intelligence, 2024

Constructions of Teaching Materials, Curriculums, and the Teaching System Cross-Region for "Solving Problems by Programming".
Proceedings of the Computing and Combinatorics - 30th International Conference, 2024

2023
Combined scaling for zero-shot transfer learning.
Neurocomputing, October, 2023

SLM: Bridge the thin gap between speech and text foundation models.
CoRR, 2023

Efficient Adapters for Giant Speech Models.
CoRR, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages.
CoRR, 2023

AnyTOD: A Programmable Task-Oriented Dialog System.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SLM: Bridge the Thin Gap Between Speech and Text Foundation Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

A High Precision Capacitive Isolation Amplifier for Current Sensing Applications.
Proceedings of the 15th IEEE International Conference on ASIC, 2023

2022
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation.
Trans. Mach. Learn. Res., 2022

CoCa: Contrastive Captioners are Image-Text Foundation Models.
Trans. Mach. Learn. Res., 2022

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2022

Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners.
CoRR, 2022

N-Grammer: Augmenting Transformers with latent n-grams.
CoRR, 2022

Building Machine Translation Systems for the Next Thousand Languages.
CoRR, 2022

Description-Driven Task-Oriented Dialog Modeling.
CoRR, 2022

Confusing Traffic against Intra-domain Webpage Fingerprinting Attacks.
Proceedings of the IEEE International Conference on Trust, 2022

Show, Don't Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Pathways: Asynchronous Distributed Dataflow for ML.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Self-supervised learning with random-projection quantizer for speech recognition.
Proceedings of the International Conference on Machine Learning, 2022

Vector-quantized Image Modeling with Improved VQGAN.
Proceedings of the Tenth International Conference on Learning Representations, 2022

SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
GSPMD: General and Scalable Parallelization for ML Computation Graphs.
CoRR, 2021

Improving Longer-range Dialogue State Tracking.
CoRR, 2021

Distilling Interpretable Models into Human-Readable Code.
CoRR, 2021

Interpretable Ranking with Generalized Additive Models.
Proceedings of the WSDM '21, 2021

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling.
Proceedings of the 9th International Conference on Learning Representations, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.
Proceedings of the IEEE International Conference on Acoustics, 2021

Parallel Tacotron: Non-Autoregressive and Controllable TTS.
Proceedings of the IEEE International Conference on Acoustics, 2021

Effective Sequence-to-Sequence Dialogue State Tracking.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition.
CoRR, 2020

Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling.
CoRR, 2020

Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling.
CoRR, 2020

Interpretable Learning-to-Rank with Generalized Additive Models.
CoRR, 2020

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.
CoRR, 2020

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.
CoRR, 2020

Improved Noisy Student Training for Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Conformer: Convolution-augmented Transformer for Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A 26.5GHz Wideband Gilbert-Cell Mixer MMIC Based on InP DHBT Technology.
Proceedings of the 20th IEEE International Conference on Communication Technology, 2020

Improving Speech Recognition Using Consistent Predictions on Synthesized Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Specaugment on Large Scale Datasets.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Towards Fast and Accurate Streaming End-To-End ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
A Stereo-Vision System for Measuring the Ram Speed of Steam Hammers in an Environment with a Large Field of View and Strong Vibrations.
Sensors, 2019

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges.
CoRR, 2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Gmail Smart Compose: Real-Time Assisted Writing.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Two-Pass End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Hierarchical Generative Modeling for Controllable Speech Synthesis.
Proceedings of the 7th International Conference on Learning Representations, 2019

Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes.
Proceedings of the IEEE International Conference on Acoustics, 2019

Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speech Recognition with Augmented Synthesized Speech.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

A Comparison of End-to-End Models for Long-Form Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

A 20GS/s Track-and-Hold Amplifier based on InP DHBT Process.
Proceedings of the 13th IEEE International Conference on ASIC, 2019

2018
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation.
CoRR, 2018

A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Event-Triggered Consensus of General Linear Multi-agent System with Time Delay.
Proceedings of the Advances in Neural Networks - ISNN 2018, 2018

Compression of End-to-End Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speech Recognition for Medical Conversations.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Improving the Performance of Online Neural Transducer Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Minimum Word Error Rate Training for Attention-Based Sequence-to-Sequence Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

An Analysis of Incorporating an External Language Model into a Sequence-to-Sequence Model.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Training Deeper Neural Machine Translation Models with Transparent Attention.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation.
Trans. Assoc. Comput. Linguistics, 2017

Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model.
CoRR, 2017

Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech.
CoRR, 2017

Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.
CoRR, 2017

Sequence-to-Sequence Models Can Directly Translate Foreign Speech.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Tacotron: Towards End-to-End Speech Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation.
CoRR, 2016

Exploring the Limits of Language Modeling.
CoRR, 2016

Reward Augmented Maximum Likelihood for Neural Structured Prediction.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

2015
Scattering Mechanism Extraction by a Modified Cloude-Pottier Decomposition for Dual Polarization SAR.
Remote. Sens., 2015

2013
Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space.
PLoS Comput. Biol., 2013

2011
Accurate Construction of Consensus Genetic Maps via Integer Linear Programming.
IEEE ACM Trans. Comput. Biol. Bioinform., 2011

Barcoding-free BAC Pooling Enables Combinatorial Selective Sequencing of the Barley Gene Space.
CoRR, 2011

2010
Efficient Genome-Wide TagSNP Selection Across Populations via the Linkage Disequilibrium Criterion.
J. Comput. Biol., 2010

2008
Region-Based Classification of Polarimetric SAR Images Using Wishart MRF.
IEEE Geosci. Remote. Sens. Lett., 2008

Deconvoluting BAC-Gene Relationships Using a Physical Map.
J. Bioinform. Comput. Biol., 2008

A Linear-Time Algorithm for Predicting Functional Annotations from PPI Networks.
J. Bioinform. Comput. Biol., 2008

2007
Efficient and Accurate Construction of Genetic Linkage Maps from Noisy and Missing Genotyping Data.
Proceedings of the Algorithms in Bioinformatics, 7th International Workshop, 2007

Clock-frequency assignment for multiple clock domain systems-on-a-chip.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

Two-level microprocessor-accelerator partitioning.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

2006
Error-Resilient LZW Data Compression.
Proceedings of the 2006 Data Compression Conference (DCC 2006), 2006

2000
Implementation and Proof for Normalization Design of Object-Oriented Data Schemes.
Proceedings of the TOOLS Asia 2000: 36th International Conference on Technology of Object-Oriented Languages and Systems, Xi'an, China, 30 October, 2000

