Ron J. Weiss

Orcid: 0000-0003-2010-4053

Affiliations:

Google

According to our database, Ron J. Weiss authored at least 65 papers between 2006 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Recomposer: Event-roll-guided generative audio editing.

CoRR, September, 2025

SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy.

CoRR, July, 2025

2022

G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

2021

Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Multitask Training with Text Data for End-to-End Speech Recognition.

Peidong Wang

Tara N. Sainath

Ron J. Weiss

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

WaveGrad: Estimating Gradients for Waveform Generation.

Proceedings of the 9th International Conference on Learning Representations, 2021

Wave-Tacotron: Spectrogram-Free End-to-End Text-to-Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2021

Parallel Tacotron: Non-Autoregressive and Controllable TTS.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Unsupervised Sound Separation Using Mixtures of Mixtures.

CoRR, 2020

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.

CoRR, 2020

Unsupervised Sound Separation Using Mixture Invariant Training.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

An Attention-Based Joint Acoustic and Text on-Device End-To-End Model.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Unsupervised Speech Representation Learning Using WaveNet Autoencoders.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.

CoRR, 2019

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Hierarchical Generative Modeling for Controllable Speech Synthesis.

Proceedings of the 7th International Conference on Learning Representations, 2019

Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation.

Proceedings of the IEEE International Conference on Acoustics, 2019

Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization.

Proceedings of the IEEE International Conference on Acoustics, 2019

A Spelling Correction Model for End-to-end Speech Recognition.

Jinxi Guo

Tara N. Sainath

Ron J. Weiss

Proceedings of the IEEE International Conference on Acoustics, 2019

Audio Texture Synthesis with Random Neural Networks: Improving Diversity and Quality.

Joseph M. Antognini

Matt Hoffman

Ron J. Weiss

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Synthesizing Diverse, High-Quality Audio Textures.

Joseph M. Antognini

Matt Hoffman

Ron J. Weiss

CoRR, 2018

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron.

Proceedings of the 35th International Conference on Machine Learning, 2018

Multilingual Speech Recognition with a Single End-to-End Model.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions.

Yannis Agiomyrgiannakis

Yonghui Wu

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

On Using Backpropagation for Speech Texture Generation and Voice Conversion.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech.

CoRR, 2017

Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.

Yannis Agiomyrgiannakis

Rob Clark

Rif A. Saurous

CoRR, 2017

Sequence-to-Sequence Models Can Directly Translate Foreign Speech.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Tacotron: Towards End-to-End Speech Synthesis.

Yannis Agiomyrgiannakis

Rob Clark

Rif A. Saurous

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Acoustic Modeling for Google Home.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Online and Linear-Time Attention by Enforcing Monotonic Alignments.

Proceedings of the 34th International Conference on Machine Learning, 2017

CNN architectures for large-scale audio classification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Raw Multichannel Processing Using Deep Neural Networks.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning., 2017

2016

Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Factored spatial and spectral multichannel raw waveform CLDNNs.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Learning the speech front-end with raw waveform CLDNNs.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Speech acoustic modeling from raw multichannel waveforms.

Yedid Hoshen

Ron J. Weiss

Kevin W. Wilson

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Affinity Weighted Embedding.

Jason Weston

Ron J. Weiss

Hector Yee

Proceedings of the 31st International Conference on Machine Learning, 2014

2013

Learning to rank recommendations with the k-order statistic loss.

Jason Weston

Hector Yee

Ron J. Weiss

Proceedings of the Seventh ACM Conference on Recommender Systems, 2013

Nonlinear latent factorization by embedding multiple user interests.

Jason Weston

Ron J. Weiss

Hector Yee

Proceedings of the Seventh ACM Conference on Recommender Systems, 2013

2012

Latent Collaborative Retrieval.

Proceedings of the 29th International Conference on Machine Learning, 2012

2011

Combining localization cues and source model constraints for binaural source separation.

Ron J. Weiss

Michael I. Mandel

Daniel P. W. Ellis

Speech Commun., 2011

Unsupervised Discovery of Temporal Structure in Music.

Ron J. Weiss

Juan Pablo Bello

IEEE J. Sel. Top. Signal Process., 2011

Evaluating music sequence models through missing data.

Thierry Bertin-Mahieux

Graham Grindlay

Ron J. Weiss

Daniel P. W. Ellis

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Model-Based Expectation-Maximization Source Separation and Localization.

Michael I. Mandel

Ron J. Weiss

Daniel P. W. Ellis

IEEE Trans. Speech Audio Process., 2010

Speech separation using speaker-adapted eigenvoice speech models.

Ron J. Weiss

Daniel P. W. Ellis

Comput. Speech Lang., 2010

Identifying Repeated Patterns in Music Using Sparse Convolutive Non-negative Matrix Factorization.

Ron J. Weiss

Juan Pablo Bello

Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Clustering Beat-Chroma Patterns in a Large Music Database.

Thierry Bertin-Mahieux

Ron J. Weiss

Daniel P. W. Ellis

Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

2009

A variational EM algorithm for learning eigenvoice parameters in mixed signals.

Ron J. Weiss

Daniel P. W. Ellis

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Source separation based on binaural cues and source model constraints.

Ron J. Weiss

Michael I. Mandel

Daniel P. W. Ellis

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

DySANA: dynamic speech and noise adaptation for voice activity detection.

Ron J. Weiss

Trausti T. Kristjansson

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2006

Estimating single-channel source separation masks: relevance vector machine classifiers vs. pitch-based masking.

Ron J. Weiss

Daniel P. W. Ellis

Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation.

Daniel P. W. Ellis

Ron J. Weiss

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
