Ron J. Weiss

ORCID: 0000-0003-2010-4053

Affiliations:
  • Google


According to our database, Ron J. Weiss authored at least 64 papers between 2006 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

2021
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Multitask Training with Text Data for End-to-End Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

WaveGrad: Estimating Gradients for Waveform Generation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Wave-Tacotron: Spectrogram-Free End-to-End Text-to-Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Parallel Tacotron: Non-Autoregressive and Controllable TTS.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020
Unsupervised Sound Separation Using Mixtures of Mixtures.
CoRR, 2020

Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.
CoRR, 2020

Unsupervised Sound Separation Using Mixture Invariant Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

An Attention-Based Joint Acoustic and Text on-Device End-To-End Model.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019
Unsupervised Speech Representation Learning Using WaveNet Autoencoders.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.
CoRR, 2019

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.
Proceedings of the Interspeech 2019, 2019

LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech.
Proceedings of the Interspeech 2019, 2019

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking.
Proceedings of the Interspeech 2019, 2019

Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model.
Proceedings of the Interspeech 2019, 2019

Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation.
Proceedings of the Interspeech 2019, 2019

Hierarchical Generative Modeling for Controllable Speech Synthesis.
Proceedings of the 7th International Conference on Learning Representations, 2019

Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

A Spelling Correction Model for End-to-end Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Audio Texture Synthesis with Random Neural Networks: Improving Diversity and Quality.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018
Synthesizing Diverse, High-Quality Audio Textures.
CoRR, 2018

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron.
Proceedings of the 35th International Conference on Machine Learning, 2018

Multilingual Speech Recognition with a Single End-to-End Model.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

On Using Backpropagation for Speech Texture Generation and Voice Conversion.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

State-of-the-Art Speech Recognition with Sequence-to-Sequence Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017
Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions.
CoRR, 2017

Sequence-to-Sequence Models Can Directly Transcribe Foreign Speech.
CoRR, 2017

Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.
CoRR, 2017

Sequence-to-Sequence Models Can Directly Translate Foreign Speech.
Proceedings of the Interspeech 2017, 2017

Online and Linear-Time Attention by Enforcing Monotonic Alignments.
Proceedings of the 34th International Conference on Machine Learning, 2017

CNN architectures for large-scale audio classification.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Raw Multichannel Processing Using Deep Neural Networks.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016
Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction.
Proceedings of the Interspeech 2016, 2016

Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition.
Proceedings of the Interspeech 2016, 2016

Factored spatial and spectral multichannel raw waveform CLDNNs.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

2015
Learning the speech front-end with raw waveform CLDNNs.
Proceedings of the INTERSPEECH 2015, 2015

Speech acoustic modeling from raw multichannel waveforms.
Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2013
Affinity Weighted Embedding.
Proceedings of the 1st International Conference on Learning Representations, 2013

Learning to rank recommendations with the k-order statistic loss.
Proceedings of the Seventh ACM Conference on Recommender Systems, 2013

Nonlinear latent factorization by embedding multiple user interests.
Proceedings of the Seventh ACM Conference on Recommender Systems, 2013

2012
Latent Collaborative Retrieval.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Combining localization cues and source model constraints for binaural source separation.
Speech Commun., 2011

Unsupervised Discovery of Temporal Structure in Music.
IEEE J. Sel. Top. Signal Process., 2011

Evaluating music sequence models through missing data.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

2010
Model-Based Expectation-Maximization Source Separation and Localization.
IEEE Trans. Speech Audio Process., 2010

Speech separation using speaker-adapted eigenvoice speech models.
Comput. Speech Lang., 2010

Identifying Repeated Patterns in Music Using Sparse Convolutive Non-negative Matrix Factorization.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Clustering Beat-Chroma Patterns in a Large Music Database.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

2009
A variational EM algorithm for learning eigenvoice parameters in mixed signals.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

2008
Source separation based on binaural cues and source model constraints.
Proceedings of the INTERSPEECH 2008, 2008

DySANA: dynamic speech and noise adaptation for voice activity detection.
Proceedings of the INTERSPEECH 2008, 2008

2006
Estimating single-channel source separation masks: relevance vector machine classifiers vs. pitch-based masking.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

Model-Based Monaural Source Separation Using a Vector-Quantized Phase-Vocoder Representation.
Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006

