R. J. Skerry-Ryan

According to our database¹, R. J. Skerry-Ryan authored at least 25 papers between 2017 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy.

R. J. Skerry-Ryan, […], Soroosh Mariooryad, […], […], Eric Battenberg, […], […], Robin Scheibler, […], […]

CoRR, July, 2025

Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech.

Eric Battenberg, R. J. Skerry-Ryan, […], Soroosh Mariooryad, […], […], […]

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Zero-Shot Mono-to-Binaural Speech Synthesis.

Alon Levkovitch, […], Soroosh Mariooryad, R. J. Skerry-Ryan, […], W. Bastiaan Kleijn, […]

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Long-Form Speech Generation with Spoken Language Models.

[…], […], […], Keisuke Kinoshita, […], R. J. Skerry-Ryan

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

Very Attentive Tacotron: Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech.

Eric Battenberg, R. J. Skerry-Ryan, […], Soroosh Mariooryad, […], […], […]

CoRR, 2024

Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM.

[…], Alon Levkovitch, […], […], Chulayuth Asawaroengchai, Soroosh Mariooryad, […], R. J. Skerry-Ryan, Michelle Tadmor Ramanovich

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

LMs with a Voice: Spoken Language Modeling beyond Speech Tokens.

[…], Alon Levkovitch, […], Chulayuth Asawaroengchai, Soroosh Mariooryad, R. J. Skerry-Ryan, Michelle Tadmor Ramanovich

CoRR, 2023

2022

Learning the joint distribution of two sequences using little or no paired data.

Soroosh Mariooryad, […], […], […], […], […], Eric Battenberg, R. J. Skerry-Ryan

CoRR, 2022

Speaker Generation.

[…], […], Soroosh Mariooryad, R. J. Skerry-Ryan, Eric Battenberg, […], […]

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling.

[…], […], […], […], […], R. J. Skerry-Ryan, […]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Wave-Tacotron: Spectrogram-Free End-to-End Text-to-Speech Synthesis.

[…], R. J. Skerry-Ryan, Eric Battenberg, Soroosh Mariooryad, Diederik P. Kingma

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Non-saturating GAN training as divergence minimization.

[…], […], Soroosh Mariooryad, […], Eric Battenberg, […], […], R. J. Skerry-Ryan

CoRR, 2020

Semi-Supervised Generative Modeling for Controllable Speech Synthesis.

[…], Soroosh Mariooryad, […], Eric Battenberg, R. J. Skerry-Ryan, […], […], […]

Proceedings of the 8th International Conference on Learning Representations, 2020

Location-Relative Attention Mechanisms for Robust Long-Form Speech Synthesis.

Eric Battenberg, R. J. Skerry-Ryan, Soroosh Mariooryad, […], […], […], […]

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis.

Eric Battenberg, Soroosh Mariooryad, […], R. J. Skerry-Ryan, […], […], […]

CoRR, 2019

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.

[…], […], […], […], […], R. J. Skerry-Ryan, […], Andrew Rosenberg, Bhuvana Ramabhadran

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis.

[…], […], […], […], R. J. Skerry-Ryan

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Predicting Expressive Speaking Style from Text in End-To-End Speech Synthesis.

[…], […], R. J. Skerry-Ryan

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis.

[…], […], […], R. J. Skerry-Ryan, Eric Battenberg, […], […], […], […], […]

Proceedings of the 35th International Conference on Machine Learning, 2018

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron.

R. J. Skerry-Ryan, Eric Battenberg, […], […], […], […], […], […], […]

Proceedings of the 35th International Conference on Machine Learning, 2018

Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions.

[…], […], […], […], […], […], […], […], […], R. J. Skerry-Ryan, […], Yannis Agiomyrgiannakis, […]

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Complex Evolution Recurrent Neural Networks (ceRNNs).

[…], […], R. J. Skerry-Ryan

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Uncovering Latent Style Factors for Expressive Speech Synthesis.

[…], R. J. Skerry-Ryan, […], […], […], Eric Battenberg, […], […]

CoRR, 2017

Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.

[…], R. J. Skerry-Ryan, […], […], […], […], […], […], […], […], […], Yannis Agiomyrgiannakis, […], […]

CoRR, 2017

Tacotron: Towards End-to-End Speech Synthesis.

[…], R. J. Skerry-Ryan, […], […], […], […], […], […], […], […], […], Yannis Agiomyrgiannakis, […], […]

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
