Adam Polyak

CoRR, 2023

Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation.

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Text-To-4D Dynamic Scene Generation.

Proceedings of the International Conference on Machine Learning, 2023

Make-A-Video: Text-to-Video Generation without Text-Video Data.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

kNN-Diffusion: Image Generation via Large-Scale Retrieval.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

AudioGen: Textually Guided Audio Generation.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Audio Language Modeling using Perceptually-Guided Discrete Representations.

CoRR, 2022

Multilingual Text-To-Speech Training Using Cross Language Voice Conversion And Self-Supervised Learning Of Speech Representations.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Textless Speech Emotion Conversion using Discrete & Decomposed Representations.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors.

Proceedings of Computer Vision - ECCV 2022, 2022

Direct Speech-to-Speech Translation With Discrete Units.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Text-Free Prosody-Aware Generative Spoken Language Modeling.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

Locally Shifted Attention With Early Global Integration.

CoRR, 2021

Textless Speech Emotion Conversion using Decomposed and Discrete Representations.

CoRR, 2021

Direct speech-to-speech translation with discrete units.

CoRR, 2021

Generative Spoken Language Modeling from Raw Audio.

CoRR, 2021

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

High Fidelity Speech Regeneration with Application to Speech Enhancement.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2021

2020

Unsupervised Generation of Free-Form and Parameterized Avatars.

IEEE Trans. Pattern Anal. Mach. Intell., 2020

TTS Skins: Speaker Conversion via ASR.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Cross-Domain Singing Voice Conversion.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

A Universal Music Translation Network.

Proceedings of the 7th International Conference on Learning Representations, 2019

Attention-based Wavenet Autoencoder for Universal Voice Conversion.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Fitting New Speakers Based on a Short Untranscribed Sample.

Proceedings of the 35th International Conference on Machine Learning, 2018

VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Voice Synthesis for in-the-Wild Speakers via a Phonological Loop.

CoRR, 2017

Unsupervised Cross-Domain Image Generation.

Proceedings of the 5th International Conference on Learning Representations, 2017

Unsupervised Creation of Parameterized Avatars.

Proceedings of the IEEE International Conference on Computer Vision, 2017

2015

Channel-Level Acceleration of Deep Face Representations.
