Puyuan Peng

According to our database, Puyuan Peng authored at least 15 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild.
CoRR, 2024

SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data.
CoRR, 2024

Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model.
CoRR, 2024

BAT: Learning to Reason about Spatial Sounds with Large Language Models.
CoRR, 2024

2023
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models.
CoRR, 2023

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos.
CoRR, 2023

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model.
CoRR, 2023

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization.
CoRR, 2023

Audio-Visual Neural Syntax Acquisition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling.
CoRR, 2022

Zero-shot Video Moment Retrieval With Off-the-Shelf Models.
Proceedings of the Transfer Learning for Natural Language Processing Workshop, 2022

Word Discovery in Visually Grounded, Self-Supervised Speech Models.
Proceedings of Interspeech 2022, 2022

MAE-AST: Masked Autoencoding Audio Spectrogram Transformer.
Proceedings of Interspeech 2022, 2022

Fast-Slow Transformer for Visually Grounding Speech.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2020
A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings.
CoRR, 2020
