Bowen Shi
Affiliations:
- Meta, Meta AI, Fundamental AI Research (FAIR), Audiobox Team, USA
- Toyota Technological Institute at Chicago, IL, USA
According to our database, Bowen Shi authored at least 39 papers between 2017 and 2025.
Bibliography
2025
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound.
CoRR, February, 2025
Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
2024
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching.
CoRR, 2024
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
M2BART: Multilingual and Multimodal Encoder-Decoder Pre-Training for Any-to-Any Machine Translation.
Proceedings of the IEEE International Conference on Acoustics, 2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Proceedings of the Eighth Conference on Machine Translation, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement.
CoRR, 2022
A Single Self-Supervised Model for Many Speech Modalities Enables Zero-Shot Modality Transfer.
CoRR, 2022
Proceedings of the Seventh Conference on Machine Translation, 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Proceedings of the 5th Workshop on Representation Learning for NLP, 2020
A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Compression of Acoustic Event Detection Models with Low-rank Matrix Factorization and Quantization Training.
CoRR, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
2017
Multitask training with unlabeled data for end-to-end sign language fingerspelling recognition.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017