Yi-Jen Shih

Orcid: 0000-0003-3481-3117

According to our database, Yi-Jen Shih authored at least 14 papers between 2022 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Can Speech LLMs Think while Listening?

CoRR, October, 2025

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.

Fabian Alejandro Ritter Gutierrez

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Unifying Model and Layer Fusion for Speech Foundation Models.

Yi-Jen Shih

David Harwath

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.

Fabian Ritter Gutierrez

CoRR, 2024

Measuring Sound Symbolism In Audio-Visual Models.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Self-Supervised Speech Models For Word-Level Stuttered Speech Detection.

Yi-Jen Shih

Zoi Gkalitsiou

Alexandros G. Dimakis

David Harwath

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Interface Design for Self-Supervised Speech Models.

Yi-Jen Shih

David Harwath

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

SpeechCLIP+: Self-Supervised Multi-Task Representation Learning for Speech Via Clip and Speech-Image Data.

Proceedings of the IEEE International Conference on Acoustics, 2024

AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

Integrating Self-Supervised Speech Model with Pseudo Word-Level Targets from Visually-Grounded Speech Model.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Theme Transformer: Symbolic Music Generation With Theme-Conditioned Transformer.

IEEE Trans. Multim., 2023

AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models.

CoRR, 2023

M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022
