Zhifang Guo

Orcid: 0009-0009-6037-2299

According to our database¹, Zhifang Guo authored at least 15 papers between 2022 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Qwen3-Omni Technical Report.

CoRR, September, 2025

Qwen2.5-Omni Technical Report.

CoRR, March, 2025

InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Analyzing and Mitigating Inconsistency in Discrete Speech Tokens for Neural Codec Language Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models.

CoRR, 2024

Qwen2-Audio Technical Report.

CoRR, 2024

Qwen2 Technical Report.

CoRR, 2024

Advancing Multi-grained Alignment for Contrastive Language-Audio Pre-training.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Leveraging Language Model Capabilities for Sound Event Detection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

PromptTTS 2: Describing and Generating Voices with Text Prompt.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Audio Generation with Multiple Conditional Diffusion Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

PromptTTS 2: Describing and Generating Voices with Text Prompt.

CoRR, 2023

Furnishing Sound Event Detection with Language Model Abilities.

CoRR, 2023

PromptTTS: Controllable Text-to-Speech with Text Descriptions.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

A Hybrid System of Sound Event Detection Transformer and Frame-Wise Model for DCASE 2022 Task 4.

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022
