Jialong Zuo
ORCID: 0009-0002-6876-9943
According to our database, Jialong Zuo authored at least 44 papers between 2023 and 2026.
Bibliography
2026
Learning to Tell Apart: Weakly Supervised Video Anomaly Detection via Disentangled Semantic Alignment.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
Is Nano Banana Pro a Low-Level Vision All-Rounder? A Comprehensive Evaluation on 14 Tasks and 40 Datasets.
CoRR, December, 2025
CoRR, December, 2025
CoRR, October, 2025
CoRR, September, 2025
CoRR, June, 2025
Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis.
CoRR, February, 2025
Spatial cascaded clustering and weighted memory for unsupervised person re-identification.
Image Vis. Comput., 2025
Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Partial Forward Blocking: A Novel Data Pruning Paradigm for Lossless Training Acceleration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
CART: A Generative Cross-Modal Retrieval Framework With Coarse-To-Fine Semantic Modeling.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling.
CoRR, 2024
ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling.
CoRR, 2024
Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM.
CoRR, 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec.
CoRR, 2024
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models.
CoRR, 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech.
CoRR, 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
SyncTalklip: Highly Synchronized Lip-Readable Speaker Generation with Multi-Task Learning.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
TextrolSpeech: A Text Style Control Speech Corpus with Codec Language Text-to-Speech Models.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models.
CoRR, 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023