Zhikang Niu

ORCID: 0009-0002-2709-9381

According to our database, Zhikang Niu authored at least 19 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence.

CoRR, October, 2025

SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization.

CoRR, October, 2025

DiSTAR: Diffusion over a Scalable Token Autoregressive Representation for Speech Generation.

CoRR, October, 2025

UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models.

CoRR, October, 2025

Cross-Lingual F5-TTS: Towards Language-Agnostic Voice Cloning and Speech Synthesis.

CoRR, September, 2025

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows.

CoRR, August, 2025

Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment.

CoRR, May, 2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.

CoRR, May, 2025

EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting.

CoRR, April, 2025

URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models.

CoRR, February, 2025

Deep Learning-Based Real-Time Precise Pose Estimation Using Differential Magnetic Signals in the Dual-Robot Processing System.

IEEE Trans. Instrum. Meas., 2025

Accelerating Flow-Matching-Based Text-to-Speech via Empirically Pruned Step Sampling.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

A Progressive Generation Framework with Speech Pre-trained Model for Expressive Voice Conversion.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

NDVQ: Robust Neural Audio Codec With Normal Distribution-Based Vector Quantization.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

2023

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
