Zhaoheng Ni

According to our database, Zhaoheng Ni authored at least 30 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement.
CoRR, 2024

2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing.
J. Open Source Softw., November, 2023

Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310).
Dataset, October, 2023

A Time-Frequency Attention Module for Neural Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

On The Open Prompt Challenge In Conditional Audio Generation.
CoRR, 2023

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023

FoleyGen: Visually-Guided Audio Generation.
CoRR, 2023

Stack-and-Delay: a new codebook pattern for music generation.
CoRR, 2023

Enhance audio generation controllability through representation similarity regularization.
CoRR, 2023

Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute.
CoRR, 2023

Scaling Speech Technology to 1,000+ Languages.
CoRR, 2023

Ripple Sparse Self-Attention for Monaural Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

Torchaudio-Squim: Reference-Less Speech Quality and Intelligibility Measures in Torchaudio.
Proceedings of the IEEE International Conference on Acoustics, 2023

TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for Pytorch.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding.
Proceedings of the Interspeech 2022, 2022

Time-Frequency Attention for Monaural Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPNET-Se Submission to the L3DAS22 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
TorchAudio: Building Blocks for Audio and Speech Processing.
CoRR, 2021

WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

2020
Combining Spatial Clustering with LSTM Speech Models for Multichannel Speech Enhancement.
CoRR, 2020

Improved MVDR Beamforming Using LSTM Speech Models to Clean Spatial Clustering Masks.
CoRR, 2020

Enhancement of Spatial Clustering-Based Time-Frequency Masks using LSTM Neural Networks.
CoRR, 2020

Mask-Dependent Phase Estimation for Monaural Speaker Separation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Onssen: an open-source speech separation and enhancement library.
CoRR, 2019

2018
Sound Signal Processing with Seq2Tree Network.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Unusable Spoken Response Detection with BLSTM Neural Networks.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

2017
A Seq2Tree Model for Recognizing Synthetic Bach Chorales.
Proceedings of the 2017 International Computer Music Conference, 2017

Confused or not Confused?: Disentangling Brain Activity from EEG Data Using Bidirectional LSTM Recurrent Neural Networks.
Proceedings of the 8th ACM International Conference on Bioinformatics, 2017

