Jie Zhang

ORCID: 0000-0003-1124-0854

Affiliations:
  • University of Science and Technology of China, NEL-SLIP, Hefei, China
  • Chinese Academy of Sciences, Institute of Acoustics, Beijing, China
  • Delft University of Technology, Faculty of Electrical Engineering, Mathematics, and Computer Science, The Netherlands (former)
  • Peking University, Shenzhen Graduate School, Key Laboratory of Machine Perception / Engineering Lab on Intelligent Perception for Internet of Things, Beijing, China (former)


According to our database, Jie Zhang authored at least 63 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Adversarial speech for voice privacy protection from Personalized Speech generation.
CoRR, 2024

Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Adaptive Video Streaming With Automatic Quality-of-Experience Optimization.
IEEE Trans. Mob. Comput., August, 2023

DUASVS: A Mobile Data Saving Strategy in Short-Form Video Streaming.
IEEE Trans. Serv. Comput., 2023

A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

SDW-SWF: Speech Distortion Weighted Single-Channel Wiener Filter for Noise Reduction.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Energy-Efficient Sparsity-Driven Speech Enhancement in Wireless Acoustic Sensor Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

A Dynamic Convolution Framework for Session-Independent Speaker Embedding Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Memory Storable Network Based Feature Aggregation for Speaker Representation Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

A Semi-Supervised Complementary Joint Training Approach for Low-Resource Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Rep2wav: Noise Robust text-to-speech Using self-supervised representations.
CoRR, 2023

CASA-ASR: Context-Aware Speaker-Attributed ASR.
CoRR, 2023

Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction.
CoRR, 2023

BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions.
CoRR, 2023

Speech Enhancement with Multi-granularity Vector Quantization.
CoRR, 2023

A Speech Distortion Weighted Single-Channel Wiener Filter Based STFT-Domain Noise Reduction.
Proceedings of the IEEE Statistical Signal Processing Workshop, 2023

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

The USTC's Dialect Speech Translation System for IWSLT 2023.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Robust Data2VEC: Noise-Robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

The NERCSLIP-USTC System for the L3DAS23 Challenge Task2: 3D Sound Event Localization and Detection (SELD).
Proceedings of the IEEE International Conference on Acoustics, 2023

A Multi-Scale Feature Aggregation Based Lightweight Network for Audio-Visual Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

Speech Enhancement with Multi-granularity Vector Quantization.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Learning Semantic Information from Machine Translation to Improve Speech-to-Text Translation.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Frequency-Invariant Sensor Selection for MVDR Beamforming in Wireless Acoustic Sensor Networks.
IEEE Trans. Wirel. Commun., 2022

A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning.
CoRR, 2022

Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization.
CoRR, 2022

Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR.
CoRR, 2022

A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition.
CoRR, 2022

External Text Based Data Augmentation for Low-Resource Speech Recognition in the Constrained Condition of OpenASR21 Challenge.
Proceedings of the Interspeech 2022, 2022

Differential Time-frequency Log-mel Spectrogram Features for Vision Transformer Based Infant Cry Recognition.
Proceedings of the Interspeech 2022, 2022

A Complementary Joint Training Approach Using Unpaired Speech and Text.
Proceedings of the Interspeech 2022, 2022

An Experimental Comparison between Low-Resource Semi-Supervised and High-Resource Supervised Automatic Speech Recognition Models.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Learning Contextually Fused Audio-Visual Representations For Audio-Visual Speech Recognition.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

A Noise-Robust Self-Supervised Pre-Training Model Based Speech Representation Learning for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Supervised and Self-Supervised Pretraining Based Covid-19 Detection Using Acoustic Breathing/Cough/Speech Signals.
Proceedings of the IEEE International Conference on Acoustics, 2022

Reference Microphone Selection and Low-Rank Approximation Based Multichannel Wiener Filter with Application to Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Power Optimized and Power Constrained Randomized Gossip Approaches for Wireless Sensor Networks.
IEEE Wirel. Commun. Lett., 2021

Quantization-Aware Binaural MWF Based Noise Reduction Incorporating External Wireless Devices.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Sensor Selection for Relative Acoustic Transfer Function Steered Linearly-Constrained Beamformers.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

A Study on Reference Microphone Selection for Multi-Microphone Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Multi-Granularity Sequence Alignment Mapping for Encoder-Decoder Based End-to-End ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

An Improved Wav2Vec 2.0 Pre-Training Approach Using Enhanced Local Dependency Modeling for Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

2020
Joint Sampling Synchronization and Source Localization for Wireless Acoustic Sensor Networks.
IEEE Commun. Lett., 2020

Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based Robust Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Relative Acoustic Transfer Function Estimation in Wireless Acoustic Sensor Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Distributed Rate-Constrained LCMV Beamforming.
IEEE Signal Process. Lett., 2019

Sensor Selection and Rate Distribution Based Beamforming in Wireless Acoustic Sensor Networks.
Proceedings of the 27th European Signal Processing Conference, 2019

2018
Rate-Distributed Spatial Filtering Based Noise Reduction in Wireless Acoustic Sensor Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Microphone Subset Selection for MVDR Beamformer Based Noise Reduction.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Rate-Distributed Binaural LCMV Beamforming for Assistive Hearing in Wireless Acoustic Sensor Networks.
Proceedings of the 10th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2018

2017
Binaural Sound Localization Based on Reverberation Weighting and Generalized Parametric Mapping.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

2016
Bi-Direction Interaural Matching Filter and Decision Weighting Fusion for Sound Source Localization in Noisy Environments.
IEICE Trans. Inf. Syst., 2016

Structured total least squares based internal delay estimation for distributed microphone auto-localization.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Probabilistic binaural multiple sources localization based on time-delay compensation estimator and clustering analysis.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

2015
Robust Acoustic Localization Via Time-Delay Compensation and Interaural Matching Filter.
IEEE Trans. Signal Process., 2015

Binaural cues estimates based on Interaural Matching Filter for sound source localization.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Direction of arrival estimation based on reverberation weighting and noise error estimator.
Proceedings of the INTERSPEECH 2015, 2015

Binaural sound source localization based on generalized parametric model and two-layer matching strategy in complex environments.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015

2014
A new hierarchical binaural sound source localization method based on Interaural Matching Filter.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

A binaural sound source localization model based on time-delay compensation and interaural coherence.
Proceedings of the IEEE International Conference on Acoustics, 2014

Speaker age recognition based on isolated words by using SVM.
Proceedings of the IEEE 3rd International Conference on Cloud Computing and Intelligence Systems, 2014

