Mao-shen Jia

Xinfeng Zhang

Circuits Syst. Signal Process., November, 2025

DARAS: Dynamic Audio-Room Acoustic Synthesis for Blind Room Impulse Response Estimation.

Chunxi Wang

Wenyu Jin

CoRR, July, 2025

TFF-Codec: A High Fidelity End-to-End Neural Audio Codec.

Circuits Syst. Signal Process., June, 2025

Cross-corpus speech emotion recognition using semi-supervised domain adaptation network.

Speech Commun., 2025

Hybrid dual-path network: Singing voice separation in the waveform domain by combining Conformer and Transformer architectures.

Speech Commun., 2025

Enhanced Prediction of Intracranial Aneurysm Rupture Risk via Multimodal Fusion.

IET Image Process., 2025

Direct-path Relative Harmonic Coefficients Detection for Multi-source Direction-of-Arrival Estimation in Reverberant Environments.

Liang Tao

Yonggang Hu

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Power Spectral Density Estimation for Acoustic Source Separation Using A Spherical Microphone Array.

Liang Tao

Yonggang Hu

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Speech Enhancement with Dual-path Multi-Channel Linear Prediction Filter and Multi-norm Beamforming.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Using Corrected ASR Projection to Improve AD Recognition Performance from Spontaneous Speech.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

SS-BRPE: Self-Supervised Blind Room Parameter Estimation Using Attention Mechanisms.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

Exploring the power of pure attention mechanisms in blind room parameter estimation.

EURASIP J. Audio Speech Music. Process., December, 2024

TIM-Net: A multi-label classification network for TCM tongue images fusing global-local features.

IET Image Process., May, 2024

Joint DOA Estimation and Dereverberation Based on Multi-Channel Linear Prediction Filtering and Azimuth Sparsity.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Harmonic-Aware Frequency and Time Attention for Automatic Piano Transcription.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

First-Order Relative Harmonic Coefficient-Based Time-Frequency Points Selection for Multi-Source DOA Estimation.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Three-Dimensional Room Transfer Function Parameterization Based on Multiple Concentric Planar Circular Arrays.

Lu Li

IEEE ACM Trans. Audio Speech Lang. Process., 2024

A distortionless convolution beamformer design method based on the weighted minimum mean square error for joint dereverberation and denoising.

Speech Commun., 2024

A semi-supervised segmentation network fusing pseudo-label with multi-level feature consistency correction for hard exudates.

IET Image Process., 2024

A Hybrid DFSMN and Mamba Architecture for Low Bitrate Neural Speech Coding.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Speech Emotion Recognition Based on Shallow Structure of Wav2vec 2.0 and Attention Mechanism.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

A Dual-path Conformer-Based Network for Neural Speech Coding.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Robust Coherent sources Localization Based on Hankel Matrix Reconstruction.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Spatial Acoustic Enhancement Using Unbiased Relative Harmonic Coefficients.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Attention Is All You Need For Blind Room Volume Estimation.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Deep encoder/decoder dual-path neural network for speech separation in noisy reverberation environments.

Chunxi Wang

Xinfeng Zhang

EURASIP J. Audio Speech Music. Process., December, 2023

Separation of Multiple Speech Sources in Reverberant Environments Based on Sparse Component Enhancement.

Circuits Syst. Signal Process., October, 2023

Multisource localization based on angle distribution of time-frequency points using an FOA microphone.

CAAI Trans. Intell. Technol., September, 2023

Diffuseness Estimation-Based SSTP Detection for Multiple Sound Source Localization in Reverberant Environments.

Circuits Syst. Signal Process., August, 2023

Study of MVDR Beamforming with Spatially Distributed Source: Theoretical Analysis and Efficient Microphone Array Geometry Optimization Method.

Circuits Syst. Signal Process., August, 2023

Multiple-Speech-Source DOA Estimation Based on Single-Source Cluster Detection.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Multi-Source Localization Using Optimized Time-Frequency Representation and Sparsity Component Analysis.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Adaptive learning Unet-based adversarial network with CNN and transformer for segmentation of hard exudates in diabetes retinopathy.

IET Image Process., 2023

Sound Source Localization by Combining Phase Consistency and Angle Deviation.

Proceedings of the 9th International Conference on Computing and Artificial Intelligence, 2023

Single Source Zone Detection in the Spherical Harmonic Domain for Multisource Localization.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Cross-corpus speech emotion recognition using subspace learning and domain adaption.

EURASIP J. Audio Speech Music. Process., 2022

A 3D U-Net-Based Approach for Intracranial Aneurysm Detection.

Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022

A Symmetric Dual-Attention Generative Adversarial Network with Channel and Spatial Features Fusion.

Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022

DOA Estimation of Multiple Sources based on the Angle Distribution of Time-frequency Points in Single-source Zone.

Liang Tao

Lu Li

Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022

Speech Recognition Method based on CTC Multilayer Loss.

Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022

Speech Emotion Recognition by using Philips Fingerprint and Spectral Entropy.

Proceedings of the ICCAI '22: 8th International Conference on Computing and Artificial Intelligence, Tianjin, China, March 18, 2022

2021

Multi-Source DOA Estimation in Reverberant Environments by Jointing Detection and Modeling of Time-Frequency Points.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Multi-source localization by using offset residual weight.

Shang Gao

EURASIP J. Audio Speech Music. Process., 2021

Person Re-identification Based on Hash.

Proceedings of the Intelligent Computing Theories and Application, 2021

A Hierarchical Retrieval Method Based on Hash Table for Audio Fingerprinting.

Tianhao Li

Xuan Cao

Proceedings of the Intelligent Computing Theories and Application, 2021

Multi-source Localization by Using the Correlation between Single-Source Components.

Proceedings of the ICCPR '21: 10th International Conference on Computing and Pattern Recognition, Shanghai, China, October 15, 2021

A multi-source localization method based on clustering and outlier removal.

Shang Gao

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Multiple Sound Source Separation by Jointing Single Source Zone Detection and Linearly Constrained Minimum Variance.

Proceedings of the ICCPR 2020: 9th International Conference on Computing and Pattern Recognition, Xiamen, China, October 30, 2020

Multiple Speech Source Separation by Using MVDR for B-Format Recordings.

Proceedings of the ICCPR 2020: 9th International Conference on Computing and Pattern Recognition, Xiamen, China, October 30, 2020

2019

Sound Field Reproduction in Reverberant Room Using the Alternating Direction Method of Multipliers Based Lasso and Regularized Least-Square.

Proceedings of the Intelligent Computing Theories and Application, 2019

Multiple Sound Sources Localization by using Statistical Source Component Equalization.

Proceedings of the ICCPR '19: 8th International Conference on Computing and Pattern Recognition, 2019

2018

Design of a Planar First-Order Loudspeaker Array for Global Active Noise Control.

Bing Bu

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Separation of multiple speech sources by recovering sparse and non-sparse components from B-format microphone recordings.

Speech Commun., 2018

Multiple Sound Sources Localization with Frame-by-Frame Component Removal of Statistically Dominant Source.

Sensors, 2018

Multiple Source Localization by Using Improved Single Source Bins Detection.

Jundai Sun

J. Inf. Hiding Multim. Signal Process., 2018

Multiple Speech Source Separation with Non-Sparse Components Recovery by Using Dual Similarity Determination.

IEICE Trans. Inf. Syst., 2018

Sound Field Reproduction via the Alternating Direction Method of Multipliers Based Lasso Plus Regularized Least-Square.

IEEE Access, 2018

Optical Character Detection and Recognition for Image-Based in Natural Scene.

Proceedings of the Intelligent Computing Methodologies - 14th International Conference, 2018

2017

Real-time multiple sound source localization and counting using a soundfield microphone.

Jundai Sun

J. Ambient Intell. Humaniz. Comput., 2017

Simulating the Three-Dimensional Room Transfer Function for a Rotatable Complex Source.

Bing Bu

IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2017

Multiple audio source separation by using intra-object-sparsity encoding framework.

Proceedings of the 2017 IEEE International Conference on Signal Processing, 2017

Multiple source localization by using energy weighted single source zone detection.

Jundai Sun

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

HMM-based cue parameters estimation for speech enhancement.

Feng Deng

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Measurement of the acoustic transfer function using compressed sensing techniques.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

Encoding Multiple Audio Objects Using Intra-Object Sparsity.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

3D multizone soundfield reproduction using spherical harmonic analysis.

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Conversion of multichannel sound signals based on spherical harmonics with L1-norm constraint.

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

3D multizone soundfield reproduction in the reverberant room using a spherical loudspeaker array.

Meng-fang Zha

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

An analysis-by-synthesis encoding approach for multiple audio objects.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

The design of Ambisonic reproduction system based on dynamic gain parameters.

Proceedings of the IEEE International Conference on Acoustics, 2014

The design of ambisonics decoders for irregular speaker array conforming to subjective perception.

Proceedings of the International Conference on Audio, 2014

Relative distance estimation in multi-channel spatial audio signal.

Proceedings of the International Conference on Audio, 2014

Multi-source sound field reproduction using cylindrical harmonic analysis.

Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Speech enhancement based on a few shapes of speech spectrum.

Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

The design of HOA irregular decoders based on the optimal symmetrical virtual microphone response.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Range extrapolation of Head-Related Transfer Function using improved Higher Order Ambisonics.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Speech enhancement based on a novel weighting spectral distortion measure.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

A novel speech enhancement method using power spectra smooth in Wiener filtering.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2011

A MDCT-based click noise reduction method for MPEG-4 AAC codec.

Proceedings of the 2011 International Conference on Wireless Communications & Signal Processing, 2011

A sinusoidal audio and speech analysis/synthesis model based on improved EMD by adding pure tone.

Xiao-ming Li