Yoshiki Masuyama

Natsuki Ueno

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Domain Adaptation for Multi-Channel Acoustic Scene Classification to Different Array Positions.

Takao Kawamura

Proceedings of the 33rd European Signal Processing Conference, 2025

Robot Confirmation Generation and Action Planning Using Long-context Q-Former Integrated with Multimodal LLM.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Efficient Joint Optimization of Sampling Rate Offsets Using Entire Multichannel Signal.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Causal and Relaxed-Distortionless Response Beamforming for Online Target Source Extraction.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs For Audio, Music, and Speech.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Mamba-Based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition.

Koichi Miyazaki

Masato Murata

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Exploring the Capability of Mamba in Speech Applications.

Koichi Miyazaki

Masato Murata

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing.

J. Open Source Softw., November, 2023

Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310).

Dataset, October, 2023

Online Phase Reconstruction via DNN-Based Phase Differences Estimation.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios.

CoRR, 2023

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge.

CoRR, 2023

Signal Reconstruction from Mel-Spectrogram Based on Bi-Level Consistency of Full-Band Magnitude and Phase.

Natsuki Ueno

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Multi-Channel Speaker Extraction with Adversarial Training: The Wavlab Submission to The Clarity ICASSP 2023 Grand Challenge.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation.

Proceedings of the 31st European Signal Processing Conference, 2023

Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Fundamental Frequency Estimation Based on Finite-Order Harmonic Constraint Differential Equation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Joint Optimization of Sampling Rate Offsets Based on Entire Signal Relationship Among Distributed Microphones.

Kouei Yamaoka

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Neural Full-Rank Spatial Covariance Analysis for Blind Source Separation.

IEEE Signal Process. Lett., 2021

Deep Griffin-Lim Iteration: Trainable Iterative Phase Reconstruction Using Neural Network.

IEEE J. Sel. Top. Signal Process., 2021

Robust Auditory Functions Based on Probabilistic Integration of MUSIC and CGMM.

IEEE Access, 2021

Simultaneous Declipping and Beamforming via Alternating Direction Method of Multipliers.

Proceedings of the 29th European Signal Processing Conference, 2021

Causal Distortionless Response Beamforming by Alternating Direction Method of Multipliers.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Joint Amplitude and Phase Refinement for Monaural Source Separation.

IEEE Signal Process. Lett., 2020

Self-supervised Neural Audio-Visual Sound Source Localization via Probabilistic Spatial Modeling.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Unsupervised Training for Deep Speech Source Separation with Kullback-Leibler Divergence Based Probabilistic Loss Function.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Phase Reconstruction Based On Recurrent Phase Unwrapping With Deep Neural Networks.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Consistency-Aware Multi-Channel Speech Enhancement Using Deep Neural Networks.

Masahito Togami

Tatsuya Komatsu

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Speech Enhancement Using Self-Adaptation and Multi-Head Self-Attention.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Computer-Resource-Aware Deep Speech Separation with a Run-Time-Specified Number of BLSTM Layers.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Griffin-Lim Like Phase Recovery via Alternating Direction Method of Multipliers.

IEEE Signal Process. Lett., 2019

Multichannel Loss Function for Supervised Speech Source Separation by Mask-Based Beamforming.

Masahito Togami

Tatsuya Komatsu

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Phase-aware Harmonic/percussive Source Separation via Convex Optimization.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Low-rankness of Complex-valued Spectrogram and Its Application to Phase-aware Audio Processing.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Deep Griffin-Lim Iteration.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Designing nearly tight window for improving time-frequency masking.

CoRR, 2018

Rectified Linear Unit Can Assist Griffin-Lim Phase Recovery.

Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Model-Based Phase Recovery of Spectrograms via Optimization on Riemannian Manifolds.
