Tsubasa Ochiai

Orcid: 0000-0002-2519-2032

According to our database, Tsubasa Ochiai authored at least 52 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Probing Self-supervised Learning Models with Target Speech Extraction.
CoRR, 2024

Target Speech Extraction with Pre-trained Self-supervised Learning Models.
CoRR, 2024

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers.
CoRR, 2024

2023
Neural Target Speech Extraction: An overview.
IEEE Signal Process. Mag., May, 2023

Mask-Based Neural Beamforming for Moving Speakers With Self-Attention-Based Tracking.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

SoundBeam: Target Sound Extraction Conditioned on Sound-Class Labels and Enrollment Clues for Increased Performance and Continuous Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.
CoRR, 2023

Streaming End-to-End Target-Speaker Automatic Speech Recognition and Activity Detection.
IEEE Access, 2023

2022
ConceptBeam: Concept Driven Target Speech Extraction.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Analysis of Impact of Emotions on Target Speech Extraction and Speech Separation.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.
Proceedings of the Interspeech 2022, 2022

Streaming Target-Speaker ASR with Neural Transducer.
Proceedings of the Interspeech 2022, 2022

Revisiting joint decoding based multi-talker speech recognition with DNN acoustic model.
Proceedings of the Interspeech 2022, 2022

How bad are artifacts?: Analyzing the impact of speech enhancement errors on ASR.
Proceedings of the Interspeech 2022, 2022

Listen only to me! How well can target speech extraction handle false alarms?
Proceedings of the Interspeech 2022, 2022

Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Multimodal Attention Fusion for Target Speaker Extraction.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

PILOT: Introducing Transformers for Probabilistic Sound Event Localization.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Few-Shot Learning of New Sound Classes for Target Sound Extraction.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend.
Proceedings of the IEEE International Conference on Acoustics, 2021

Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain.
Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Network-Based Virtual Microphone Estimator.
Proceedings of the IEEE International Conference on Acoustics, 2021

Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speaker Activity Driven Neural Speech Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2021

Convolutive Transfer Function Invariant SDR Training Criteria for Multi-Channel Reverberant Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation.
CoRR, 2020

Listen to What You Want: Neural Network-Based Universal Sound Selector.
Proceedings of the Interspeech 2020, 2020

Self-Distillation for Improving CTC-Transformer-Based ASR Systems.
Proceedings of the Interspeech 2020, 2020

A Dynamic Stream Weight Backprop Kalman Filter for Audiovisual Speaker Tracking.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Beam-TasNet: Time-domain Audio Separation Network Meets Frequency-domain Beamformer.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

DNN-supported Mask-based Convolutional Beamforming for Simultaneous Denoising, Dereverberation, and Source Separation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving Noise Robust Automatic Speech Recognition with Single-Channel Time-Domain Enhancement Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving Speaker Discrimination of Target Speech Extraction With Time-Domain Speakerbeam.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Exploiting Attention-based Sequence-to-Sequence Architectures for Sound Event Localization.
Proceedings of the 28th European Signal Processing Conference, 2020

2019
SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures.
IEEE J. Sel. Top. Signal Process., 2019

Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues.
Proceedings of the Interspeech 2019, 2019

End-to-End SpeakerBeam for Single Channel Target Speech Recognition.
Proceedings of the Interspeech 2019, 2019

A Unified Framework for Neural Speech Separation and Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2019

Compact Network for Speakerbeam Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Speaker Adaptation for Multichannel End-to-End Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming.
IEEE J. Sel. Top. Signal Process., 2017

Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Multichannel End-to-end Speech Recognition.
Proceedings of the 34th International Conference on Machine Learning, 2017

Automatic node selection for Deep Neural Networks using Group Lasso regularization.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Cumulative moving averaged bottleneck speaker vectors for online speaker adaptation of CNN-based acoustic models.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers.
IEICE Trans. Inf. Syst., 2016

Bottleneck linear transformation network adaptation for speaker adaptive training-based hybrid DNN-HMM speech recognizer.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Speaker adaptive training for deep neural networks embedding linear transformation networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Speaker Adaptive Training using Deep Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

