Kazuhito Koishida

Orcid: 0000-0002-3111-5375

According to our database, Kazuhito Koishida authored at least 40 papers between 1994 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio Mixtures.
CoRR, 2024

Learned Image Compression with Text Quality Enhancement.
CoRR, 2024

2023
Automatic Disfluency Detection from Untranscribed Speech.
CoRR, 2023

Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation.
CoRR, 2023

Progressive Knowledge Distillation: Building Ensembles for Efficient Inference.
CoRR, 2023

Progressive Ensemble Distillation: Building Ensembles for Efficient Inference.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Toward A Multimodal Approach for Disfluency Detection and Categorization.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Low-Latency Mono-Channel Speech Enhancement by Compensation Windows in STFT Analysis.
Proceedings of the Complex Networks & Their Applications XII, 2023

2022
SCP-GAN: Self-Correcting Discriminator Optimization for Training Consistency Preserving Metric GAN on Speech Enhancement Tasks.
CoRR, 2022

A Training Framework for Stereo-Aware Speech Enhancement Using Deep Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2022

Training Robust Zero-Shot Voice Conversion Models with Self-Supervised Features.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Augmented Contrastive Self-Supervised Learning for Audio Invariant Representations.
CoRR, 2021

INTERSPEECH 2021 Deep Noise Suppression Challenge.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Single-Channel Speech Enhancement Using Learnable Loss Mixup.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Cascaded Time + Time-Frequency Unet For Speech Enhancement: Jointly Addressing Clipping, Codec Distortions, And Gaps.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Single-Channel Speech Enhancement by Subspace Affinity Minimization.
Proceedings of the Interspeech 2020, 2020

Robust Pitch Regression with Voiced/Unvoiced Classification in Nonstationary Noise Environments.
Proceedings of the Interspeech 2020, 2020

Low-Latency Single Channel Speech Dereverberation Using U-Net Convolutional Neural Networks.
Proceedings of the Interspeech 2020, 2020

Online Directional Speech Enhancement Using Geometrically Constrained Independent Vector Analysis.
Proceedings of the Interspeech 2020, 2020

Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning".
Proceedings of the 37th International Conference on Machine Learning, 2020

Geometrically Constrained Independent Vector Analysis for Directional Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

AV(SE)²: Audio-Visual Squeeze-Excite Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Low-Latency Single Channel Speech Enhancement Using U-Net Convolutional Neural Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

MMTM: Multimodal Transfer Module for CNN Fusion.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Improved Active Speaker Detection based on Optical Flow.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Adversarial Training for Speech Super-Resolution.
IEEE J. Sel. Top. Signal Process., 2019

Sound Event Detection in Multichannel Audio Using Convolutional Time-Frequency-Channel Squeeze and Excitation.
Proceedings of the Interspeech 2019, 2019

Speech Super Resolution Generative Adversarial Network.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

2017
End-to-End Text-Independent Speaker Verification with Triplet Loss on Short Utterances.
Proceedings of the Interspeech 2017, 2017

End-to-end text-independent speaker verification with flexibility in utterance duration.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2008
Hybrid low bitrate audio coding using adaptive gain shape vector quantization.
Proceedings of the International Workshop on Multimedia Signal Processing, 2008

2000
A 1200 bps speech coder based on MELP.
Proceedings of the IEEE International Conference on Acoustics, 2000

A 16-kbit/s bandwidth scalable audio coder based on the G.729 standard.
Proceedings of the IEEE International Conference on Acoustics, 2000

1998
A 16 kbit/s wideband CELP coder using MEL-generalized cepstral analysis and its subjective evaluation.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A wideband CELP speech coder at 16 kbit/s based on mel-generalized cepstral analysis.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
Efficient encoding of mel-generalized cepstrum for CELP coders.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
CELP coding system based on mel-generalized cepstral analysis.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

1995
CELP coding based on mel-cepstral analysis.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Speech coding based on adaptive MEL-cepstral analysis for noisy channels.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

