Mana Ihori

Difference Vector Equalization for Robust Fine-tuning of Vision-Language Models.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

SOMSRED-SVC: Sequential Output Modeling with Speaker Vector Constraints for Joint Multi-Talker Overlapped ASR and Speaker Diarization.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Unified Audio-Visual Modeling for Recognizing Which Face Spoke When and What in Multi-Talker Overlapped Speech and Video.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Phoneme Overlapping-Aware Pre-Training with External Text Resources for Multi-Talker ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Few-shot Personalization via In-Context Learning for Speech Emotion Recognition based on Speech-Language Model.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Semi-Supervised End-to-End Speech-to-Text Translation with Joint Text-to-Text and Speech-to-Text Decoding.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Joint Modeling of Big Five and HEXACO for Multimodal Apparent Personality-Trait Recognition.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Multimodal Fine-Grained Apparent Personality Trait Recognition: Joint Modeling of Big Five and Questionnaire Item-level Scores.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Unified Multi-Talker ASR with and without Target-speaker Enrollment.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

SOMSRED: Sequential Output Modeling for Joint Multi-talker Overlapped Speech Recognition and Speaker Diarization.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Talking Face Generation for Impression Conversion Considering Speech Semantics.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Block Refinement Learning for Improving Early Exit in Autoregressive ASR.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

End-to-End Joint Target and Non-Target Speakers ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Audio-Visual Praise Estimation for Conversational Video based on Synchronization-Guided Multimodal Transformer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Retrieval, Masking, and Generation: Feedback Comment Generation using Masked Comment Examples.

Proceedings of the 16th International Natural Language Generation Conference, 2023

Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Text-to-Text Pre-Training with Paraphrasing for Improving Transformer-Based Image Captioning.

Proceedings of the 31st European Signal Processing Conference, 2023

Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multi-Perspective Document Revision.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Enrollment-Less Training for Personalized Voice Activity Detection.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Hierarchical Transformer-Based Large-Context End-To-End ASR with Large-Context Knowledge Distillation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-Training.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Parallel Corpus for Japanese Spoken-to-Written Style Conversion.

Mana Ihori, Akihiko Takashima, Ryo Masumura

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Unsupervised Domain Adaptation for Dialogue Sequence Labeling Based on Hierarchical Adversarial Training.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.

Proceedings of the 13th International Conference on Natural Language Generation, 2020

Sequence-Level Consistency Training for Semi-Supervised End-to-End Automatic Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Large-Context Pointer-Generator Networks for Spoken-to-Written Style Conversion.

Mana Ihori, Akihiko Takashima, Ryo Masumura

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Unsupervised Domain Adversarial Training in Angular Space for Facial Expression Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

End-to-End Automatic Speech Recognition with Deep Mutual Learning.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Generalized Large-Context Language Models Based on Forward-Backward Hierarchical Recurrent Encoder-Decoder Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improving Speech-Based End-of-Turn Detection Via Cross-Modal Representation Learning with Punctuated Text Data.

,

,

,

,

,

,

Ryuichiro Higashinaka

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
