Shota Orihashi

ORCID: 0009-0005-5998-6278

According to our database, Shota Orihashi authored at least 43 papers between 2015 and 2026.

Collaborative distances:

Dijkstra number of five.
Erdős number of five.

Bibliography

2026

Distribution Highlighted Reference-based Label Distribution Learning for Facial Age Estimation.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2026

Difference Vector Equalization for Robust Fine-tuning of Vision-Language Models.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

MSMVD: Exploiting Multi-scale Image Features via Multi-scale BEV Features for Multi-view Pedestrian Detection.

CoRR, August, 2025

SOMSRED-SVC: Sequential Output Modeling with Speaker Vector Constraints for Joint Multi-Talker Overlapped ASR and Speaker Diarization.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Unified Audio-Visual Modeling for Recognizing Which Face Spoke When and What in Multi-Talker Overlapped Speech and Video.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

MVTrajecter: Multi-View Pedestrian Tracking With Trajectory Motion Cost and Trajectory Appearance Cost.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Phoneme Overlapping-Aware Pre-Training with External Text Resources for Multi-Talker ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Few-shot Personalization via In-Context Learning for Speech Emotion Recognition based on Speech-Language Model.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Semi-Supervised End-to-End Speech-to-Text Translation with Joint Text-to-Text and Speech-to-Text Decoding.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Joint Modeling of Big Five and HEXACO for Multimodal Apparent Personality-Trait Recognition.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Multimodal Fine-Grained Apparent Personality Trait Recognition: Joint Modeling of Big Five and Questionnaire Item-level Scores.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Unified Multi-Talker ASR with and without Target-speaker Enrollment.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

SOMSRED: Sequential Output Modeling for Joint Multi-talker Overlapped Speech Recognition and Speaker Diarization.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Born-Again Multi-task Self-training for Multi-task Facial Emotion Recognition.

Proceedings of the Pattern Recognition - 27th International Conference, 2024

Block Refinement Learning for Improving Early Exit in Autoregressive ASR.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Open-Set Recognition for Facial-Expression Recognition.

Proceedings of the IEEE International Conference on Image Processing, 2023

Distilling Knowledge of Bidirectional Language Model for Scene Text Recognition.

Proceedings of the IEEE International Conference on Image Processing, 2023

Text-to-Text Pre-Training with Paraphrasing for Improving Transformer-Based Image Captioning.

Proceedings of the 31st European Signal Processing Conference, 2023

2022

Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations.

CoRR, 2022

Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Fully Shareable Scene Text Recognition Modeling for Horizontal and Vertical Writing.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

2021

GAN-Based Image Compression Using Mutual Information for Optimizing Subjective Image Similarity.

IEICE Trans. Inf. Syst., 2021

Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Enrollment-Less Training for Personalized Voice Activity Detection.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Hierarchical Transformer-Based Large-Context End-To-End ASR with Large-Context Knowledge Distillation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-Training.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Unsupervised Domain Adaptation for Dialogue Sequence Labeling Based on Hierarchical Adversarial Training.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.

Proceedings of the 13th International Conference on Natural Language Generation, 2020

Unsupervised Domain Adversarial Training in Angular Space for Facial Expression Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Subjective Quality Driven Image Encoding Method Using Image Completion.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

GAN-based Image Compression Using Mutual Information Maximizing Regularization.

Proceedings of the Picture Coding Symposium, 2019

2016

Improvement of H.265/HEVC encoding for 8K UHDTV by detecting motion complexity.

Proceedings of the IEEE International Conference on Consumer Electronics, 2016

2015

An Adaptive H.265/HEVC Encoding Control for 8K UHDTV Movies Based on Motion Complexity Estimation.

Proceedings of the 2015 IEEE International Symposium on Multimedia, 2015

Improvement of 8K UHDTV picture quality for H.265/HEVC by global zoom estimation.

Proceedings of the IEEE International Conference on Consumer Electronics, 2015
