Yasunori Ohishi

Orcid: 0000-0002-7856-248X

According to our database, Yasunori Ohishi authored at least 42 papers between 2005 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
BYOL for Audio: Exploring Pre-Trained General-Purpose Audio Representations.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

SoundBeam: Target Sound Extraction Conditioned on Sound-Class Labels and Enrollment Clues for Increased Performance and Continuous Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement.
CoRR, 2023

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation.
CoRR, 2023

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input.
Proceedings of the IEEE International Conference on Acoustics, 2023

First-Shot Anomaly Sound Detection for Machine Condition Monitoring: A Domain Generalization Baseline.
Proceedings of the 31st European Signal Processing Conference, 2023

Joint Analysis of Acoustic Scenes and Sound Events Based on Semi-Supervised Approach.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Multi-view and Multi-modal Event Detection Utilizing Transformer-based Multi-sensor fusion.
CoRR, 2022

ConceptBeam: Concept Driven Target Speech Extraction.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval.
Proceedings of the Interspeech 2022, 2022

Multi-View And Multi-Modal Event Detection Utilizing Transformer-Based Multi-Sensor Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2022

Echo-Aware Adaptation of Sound Event Localization and Detection in Unknown Environments.
Proceedings of the IEEE International Conference on Acoustics, 2022

Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation.
Proceedings of the International Joint Conference on Neural Networks, 2021

Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation.
Proceedings of the HEAR: Holistic Evaluation of Audio Representations, 2021

ToyADMOS2: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

2020
Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval.
CoRR, 2020

The NTT DCASE2020 Challenge Task 6 system: Automated Audio Captioning with Keywords and Sentence Length Estimation.
CoRR, 2020

Crossmodal Sound Retrieval Based on Specific Target Co-Occurrence Denoted with Weak Labels.
Proceedings of the Interspeech 2020, 2020

Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals.
Proceedings of the Interspeech 2020, 2020

Pair Expansion for Learning Multilingual Semantic Embeddings Using Disjoint Visually-Grounded Speech Audio Datasets.
Proceedings of the Interspeech 2020, 2020

Unsupervised Co-Segmentation for Athlete Movements and Live Commentaries Using Crossmodal Temporal Proximity.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Trilingual Semantic Embeddings of Visually Grounded Speech with Self-Attention Mechanisms.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effects of Word-Frequency Based Pre- and Post- Processings for Audio Captioning.
Proceedings of the 5th Workshop on Detection and Classification of Acoustic Scenes and Events 2020 (DCASE 2020), 2020

2019
Crossmodal Voice Conversion.
CoRR, 2019

2015
Generative Modeling of Voice Fundamental Frequency Contours.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

2014
Mixture of Gaussian process experts for predicting sung melodic contour with expressive dynamic fluctuations.
Proceedings of the IEEE International Conference on Acoustics, 2014

Mondrian hidden Markov model for music signal processing.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Acoustic scene analysis based on latent acoustic topic and event allocation.
Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2013

Generative modeling of speech F0 contours.
Proceedings of the INTERSPEECH 2013, 2013

Bayesian semi-supervised audio event transcription based on Markov Indian buffet process.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components.
Proceedings of the INTERSPEECH 2012, 2012

Bayesian nonparametric music parser.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Automatic audio tag classification via semi-supervised canonical density estimation.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases.
Proceedings of the INTERSPEECH 2010, 2010

A statistical model of speech F0 contours.
Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2010

2009
Automatic Identification for Singing Style based on Sung Melodic Contour Characterized in Phase Plane.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

2008
Building and combining document and music spaces for music query-by-webpage system.
Proceedings of the INTERSPEECH 2008, 2008

Parameter estimation method of F0 control model for singing voices.
Proceedings of the INTERSPEECH 2008, 2008

2007
A Stochastic Representation of the Dynamics of Sung Melody.
Proceedings of the 8th International Conference on Music Information Retrieval, 2007

2006
Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

2005
Discrimination between singing and speaking voices.
Proceedings of the INTERSPEECH 2005, 2005

