Kunio Kashino

According to our database, Kunio Kashino authored at least 165 papers between 1993 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Deep attentive time warping.
Pattern Recognit., April, 2023

BYOL for Audio: Exploring Pre-Trained General-Purpose Audio Representations.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement.
CoRR, 2023

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation.
CoRR, 2023

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Contrast enhancement based on reflectance-oriented probabilistic equalization.
Signal Process., 2022

ConceptBeam: Concept Driven Target Speech Extraction.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval.
Proceedings of the Interspeech 2022, 2022

Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
Reflectance-Guided Histogram Equalization and Comparametric Approximation.
IEEE Trans. Circuits Syst. Video Technol., 2021

Contrast enhancement based on discriminative co-occurrence statistics.
Multim. Tools Appl., 2021

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation.
Proceedings of the International Joint Conference on Neural Networks, 2021

Deep Reinforcement Image Matching with Self-Termination.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Attention to Warp: Deep Metric Learning for Multivariate Time Series.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

Reflectance-Oriented Probabilistic Equalization for Image Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2021

Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation.
Proceedings of the HEAR: Holistic Evaluation of Audio Representations, 2021

Unsupervised Heart Sound Decomposition and State Estimation with Generative Oscillation Models.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

2020
Knowledge Discovery from Layered Neural Networks Based on Non-negative Task Matrix Decomposition.
IEICE Trans. Inf. Syst., 2020

The NTT DCASE2020 Challenge Task 6 system: Automated Audio Captioning with Keywords and Sentence Length Estimation.
CoRR, 2020

Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals.
Proceedings of the Interspeech 2020, 2020

Pair Expansion for Learning Multilingual Semantic Embeddings Using Disjoint Visually-Grounded Speech Audio Datasets.
Proceedings of the Interspeech 2020, 2020

Total Whitening for Online Signature Verification Based on Deep Representation.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Unsupervised Co-Segmentation for Athlete Movements and Live Commentaries Using Crossmodal Temporal Proximity.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Translating Adult's Focus of Attention to Elderly's.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Reflectance-Guided, Contrast-Accumulated Histogram Equalization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Trilingual Semantic Embeddings of Visually Grounded Speech with Self-Attention Mechanisms.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Gaussian process with physical laws for 3D cardiac modeling.
Proceedings of the 28th European Signal Processing Conference, 2020

Effects of Word-Frequency Based Pre- and Post- Processings for Audio Captioning.
Proceedings of the 5th Workshop on Detection and Classification of Acoustic Scenes and Events 2020 (DCASE 2020), 2020

Adaptive Spotting: Deep Reinforcement Object Search in 3D Point Clouds.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

Cascaded Transposed Long-Range Convolutions for Monocular Depth Estimation.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

2019
Understanding community structure in layered neural networks.
Neurocomputing, 2019

Development of the Stool Color Card for Early Detection of Biliary Atresia using Multispectral Image.
Proceedings of the 27th Color and Imaging Conference, 2019

Robust Learning for Deep Monocular Depth Estimation.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Deep Dynamic Time Warping: End-to-End Local Representation Learning for Online Signature Verification.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Subspace Structure-Aware Spectral Clustering for Robust Subspace Clustering.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Prewarping Siamese Network: Learning Local Representations for Online Signature Verification.
Proceedings of the IEEE International Conference on Acoustics, 2019

Learning Search Path for Region-level Image Matching.
Proceedings of the IEEE International Conference on Acoustics, 2019

Seeing through Sounds: Predicting Visual Semantic Segmentation Results from Multichannel Audio Signals.
Proceedings of the IEEE International Conference on Acoustics, 2019

Neural Audio Captioning Based on Conditional Sequence-to-Sequence Model.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

Delving Deep into Least Square Regression Model for Subspace Clustering.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

2018
Modular representation of layered neural networks.
Neural Networks, 2018

Label Propagation with Ensemble of Pairwise Geometric Relations: Towards Robust Large-Scale Retrieval of Object Instances.
Int. J. Comput. Vis., 2018

Knowledge Discovery from Layered Neural Networks based on Non-negative Task Decomposition.
CoRR, 2018

Color enhancement factors to control spectral power distribution of illumination.
Proceedings of the SIGGRAPH Asia 2018 Posters, Tokyo, Japan, December 04-07, 2018, 2018

Weighted Generalized Mean Pooling for Deep Image Retrieval.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Query Expansion with Diffusion On Mutual Rank Graphs.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Statistical Phrase/Accent Command Estimation Algorithm Utilizing Linguistic Information.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Generating Sound Words from Audio Signals of Acoustic Events with Sequence-to-Sequence Model.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Acoustic event search with an onomatopoeic query: measuring distance between onomatopoeic words and sounds.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

Generative Adversarial Image Synthesis With Decision Tree Latent Controller.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Localizing the Gaze Target of a Crowd of People.
Proceedings of the Computer Vision - ACCV 2018 Workshops, 2018

2017
Visualizing Video Sounds With Sound Word Animation to Enrich User Experience.
IEEE Trans. Multim., 2017

NTT Communication Science Laboratories and National Institute of Informatics at TRECVID 2017 Instance Search.
Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Modular representation of autoencoder networks.
Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks.
Proceedings of the Interspeech 2017, 2017

Contrast-accumulated histogram equalization for image enhancement.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Edited film alignment via selective Hough transform and accurate template matching.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Fast algorithm for statistical phrase/accent command estimation based on generative model incorporating spectral features.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Deep salience map guided arbitrary direction scene text recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Generative adversarial network-based postfilter for statistical parametric speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Recursive Extraction of Modular Structure from Layered Neural Networks Using Variational Bayes Method.
Proceedings of the Discovery Science - 20th International Conference, 2017

Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Visualizing Lost Designs in Degraded Early Modern Tapestry Using Infra-red Image.
Proceedings of the Computational Color Imaging - 6th International Workshop, 2017

Non-native speech conversion with consistency-aware recursive network and generative adversarial network.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Bayesian Exponential Inverse Document Frequency and Region-of-Interest Effect for Enhancing Instance Search Accuracy.
IEICE Trans. Inf. Syst., 2016

Unsupervised categorical shape reconstruction through manifolds.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Image Transformation of Eye Areas for Synthesizing Eye-contacts in Video Conferencing.
Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016), 2016

Adaptive Visual Feedback Generation for Facial Expression Improvement with Multi-task Deep Neural Networks.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Designing Spectral Power Distribution of Illumination with Color Chart to Enhance Color Saturation.
Proceedings of the 24th Color and Imaging Conference, 2016

Scene text recognition with CNN classifier and WFST-based word labeling.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Scene text recognition with high performance CNN classifier and efficient word inference.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Filter design based on multiple model estimation.
Proceedings of the 2016 American Control Conference, 2016

2015
Second-Order Configuration of Local Features for Geometrically Stable Image Matching and Retrieval.
IEEE Trans. Circuits Syst. Video Technol., 2015

Generative Modeling of Voice Fundamental Frequency Contours.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Interest point selection by topology coherence for multi-query image retrieval.
Multim. Tools Appl., 2015

NTT at TRECVID 2015: Instance Search.
Proceedings of the 2015 TREC Video Retrieval Evaluation, 2015

Visual Attention Driven by Auditory Cues - Selecting Visual Features in Synchronization with Attracting Auditory Events.
Proceedings of the MultiMedia Modeling - 21st International Conference, 2015

Reproduction of Reflective and Fluorescent Components using Eight-band Imaging.
Proceedings of the 23rd Color and Imaging Conference, 2015

Data-driven taxonomy forest for fine-grained image categorization.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Visualizing video sounds with sound word animation.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Adaptive Dither Voting for Robust Spatial Verification.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

A fast audio search method based on skipping irrelevant signals by similarity upper-bound calculation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Monte Carlo filter particle filter.
Proceedings of the 14th European Control Conference, 2015

Gaussian sum resampling filter.
Proceedings of the 54th IEEE Conference on Decision and Control, 2015

Trademark Image Retrieval Using Inverse Total Feature Frequency and Multiple Detectors.
Proceedings of the Computer Analysis of Images and Patterns, 2015

Robust Spatial Matching as Ensemble of Weak Geometric Relations.
Proceedings of the British Machine Vision Conference 2015, 2015

2014
BM25 With Exponential IDF for Instance Search.
IEEE Trans. Multim., 2014

NTT Communication Science Laboratories at TRECVID 2014 Instance Search Task.
Proceedings of the 2014 TREC Video Retrieval Evaluation, 2014

Efficient POC-based Correspondence Detection Method for Multi-channel Images.
Proceedings of the 22nd Color and Imaging Conference, 2014

Image Retrieval Based on Anisotropic Scaling and Shearing Invariant Geometric Coherence.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Spatial People Density Estimation from Multiple Viewpoints by Memory Based Regression.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Video Content Detection with Single Frame Level Accuracy Using Dynamic Thresholding Technique.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Experimental Evaluation of Chromostereopsis with Varying Center Wavelength and FWHM of Spectral Power Distribution.
Proceedings of the Image and Signal Processing - 6th International Conference, 2014

Image retrieval based on spatial context with Relaxed Gabriel Graph pyramid.
Proceedings of the IEEE International Conference on Acoustics, 2014

Mixture of Gaussian process experts for predicting sung melodic contour with expressive dynamic fluctuations.
Proceedings of the IEEE International Conference on Acoustics, 2014

Mondrian hidden Markov model for music signal processing.
Proceedings of the IEEE International Conference on Acoustics, 2014

Iterative unscented statistically linearized filter for nonlinear Gaussian observation models.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

Tri-Map Self-Validation Based on Least Gibbs Energy for Foreground Segmentation.
Proceedings of the British Machine Vision Conference, 2014

Unscented statistical linearization and robustified Kalman filter for nonlinear systems with parameter uncertainties.
Proceedings of the American Control Conference, 2014

2013
Stereo one-shot six-band camera system for accurate color reproduction.
J. Electronic Imaging, 2013

A stereo six-band motion picture capturing using 4K digital cinema camera.
Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2013

Generative modeling of speech F0 contours.
Proceedings of the INTERSPEECH 2013, 2013

An eleven-band stereoscopic camera system for accurate color and spectral reproduction.
Proceedings of the 21st Color and Imaging Conference, 2013

Bayesian semi-supervised audio event transcription based on Markov indian buffet process.
Proceedings of the IEEE International Conference on Acoustics, 2013

Digital Archiving of Tapestries of Kyoto Gion Festival Using a High-Definition and Multispectral Image Capturing System.
Proceedings of the 2013 International Conference on Culture and Computing, 2013

Robustifying Kalman filter to rapidly adapt to significant changes in system model parameters of state-space models.
Proceedings of the 52nd IEEE Conference on Decision and Control, 2013

Normalized unscented Kalman filter and normalized unscented RTS smoother for nonlinear state-space model identification.
Proceedings of the American Control Conference, 2013

2012
NTT Communication Science Laboratories and National Institute of Informatics at TRECVID 2012 Instance Search and Multimedia Event Detection Tasks.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

High-definition and multispectral capturing for digital archiving of large 3D woven cultural artifacts.
Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2012

A stereo nine-band camera for accurate color and spectrum reproduction.
Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2012

A Stochastic Model of Singing Voice F0 Contours for Characterizing Expressive Dynamic Components.
Proceedings of the INTERSPEECH 2012, 2012

A six-band stereoscopic video camera system for accurate color reproduction.
Proceedings of the 20th Color and Imaging Conference, 2012

Bayesian nonparametric music parser.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Constrained and regularized variants of non-negative matrix factorization incorporating music-specific constraints.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

High-Resolution and Multi-spectral Capturing for Digital Archiving of Large 3D Woven Cultural Artifacts.
Proceedings of the Computer Vision - ACCV 2012 Workshops, 2012

2011
Interest Point Detection Based on Stochastically Derived Stability.
IPSJ Trans. Comput. Vis. Appl., 2011

NTT Communication Science Laboratories at TRECVID 2011 Content Based Copy Detection.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

NTT Communication Science Laboratories and NII at TRECVID 2011 Instance Search Task.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Evaluating Color Reproduction Accuracy of Stereo One-shot Six-band Camera System.
Proceedings of the 19th Color and Imaging Conference, 2011

Automatic video annotation via Hierarchical Topic Trajectory Model considering cross-modal correlations.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
A stochastic model of human visual attention with a dynamic Bayesian network.
CoRR, 2010

SEMANTIC INDEXING AND KNOWN ITEM SEARCH BASED ON A UNIFIED MODEL WITH TOPIC TRANSITION REPRESENTATION.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

NTT Communication Science Laboratories at TRECVID 2010 Content Based Copy Detection.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

NTT Communication Science Laboratories and NII in TRECVID 2010 Instance Search Task.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

Fast Template Matching Based on Normalized Cross Correlation Using Adaptive Block Partitioning and Initial Threshold Estimation.
Proceedings of the 12th IEEE International Symposium on Multimedia, 2010

Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases.
Proceedings of the INTERSPEECH 2010, 2010

Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation.
Proceedings of the Latent Variable Analysis and Signal Separation, 2010

2009
Automatic Identification for Singing Style based on Sung Melodic Contour Characterized in Phase Plane.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

Composite Autoregressive System for Sparse Source-filter Representation of speech.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Complex NMF: A new sparse representation for acoustic signals.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
A Quick Search Method for Audio Signals Based on a Piecewise Linear Representation of Feature Trajectories.
IEEE Trans. Speech Audio Process., 2008

A Robust Musical Audio Search Method Based on Diagonal Dynamic Programming Matching of Self-Similarity Matrices.
Proceedings of the ISMIR 2008, 2008

Parameter estimation method of F0 control model for singing voices.
Proceedings of the INTERSPEECH 2008, 2008

Dynamic Markov random fields for stochastic modeling of visual attention.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

A stochastic model of selective visual attention with a dynamic Bayesian network.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

A background music detection method based on robust feature extraction.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
A quick search method for audio signals based on a piecewise linear representation of feature trajectories.
CoRR, 2007

A Computational Model of Saliency Depletion/Recovery Phenomena for the Salient Region Extraction of Videos.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

A Musical Audio Search Method Based on Self-Similarity Features.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Robust Search Methods for Music Signals Based on Simple Representation.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Frequency component restoration for music sounds using local probabilistic models with maximum entropy learning.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

Frequency Component Restoration for Music Sounds using a Markov Random Field and Maximum Entropy Learning.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2004
Acceleration of Similarity-Based Partial Image Retrieval using Multistage Vector Quantization.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

A Fast Template Matching Algorithm with Adaptive Skipping Using Inner-Subtemplates' Distances.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

A Quick Video Search Method based on Local and Global Feature Clustering.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Similarity-based partial image retrieval guaranteeing same accuracy as exhaustive matching.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Bayesian estimation of simultaneous musical notes based on frequency domain modelling.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
A quick search method for audio and video signals based on histogram pruning.
IEEE Trans. Multim., 2003

A quick search method for multimedia signals using global pruning.
Syst. Comput. Jpn., 2003

A fast search algorithm for background music signals based on the search for numerous small signal components.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Dynamic-segmentation-based feature dimension reduction for quick audio/video searching.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A Robust Audio Searching Method for Cellular-Phone-Based Music Information Retrieval.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Fast music retrieval using polyphonic binary feature vectors.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

A quick search method for multimedia signals using feature compression based on piecewise linear maps.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
A method for robust and quick video searching using probabilistic dither-voting.
Proceedings of the 2001 International Conference on Image Processing, 2001

Very quick audio searching: introducing global pruning to the Time-Series Active Search.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Feature Fluctuation Absorption for a Quick Audio Retrieval from Long Recordings.
Proceedings of the 15th International Conference on Pattern Recognition, 2000

1999
A sound source identification system for ensemble music based on template adaptation and music stream extraction.
Speech Commun., 1999

Time-series active search for quick retrieval of audio and video.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Quick audio retrieval using active search.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Music recognition using note transition context.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
A Music Stream Segregation System Based on Adaptive Multi-Agents.
Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, 1997

1996
A music scene analysis system with the MRF-based information integration scheme.
Proceedings of the 13th International Conference on Pattern Recognition, 1996

1995
Organization of Hierarchical Perceptual Sounds: Music Scene Analysis with Autonomous Processing Modules and a Quantitative Information Integration Mechanism.
Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995

1993
A Sound Source Separation System with the Ability of Automatic Tone Modeling.
Proceedings of the Opening a New Horizon: Proceedings of the 1993 International Computer Music Conference, 1993

