Yuki Mitsufuji

Orcid: 0000-0002-6806-6140

According to our database, Yuki Mitsufuji authored at least 85 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DiffuCOMET: Contextual Commonsense Knowledge Diffusion.
CoRR, 2024

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models.
CoRR, 2024

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes.
CoRR, 2024

2023
Manifold Preserving Guided Diffusion.
CoRR, 2023

On the Language Encoder of Contrastive Cross-modal Models.
CoRR, 2023

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion.
CoRR, 2023

Towards reporting bias in visual-language datasets: bimodal augmentation by decoupling object-attribute association.
CoRR, 2023

Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription.
CoRR, 2023

Zero- and Few-shot Sound Event Localization and Detection.
CoRR, 2023

VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance.
CoRR, 2023

BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network.
CoRR, 2023

Enhancing Semantic Communication with Deep Generative Models - An ICASSP Special Session Overview.
CoRR, 2023

The Sound Demixing Challenge 2023 - Cinematic Demixing Track.
CoRR, 2023

The Sound Demixing Challenge 2023 - Music Demixing Track.
CoRR, 2023

On the Equivalence of Consistency-Type Models: Consistency Models, Consistent Diffusion Models, and Fokker-Planck Regularization.
CoRR, 2023

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders.
CoRR, 2023

The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation.
CoRR, 2023

Diffusion-based Signal Refiner for Speech Separation.
CoRR, 2023

Cross-modal Face- and Voice-style Transfer.
CoRR, 2023

Adversarially Slicing Generative Networks: Discriminator Slices Feature for One-Dimensional Optimal Transport.
CoRR, 2023

Extending Audio Masked Autoencoders toward Audio Restoration.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Automatic Piano Transcription With Hierarchical Frequency-Time Transformer.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration.
Proceedings of the International Conference on Machine Learning, 2023

FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation.
Proceedings of the International Conference on Machine Learning, 2023

CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification.
Proceedings of the IEEE International Conference on Acoustics, 2023

Hierarchical Diffusion Models for Singing Voice Neural Vocoder.
Proceedings of the IEEE International Conference on Acoustics, 2023

Unsupervised Vocal Dereverberation with Diffusion-Based Generative Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects.
Proceedings of the IEEE International Conference on Acoustics, 2023

DiffRoll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability.
Proceedings of the IEEE International Conference on Acoustics, 2023

PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Preventing oversmoothing in VAE via generalized variance parameterization.
Neurocomputing, 2022

A Versatile Diffusion-based Generative Refiner for Speech Enhancement.
CoRR, 2022

Robust One-Shot Singing Voice Conversion.
CoRR, 2022

Regularizing Score-based Models with Score Fokker-Planck Equations.
CoRR, 2022

Removing Distortion Effects in Music Using Deep Neural Networks.
CoRR, 2022

Automatic music mixing with deep learning and out-of-domain data.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Distortion Audio Effects: Learning How to Recover the Clean Signal.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization.
Proceedings of the International Conference on Machine Learning, 2022

Amicable Examples for Informed Source Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training.
Proceedings of the IEEE International Conference on Acoustics, 2022

Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection.
Proceedings of the IEEE International Conference on Acoustics, 2022

Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection.
Proceedings of the IEEE International Conference on Acoustics, 2022

Music Source Separation With Deep Equilibrium Models.
Proceedings of the IEEE International Conference on Acoustics, 2022

Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks.
Proceedings of the IEEE International Conference on Acoustics, 2022

ComFact: A Benchmark for Linking Contextual Commonsense Knowledge.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
Multichannel Blind Source Separation Based on Evanescent-Region-Aware Non-Negative Tensor Factorization in Spherical Harmonic Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Source Mixing and Separation Robust Audio Steganography.
CoRR, 2021

Music Demixing Challenge at ISMIR 2021.
CoRR, 2021

Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection.
CoRR, 2021

Training Speech Enhancement Systems with Noisy Speech Datasets.
CoRR, 2021

Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE.
CoRR, 2021

Hierarchical disentangled representation learning for singing voice conversion.
Proceedings of the International Joint Conference on Neural Networks, 2021

Adversarial Attacks on Audio Source Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection.
Proceedings of the IEEE International Conference on Acoustics, 2021

All For One And One For All: Improving Music Separation By Bridging Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Psychophysiological Effect of Immersive Spatial Audio Experience Enhanced Using Sound Field Synthesis.
Proceedings of the 9th International Conference on Affective Computing and Intelligent Interaction, 2021

2020
Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Spherical-Harmonic-Domain Feedforward Active Noise Control Using Sparse Decomposition of Reference Signals from Distributed Sensor Arrays.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Densely connected multidilated convolutional networks for dense prediction tasks.
CoRR, 2020

D3Net: Densely connected multidilated DenseNet for music source separation.
CoRR, 2020

Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3Net.
CoRR, 2020

Improving Voice Separation by Incorporating End-To-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Array-Geometry-Aware Spatial Active Noise Control Based on Direction-of-Arrival Weighting.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Open-Unmix - A Reference Implementation for Music Source Separation.
J. Open Source Softw., 2019

Closing the Training/Inference Gap for Deep Attractor Networks.
CoRR, 2019

Recursive Speech Separation for Unknown Number of Speakers.
Proceedings of the Interspeech 2019, 2019

Global and Local Mode-domain Adaptive Algorithms for Spatial Active Noise Control Using Higher-order Sources.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Improving DNN-based Music Source Separation using Phase Features.
CoRR, 2018

MMDenseLSTM: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Mode-Domain Spatial Active Noise Control Using Multiple Circular Arrays.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

PhaseNet: Discretized Phase Modeling with Deep Neural Networks for Audio Source Separation.
Proceedings of the Interspeech 2018, 2018

Mode Domain Spatial Active Noise Control Using Sparse Signal Representation.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Multi-scale multi-band DenseNets for audio source separation.
Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

Improving music source separation based on deep neural networks through data augmentation and network blending.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Supervised monaural source separation based on autoencoders.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Multichannel blind source separation based on non-negative tensor factorization in wavenumber domain.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Deep neural network based instrument extraction from music.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

NMF-based blind source separation using a linear predictive coding error clustering criterion.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
On the use of a spatial cue as prior information for stereo sound source separation based on spatially weighted non-negative tensor factorization.
EURASIP J. Adv. Signal Process., 2014

Online Non-negative Tensor Deconvolution for source detection in 3DTV audio.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Sound source separation based on non-negative tensor factorization incorporating spatial cue as prior knowledge.
Proceedings of the IEEE International Conference on Acoustics, 2013

