Yuki Saito

ORCID: 0000-0003-0492-414X

According to our database, Yuki Saito authored at least 98 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
On permutation-invariant neural networks.
CoRR, 2024

UTDUSS: UTokyo-SaruLab System for Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge.
CoRR, 2024

Building speech corpus with diverse voice characteristics for its prompt-based representation.
CoRR, 2024

Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech.
CoRR, 2024

JVNV: A Corpus of Japanese Emotional Speech With Verbal Content and Nonverbal Expressions.
IEEE Access, 2024

An Analysis of Knowledge Representation for Anime Recommendation Using Graph Neural Networks.
Proceedings of the 16th International Conference on Agents and Artificial Intelligence, 2024

2023
Multicore fiber interconnects for multi-terabit spine-leaf datacenter network topologies.
J. Opt. Commun. Netw., July, 2023

GUI System to Support Cardiology Examination Based on Explainable Regression CNN for Estimating Pulmonary Artery Wedge Pressure.
IEICE Trans. Inf. Syst., March, 2023

Evaluation of Lower-Limb Kinematics during Timed Up and Go (TUG) Test in Subjects with Locomotive Syndrome (LS) Using Wearable Gait Sensors (H-Gait System).
Sensors, January, 2023

Fashion intelligence system: An outfit interpretation utilizing images and rich abstract tags.
Expert Syst. Appl., 2023

Outfit Completion via Conditional Set Transformation.
CoRR, 2023

StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models.
CoRR, 2023

HumanDiffusion: diffusion model using perceptual gradients.
CoRR, 2023

Virtual Human Generative Model: Masked Modeling Approach for Learning Human Characteristics.
CoRR, 2023

ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings.
CoRR, 2023

CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center.
CoRR, 2023

Monocular Depth Estimation for Tilted Images via Gravity Rectifier.
Proceedings of the 18th International Joint Conference on Computer Vision, 2023

Verification of Anode Position and Generated Force Vector of EHD at Wire-cylinder Electrode.
Proceedings of the 32nd IEEE International Symposium on Industrial Electronics, 2023

Machine Learning-Based Performance Improvement of Bilateral Teleoperation with Hydraulic Actuator.
Proceedings of the IEEE International Conference on Mechatronics, 2023

Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

MID-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

SHIFT15M: Fashion-specific dataset for set-to-set matching with several distribution shifts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech.
CoRR, 2022

Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS.
Proceedings of the Interspeech 2022, 2022

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent.
Proceedings of the Interspeech 2022, 2022

Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History.
Proceedings of the Interspeech 2022, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.
Proceedings of the Interspeech 2022, 2022

Validation of a Property Estimation Method Based on Sequential and Posteriori Estimation.
Proceedings of the IECON 2022, 2022

2021
Perceptual-Similarity-Aware Deep Speaker Representation Learning for Multi-Speaker Generative Modeling.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Analysis of 3-D Kinematics Using H-Gait System during Walking on a Lower Body Positive Pressure Treadmill.
Sensors, 2021

Real-Time Full-Band Voice Conversion with Sub-Band Modeling and Data-Driven Phase Estimation of Spectral Differentials.
IEICE Trans. Inf. Syst., 2021

DNN-Based Low-Musical-Noise Single-Channel Speech Enhancement Based on Higher-Order-Moments Matching.
IEICE Trans. Inf. Syst., 2021

SHIFT15M: Multiobjective Large-Scale Fashion Dataset with Distributional Shifts.
CoRR, 2021

Camera Selection for Occlusion-Less Surgery Recording via Training With an Egocentric Camera.
IEEE Access, 2021

Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

HumanACGAN: Conditional Generative Adversarial Network with Human-Based Auxiliary Classifier and its Evaluation in Phoneme Perception.
Proceedings of the IEEE International Conference on Acoustics, 2021

Emotion-Controllable Speech Synthesis Using Emotion Soft Labels and Fine-Grained Prosody Factors.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Performance Improvement of Bilateral Teleoperation with Hydraulic Actuator by Friction Compensation.
Proceedings of the 17th IEEE International Conference on Advanced Motion Control, 2021

Motion Generation Based on Physical Property Estimation in Motion Copy System.
Proceedings of the 17th IEEE International Conference on Advanced Motion Control, 2021

Experimental Verification of a Novel Continuously Variable Transmission with Electro-Hydrostatic Actuator.
Proceedings of the 17th IEEE International Conference on Advanced Motion Control, 2021

2020
Phase reconstruction from amplitude spectrograms based on directional-statistics deep neural networks.
Signal Process., 2020

Development of Scanning Line Tool Path Generation Algorithm Using Boundary Position Information of Approximate Polyhedron of Complex Molds.
Int. J. Autom. Technol., 2020

Joint Adversarial Training of Speech Recognition and Synthesis Models for Many-to-One Voice Conversion Using Phonetic Posteriorgrams.
IEICE Trans. Inf. Syst., 2020

Generative Moment Matching Network-Based Neural Double-Tracking for Synthesized and Natural Singing Voices.
IEICE Trans. Inf. Syst., 2020

DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

SMASH Corpus: A Spontaneous Speech Corpus Recording Third-person Audio Commentaries on Gameplay.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.
Proceedings of the Interspeech 2020, 2020

Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space.
Proceedings of the Interspeech 2020, 2020

Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU.
Proceedings of the Interspeech 2020, 2020

Face2Speech: Towards Multi-Speaker Text-to-Speech Synthesis Using an Embedding Vector Predicted from a Face Image.
Proceedings of the Interspeech 2020, 2020

Lifter Training and Sub-Band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

HumanGAN: Generative Adversarial Network With Human-Based Discriminator And Its Evaluation In Speech Perception Modeling.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

In-Plane Rotation-Aware Monocular Depth Estimation Using SLAM.
Proceedings of the Frontiers of Computer Vision - 26th International Workshop, 2020

Exchangeable Deep Neural Networks for Set-to-Set Matching and Learning.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
Simulation of Reflectance and Vegetation Indices for Unmanned Aerial Vehicle (UAV) Monitoring of Paddy Fields.
Remote. Sens., 2019

Vocoder-free text-to-speech synthesis incorporating generative adversarial networks using low-/multi-frequency STFT amplitude spectra.
Comput. Speech Lang., 2019

Deep Set-to-Set Matching and Learning.
CoRR, 2019

JVS corpus: free Japanese multi-speaker voice corpus.
CoRR, 2019

V2S attack: building DNN-based voice conversion from automatic speaker verification.
CoRR, 2019

DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis.
CoRR, 2019

Bandwidth Expansion of Bilateral Teleoperation Based on Synergy of Observer Gain and Velocity Feedback Gain.
Proceedings of the IECON 2019, 2019

A Controller Design Method of Bilateral Teleoperation for Velocity Control Driver.
Proceedings of the IECON 2019, 2019

Symmetric Operational Force Compensator for Bilateral Teleoperation Under Time Delay Based on Power Flow Direction.
Proceedings of the IEEE International Conference on Mechatronics, 2019

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Artificial Replacement of Human Sensation Using Haptic Transplant Technology.
IEEE Trans. Ind. Electron., 2018

Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

2.5D Faster R-CNN for Distance Estimation.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2018

Effects of EEG Electrode Positional Deviations for Classification Accuracy on Different Days.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2018

Physical-contact 256-core MPO Connector with Flat Polished Multi-core Fibers.
Proceedings of the Optical Fiber Communications Conference and Exposition, 2018

Phase Reconstruction from Amplitude Spectrograms Based on Von-Mises-Distribution Deep Neural Network.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Text-to-Speech Synthesis Using STFT Spectra Based on Low-/Multi-Resolution Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Non-Parallel Voice Conversion Using Variational Autoencoders Conditioned by Phonetic Posteriorgrams and D-Vectors.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Accurate Passive Rotational Alignment of Multi-Core Fibre with Double-D-Shape Cladding on V Groove.
Proceedings of the European Conference on Optical Communication, 2018

Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Voice Conversion Using Input-to-Output Highway Networks.
IEICE Trans. Inf. Syst., 2017

Online compensation of gravity and friction for haptics with incremental position sensors.
Proceedings of the 24th International Conference on Mechatronics and Machine Vision in Practice, 2017

Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities.
Proceedings of the Interspeech 2017, 2017

Motion-reproduction system adaptable to position fluctuation of picking objects based on image information.
Proceedings of the IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society, Beijing, China, October 29, 2017

Wearable finger exoskeleton using flexible actuator for rehabilitation.
Proceedings of the IEEE International Conference on Mechatronics, 2017

Training algorithm to deceive Anti-Spoofing Verification for DNN-based speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2015
Conversion of Speaker's Face Image Using PCA and Animation Unit for Video Chatting.
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

2014
Statistically significant subgraphs for genome-wide association study.
Proceedings of the 1st ECML/PKDD Workshop on Statistically Sound Data Mining, 2014

Extraction and realization of human actions.
Proceedings of the IEEE 13th International Workshop on Advanced Motion Control, 2014

2013
Recognition of Grasping Motion Based on Modal Space Haptic Information Using DP Pattern-Matching Algorithm.
IEEE Trans. Ind. Informatics, 2013

Development of an atomic force microscope for measuring mechanical properties of cell population.
Proceedings of the International Symposium on Micro-NanoMechatronics and Human Science, 2013

Acceleration-based position and force control for twist drive.
Proceedings of the IEEE International Conference on Mechatronics, 2013

Variable tension control for master-slave tendon-driven robot hand.
Proceedings of the IEEE International Conference on Mechatronics, 2013

Detection and Tracking Protein Molecules in Fluorescence Microscopic Video.
Proceedings of the First International Symposium on Computing and Networking, 2013

Stability analysis of time-delay systems based on a power of the monodromy operator.
Proceedings of the 12th European Control Conference, 2013

Widely linear LQCMV beamformer and augmented dual-domain adaptive algorithm.
Proceedings of the 9th International Conference on Information, 2013

2012
Reduction of Patient Dose in Digital Mammography: Simulation of Low-Dose Image from a Routine Dose.
Proceedings of the Breast Imaging, 2012

Model-based compensation of wire elongation for tendon-driven rotary actuator.
Proceedings of the 12th IEEE International Workshop on Advanced Motion Control, 2012

2010
Robust interference management to satisfy allowable outage probability using minority game.
Proceedings of the IEEE 21st International Symposium on Personal, 2010

2009
Empirical Mode Decomposition Method for MEG Phantom Data Analysis.
J. Circuits Syst. Comput., 2009

Adaptive rhythmic component extraction with regularization for EEG data analysis.
Proceedings of the IEEE International Conference on Acoustics, 2009

Joint Dynamics of Spectrum Allocation and User Behavior in Spectrum Markets.
Proceedings of the Global Communications Conference, 2009. GLOBECOM 2009, Honolulu, Hawaii, USA, 30 November, 2009

2008
Rhythmic component extraction for multi-channel EEG data analysis.
Proceedings of the IEEE International Conference on Acoustics, 2008

