Prasanta Kumar Ghosh

ORCID: 0000-0002-2925-1802

Affiliations:
  • Indian Institute of Science, Department of Electrical Engineering, Bangalore, India


According to our database, Prasanta Kumar Ghosh authored at least 182 papers between 2006 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Model Adaptation for ASR in low-resource Indian Languages.
CoRR, 2023


An End-to-End TTS Model in Chhattisgarhi, a Low-Resource Indian Language.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Study of Indian English Pronunciation Variabilities Relative to Received Pronunciation.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Curriculum Learning Based Approach for Faster Convergence of TTS Model.
Proceedings of the Speech and Computer - 25th International Conference, 2023

SPIRE-SIES: A Spontaneous Indian English Speech Corpus.
Proceedings of the 26th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2023

Improved Acoustic-to-Articulatory Inversion Using Representations from Pretrained Self-Supervised Learning Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

Real-Time MRI Video Synthesis from Time Aligned Phonemes with Sequence-to-Sequence Networks.
Proceedings of the IEEE International Conference on Acoustics, 2023

Lightweight, Multi-Speaker, Multi-Lingual Indic Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Static and Dynamic Source and Filter Cues for Classification of Amyotrophic Lateral Sclerosis Patients and Healthy Subjects.
Proceedings of the IEEE International Conference on Acoustics, 2023

Exploring the Role of Fricatives in Classifying Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis and Parkinson's Disease.
Proceedings of the IEEE International Conference on Acoustics, 2023

Gated Multi Encoders and Multitask Objectives for Dialectal Speech Recognition in Indian Languages.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Automatic syllable stress detection under non-parallel label and data condition.
Speech Commun., 2022

A deteriorating food preservation supply chain model with downstream delayed payment and upstream partial prepayment.
RAIRO Oper. Res., 2022

An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations.
CoRR, 2022

Study of Indian English Pronunciation Variabilities relative to Received Pronunciation.
CoRR, 2022

Voistutor 2.0: A Speech Corpus with Phonetic Transcription for Pronunciation Evaluation of Indian L2 English Learners.
Proceedings of the 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2022

Whisper to Neutral Mapping Using I-Vector Space Likelihood and a Cosine Similarity Based Iterative Optimization for Whispered Speaker Verification.
Proceedings of the 27th National Conference on Communications, 2022

Streaming model for Acoustic to Articulatory Inversion with transformer networks.
Proceedings of the Interspeech 2022, 2022

Watch Me Speak: 2D Visualization of Human Mouth during Speech.
Proceedings of the Interspeech 2022, 2022

Air tissue boundary segmentation using regional loss in real-time Magnetic Resonance Imaging video for speech production.
Proceedings of the Interspeech 2022, 2022

Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi.
Proceedings of the Interspeech 2022, 2022

SegNet-Based Deep Representation Learning for Dysphagia Classification.
Proceedings of the IEEE International Conference on Acoustics, 2022

An Error Correction Scheme for Improved Air-Tissue Boundary in Real-Time MRI Video for Speech Production.
Proceedings of the IEEE International Conference on Acoustics, 2022

Dual Attention Pooling Network for Recording Device Classification Using Neutral and Whispered Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

The impact of cross language on acoustic-to-articulatory inversion and its influence on articulatory speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
A Robust Speaking Rate Estimator Using a CNN-BLSTM Network.
Circuits Syst. Signal Process., 2021

A deep neural network based correction scheme for improved air-tissue boundary prediction in real-time magnetic resonance imaging video.
Comput. Speech Lang., 2021

Multi-modal Point-of-Care Diagnostics for COVID-19 Based On Acoustics and Symptoms.
CoRR, 2021

Multilingual and code-switching ASR challenges for low resource Indian languages.
CoRR, 2021

wSPIRE: A Parallel Multi-Device Corpus in Neutral and Whispered Speech.
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

A Study on Native American English Speech Recognition by Indian Listeners with Varying Word Familiarity Level.
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

SPIRE VCV: An Acoustic-Articulatory Corpus with Three Different Speaking Rates.
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

Convolutional Dense Neural Network Based Spirometry Variable FVC Prediction Using Sustained Phonations.
Proceedings of the 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021

Noise Robust Pitch Stylization Using Minimum Mean Absolute Error Criterion.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Web Interface for Estimating Articulatory Movements in Speech Production from Acoustics and Text.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Estimating Articulatory Movements in Speech Production with Transformer Networks.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Comparative Study of Different EMG Features for Acoustics-to-EMG Mapping.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

DiCOVA Challenge: Dataset, Task, and Baseline System for COVID-19 Diagnosis Using Acoustics.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

MUCS 2021: Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Source and Vocal Tract Cues for Speech-Based Classification of Patients with Parkinson's Disease and Healthy Subjects.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Impact of Speaking Rate on the Source Filter Interaction in Speech: A Study.
Proceedings of the IEEE International Conference on Acoustics, 2021

Acoustic-to-Articulatory Inversion for Dysarthric Speech by Using Cross-Corpus Acoustic-Articulatory Data.
Proceedings of the IEEE International Conference on Acoustics, 2021

Effect of Noise and Model Complexity on Detection of Amyotrophic Lateral Sclerosis and Parkinson's Disease Using Pitch and MFCC.
Proceedings of the IEEE International Conference on Acoustics, 2021

Role of breath phase and breath boundaries for the classification between asthmatic and healthy subjects.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Noise Robust Detection of Fundamental Heart Sound using Parametric Mixture Gaussian and Dynamic Programming.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Unsegmented Heart Sound Classification Using Hybrid CNN-LSTM Neural Networks.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

2020
SFNet: A Computationally Efficient Source Filter Model Based Neural Speech Synthesis.
IEEE Signal Process. Lett., 2020

The impact of speaking rate on acoustic-to-articulatory inversion.
Comput. Speech Lang., 2020

Speech task based automatic classification of ALS and Parkinson's Disease and their severity using log Mel spectrograms.
Proceedings of the International Conference on Signal Processing and Communications, 2020

Speech rate estimation using representations learned from speech with convolutional neural network.
Proceedings of the International Conference on Signal Processing and Communications, 2020

Attention and Encoder-Decoder Based Models for Transforming Articulatory Movements at Different Speaking Rates.
Proceedings of the Interspeech 2020, 2020

Coswara - A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis.
Proceedings of the Interspeech 2020, 2020

An Investigation of the Virtual Lip Trajectories During the Production of Bilabial Stops and Nasal at Different Speaking Rates.
Proceedings of the Interspeech 2020, 2020

Whisper Activity Detection Using CNN-LSTM Based Attention Pooling Network Trained for a Speaker Identification Task.
Proceedings of the Interspeech 2020, 2020

Speech Rate Task-Specific Representation Learning from Acoustic-Articulatory Data.
Proceedings of the Interspeech 2020, 2020

Air-Tissue Boundary Segmentation in Real Time Magnetic Resonance Imaging Video Using 3-D Convolutional Neural Network.
Proceedings of the Interspeech 2020, 2020

Raw Speech Waveform Based Classification of Patients with ALS, Parkinson's Disease and Healthy Controls Using CNN-BLSTM.
Proceedings of the Interspeech 2020, 2020

Speaker Conditioned Acoustic-to-Articulatory Inversion Using x-Vectors.
Proceedings of the Interspeech 2020, 2020

Automatic Glottis Detection and Segmentation in Stroboscopic Videos Using Convolutional Networks.
Proceedings of the Interspeech 2020, 2020

Analysis of Acoustic Features for Speech Sound Based Classification of Asthmatic and Healthy Subjects.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Automatic Identification of Speakers From Head Gestures in a Narration.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Automatic Classification of Volumes of Water Using Swallow Sounds from Cervical Auscultation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Comparative Study of Estimating Articulatory Movements from Phoneme Sequences and Acoustic Features.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Pseudo Likelihood Correction Technique for Low Resource Accented ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Voice based classification of patients with Amyotrophic Lateral Sclerosis, Parkinson's Disease and Healthy Controls with CNN-LSTM using transfer learning.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Glottal Inverse Filtering Using Probabilistic Weighted Linear Prediction.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Dirichlet Latent Variable Model: A Dynamic Model Based on Dirichlet Prior for Audio Processing.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

P- and T-wave delineation in ECG signals using parametric mixture Gaussian and dynamic programming.
Biomed. Signal Process. Control., 2019

Comparison of automatic syllable stress detection quality with time-aligned boundaries and context dependencies.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

voisTUTOR: Virtual Operator for Interactive Spoken English TUTORing.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Noise robust goodness of pronunciation measures using teacher's utterance.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Automatic assessment of pronunciation and its dependent factors by exploring their interdependencies using DNN and LSTM.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

voisTUTOR corpus: A speech corpus of Indian L2 English learners for pronunciation assessment.
Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

Indic TIMIT and Indic English lexicon: A speech database of Indian speakers using TIMIT stimuli and a lexicon from their mispronunciations.
Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

An acoustic-articulatory database of VCV sequences and words in Toda at different speaking rates.
Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

A SegNet Based Image Enhancement Technique for Air-Tissue Boundary Segmentation in Real-Time Magnetic Resonance Imaging Video.
Proceedings of the National Conference on Communications, 2019

SPIRE-fluent: A Self-Learning App for Tutoring Oral Fluency to Second Language English Learners.
Proceedings of the Interspeech 2019, 2019

An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities.
Proceedings of the Interspeech 2019, 2019

Low Resource Automatic Intonation Classification Using Gated Recurrent Unit (GRU) Networks Pre-Trained with Synthesized Pitch Patterns.
Proceedings of the Interspeech 2019, 2019

ASR Inspired Syllable Stress Detection for Pronunciation Evaluation Without Using a Supervised Classifier and Syllable Level Features.
Proceedings of the Interspeech 2019, 2019

Whisper to Neutral Mapping Using Cosine Similarity Maximization in i-Vector Space for Speaker Verification.
Proceedings of the Interspeech 2019, 2019

Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis.
Proceedings of the Interspeech 2019, 2019

Acoustic and Articulatory Feature Based Speech Rate Estimation Using a Convolutional Dense Neural Network.
Proceedings of the Interspeech 2019, 2019

An Investigation on Speaker Specific Articulatory Synthesis with Speaker Independent Articulatory Inversion.
Proceedings of the Interspeech 2019, 2019

An Improved Air Tissue Boundary Segmentation Technique for Real Time Magnetic Resonance Imaging Video Using Segnet.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Study on Robustness of Articulatory Features for Automatic Speech Recognition of Neutral and Whispered Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

Formant-gaps Features for Speaker Verification Using Whispered Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

Air-tissue Boundary Segmentation in Real Time Magnetic Resonance Imaging Video Using a Convolutional Encoder-decoder Network.
Proceedings of the IEEE International Conference on Acoustics, 2019

Representation Learning Using Convolution Neural Network for Acoustic-to-articulatory Inversion.
Proceedings of the IEEE International Conference on Acoustics, 2019

Trend Statistics Network and Channel invariant EEG Network for sleep arousal study.
Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

2018
PSFM - A Probabilistic Source Filter Model for Noise Robust Glottal Closure Instant Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Optimal sensor placement in electromagnetic articulography recording for speech production study.
Comput. Speech Lang., 2018

Classification of story-telling and poem recitation using head gesture of the talker.
Proceedings of the 2018 International Conference on Signal Processing and Communications (SPCOM), 2018

Broad Phoneme Class Specific Deep Neural Network Based Speech Enhancement.
Proceedings of the 2018 International Conference on Signal Processing and Communications (SPCOM), 2018

SPIRE-SST: An Automatic Web-based Self-learning Tool for Syllable Stress Tutoring (SST) to the Second Language Learners.
Proceedings of the Interspeech 2018, 2018

Air-Tissue Boundary Segmentation in Real-Time Magnetic Resonance Imaging Video Using Semantic Segmentation with Fully Convolutional Networks.
Proceedings of the Interspeech 2018, 2018

Relating Articulatory Motions in Different Speaking Rates.
Proceedings of the Interspeech 2018, 2018

Automatic Visual Augmentation for Concatenation Based Synthesized Articulatory Videos from Real-time MRI Data for Spoken Language Training.
Proceedings of the Interspeech 2018, 2018

Reconstructing Neutral Speech from Tracheoesophageal Speech.
Proceedings of the Interspeech 2018, 2018

Whispered Speech to Neutral Speech Conversion Using Bidirectional LSTMs.
Proceedings of the Interspeech 2018, 2018

Automatic Glottis Localization and Segmentation in Stroboscopic Videos Using Deep Neural Network.
Proceedings of the Interspeech 2018, 2018

Subband Weighting for Binaural Speech Source Localization.
Proceedings of the Interspeech 2018, 2018

Speech Enhancement Using Deep Mixture of Experts Based on Hard Expectation Maximization.
Proceedings of the Interspeech 2018, 2018

Low Resource Acoustic-to-articulatory Inversion Using Bi-directional Long Short Term Memory.
Proceedings of the Interspeech 2018, 2018

Intonation tutor by SPIRE (In-SPIRE): An Online Tool for an Automatic Feedback to the Second Language Learners in Learning Intonation.
Proceedings of the Interspeech 2018, 2018

A Supervised Air-Tissue Boundary Segmentation Technique in Real-Time Magnetic Resonance Imaging Video Using a Novel Measure of Contrast and Dynamic Programming.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Binaural Speech Source Localization Using Template Matching of Interaural Time Difference Patterns.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speech Enhancement Using Multiple Deep Neural Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Comparison of Speech Tasks for Automatic Classification of Patients with Amyotrophic Lateral Sclerosis and Healthy Subjects.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Concatenative Articulatory Video Synthesis Using Real-Time MRI Data for Spoken Language Training.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Comparison of Cough, Wheeze and Sustained Phonations for Automatic Classification Between Healthy Subjects and Asthmatic Patients.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

A Maximum Likelihood Formulation To Exploit Heart Rate Variability for Robust Heart Rate Estimation From Facial Video.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

A Heart Rate Driven Kalman Filter for Continuous Arousal Trend Monitoring.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

SleepTight: Identifying Sleep Arousals Using Inter and Intra-Relation of Multimodal Signals.
Proceedings of the Computing in Cardiology, 2018

2017
Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

A high resolution ENF based multi-stage classifier for location forensics of media recordings.
Proceedings of the Twenty-third National Conference on Communications, 2017

Pitch prediction from Mel-frequency cepstral coefficients using sparse spectrum recovery.
Proceedings of the Twenty-third National Conference on Communications, 2017

Classification of healthy subjects and patients with essential vocal tremor using empirical mode decomposition of high resolution pitch contour.
Proceedings of the Twenty-third National Conference on Communications, 2017

A comparative study on the effect of different codecs on speech recognition accuracy using various acoustic modeling techniques.
Proceedings of the Twenty-third National Conference on Communications, 2017

Phoneme State Posteriorgram Features for Speech Based Automatic Classification of Speakers in Cold and Healthy Condition.
Proceedings of the Interspeech 2017, 2017

A Dual Source-Filter Model of Snore Audio for Snorer Group Classification.
Proceedings of the Interspeech 2017, 2017

PRAV: A Phonetically Rich Audio Visual Corpus.
Proceedings of the Interspeech 2017, 2017

A Robust Voiced/Unvoiced Phoneme Classification from Whispered Speech Using the 'Color' of Whispered Phonemes and Deep Neural Network.
Proceedings of the Interspeech 2017, 2017

Subband Selection for Binaural Speech Source Localization.
Proceedings of the Interspeech 2017, 2017

An Information Theoretic Analysis of the Temporal Synchrony Between Head Gestures and Prosodic Patterns in Spontaneous Speech.
Proceedings of the Interspeech 2017, 2017

Automatic detection of syllable stress using sonority based prominence features for pronunciation evaluation.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A comparative study of acoustic-to-articulatory inversion for neutral and whispered speech.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Low resource point process models for keyword spotting using unsupervised online learning.
Proceedings of the 25th European Signal Processing Conference, 2017

Automatic prediction of spirometry readings from cough and wheeze for monitoring of asthma severity.
Proceedings of the 25th European Signal Processing Conference, 2017

Pitch prediction from Mel-generalized cepstrum - a computationally efficient pitch modeling approach for speech synthesis.
Proceedings of the 25th European Signal Processing Conference, 2017

2016
Cumulative Impulse Strength for Epoch Extraction.
IEEE Signal Process. Lett., 2016

A mode-shape classification technique for robust speech rate estimation and syllable nuclei detection.
Speech Commun., 2016

Information theoretic optimal vocal tract region selection from real time magnetic resonance images for broad phonetic class recognition.
Comput. Speech Lang., 2016

Speaker verification based on the fusion of speech acoustics and inverted articulatory signals.
Comput. Speech Lang., 2016

A Class-Specific Speech Enhancement for Phoneme Recognition: A Dictionary Learning Approach.
Proceedings of the Interspeech 2016, 2016

Automatic Recognition of Social Roles Using Long Term Role Transitions in Small Group Interactions.
Proceedings of the Interspeech 2016, 2016

A robust speech rate estimation based on the activation profile from the selected acoustic unit dictionary.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Better acoustic normalization in subject independent acoustic-to-articulatory inversion: Benefit to recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Multiple Spectral Peak Tracking for Heart Rate Monitoring from Photoplethysmography Signal During Intensive Physical Exercise.
IEEE Signal Process. Lett., 2015

Robust Whisper Activity Detection Using Long-Term Log Energy Variation of Sub-Band Signal.
IEEE Signal Process. Lett., 2015

Improved subject-independent acoustic-to-articulatory inversion.
Speech Commun., 2015

Automatic gender classification using the mel frequency cepstrum of neutral and whispered speech: A comparative study.
Proceedings of the Twenty First National Conference on Communications, 2015

An error correction scheme for GCI detection algorithms using pitch smoothness criterion.
Proceedings of the INTERSPEECH 2015, 2015

Automatic classification of eating conditions from speech using acoustic feature selection and a set of hierarchical support vector machine classifiers.
Proceedings of the INTERSPEECH 2015, 2015

Estimation of the air-tissue boundaries of the vocal tract in the mid-sagittal plane from electromagnetic articulograph data.
Proceedings of the INTERSPEECH 2015, 2015

A discriminative analysis within and across voiced and unvoiced consonants in neutral and whispered speech in multiple Indian languages.
Proceedings of the INTERSPEECH 2015, 2015

Estimation of the invariant and variant characteristics in speech articulation and its application to speaker identification.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Bayesian learning for time-varying linear prediction of speech.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Missing samples estimation in electromagnetic articulography data using equality constrained Kalman smoother.
Proceedings of the INTERSPEECH 2014, 2014

Sparse smoothing of articulatory features from Gaussian mixture model based acoustic-to-articulatory inversion: benefit to speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

Selection of optimal vocal tract regions using real-time magnetic resonance imaging for robust voice activity detection.
Proceedings of the INTERSPEECH 2014, 2014

Comparison of speech quality with and without sensors in electromagnetic articulograph AG 501 recording.
Proceedings of the INTERSPEECH 2014, 2014

Classification of clean and noisy bilingual movie audio for speech-to-speech translation corpora design.
Proceedings of the IEEE International Conference on Acoustics, 2014

Maximum a-posteriori estimation of missing samples with continuity constraint in Electromagnetic Articulography data.
Proceedings of the IEEE International Conference on Acoustics, 2014

A sparse smoothing approach for Gaussian Mixture Model based Acoustic-to-Articulatory Inversion.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multi-pitch tracking using Gaussian mixture model with time varying parameters and Grating Compression Transform.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
High-quality bilingual subtitle document alignments with application to spontaneous speech translation.
Comput. Speech Lang., 2013

Multi-band long-term signal variability features for robust voice activity detection.
Proceedings of the INTERSPEECH 2013, 2013

Speaker verification based on fusion of acoustic and articulatory information.
Proceedings of the INTERSPEECH 2013, 2013

Information theoretic acoustic feature selection for acoustic-to-articulatory inversion.
Proceedings of the INTERSPEECH 2013, 2013

Spatial and temporal alignment of multimodal human speech production data: Real time imaging, flesh point tracking and audio.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Exploiting speech production information for automatic speech and speaker modeling and recognition - possibilities and new opportunities.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

A study of emotional information present in articulatory movements estimated using acoustic-to-articulatory inversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Robust Voice Activity Detection Using Long-Term Signal Variability.
IEEE Trans. Speech Audio Process., 2011

Joint source-filter optimization for robust glottal source estimation in the presence of shimmer and jitter.
Speech Commun., 2011

A Multimodal Real-Time MRI Articulatory Corpus for Speech Research.
Proceedings of the INTERSPEECH 2011, 2011

Analysis of Inter-Articulator Correlation in Acoustic-to-Articulatory Inversion Using Generalized Smoothness Criterion.
Proceedings of the INTERSPEECH 2011, 2011

Overlapped speech detection using long-term spectro-temporal similarity in stereo recording.
Proceedings of the IEEE International Conference on Acoustics, 2011

Bilingual audio-subtitle extraction using automatic segmentation of movie audio.
Proceedings of the IEEE International Conference on Acoustics, 2011

A subject-independent acoustic-to-articulatory inversion.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Bark Frequency Transform Using an Arbitrary Order Allpass Filter.
IEEE Signal Process. Lett., 2010

Robust voice activity detection in stereo recording with crosstalk.
Proceedings of the INTERSPEECH 2010, 2010

2009
Pitch Contour Stylization Using an Optimal Piecewise Polynomial Approximation.
IEEE Signal Process. Lett., 2009

Context-driven automatic bilingual movie subtitle alignment.
Proceedings of the INTERSPEECH 2009, 2009

Estimation of articulatory gesture patterns from speech acoustics.
Proceedings of the INTERSPEECH 2009, 2009

Robust word boundary detection in spontaneous speech using acoustic and lexical cues.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Automatic classification of question turns in spontaneous speech using lexical and prosodic evidence.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Pitch period estimation using multipulse model and wavelet transform.
Proceedings of the INTERSPEECH 2007, 2007

Speech Segmentation using Extrema-Based Signal Track Length Measure.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Time-varying filter interpretation of Fourier transform and its variants.
Signal Process., 2006

Dynamic Programming Based Optimum Non-Uniform Samples For Speech Reconstruction and Coding.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

