Rita Singh

Orcid: 0000-0003-3743-0162

According to our database, Rita Singh authored at least 133 papers between 1998 and 2024.

Bibliography

2024
R²-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations.
CoRR, 2024

Domain Adaptation for Contrastive Audio-Language Models.
CoRR, 2024

A General Framework for Learning from Weak Supervision.
CoRR, 2024

PAM: Prompting Audio-Language Models for Audio Quality Assessment.
CoRR, 2024

2023
Deriving Vocal Fold Oscillation Information from Recorded Voice Signals Using Models of Phonation.
Entropy, July, 2023

A Gene-Based Algorithm for Identifying Factors That May Affect a Speaker's Voice.
Entropy, June, 2023

SphereFace Revived: Unifying Hyperspherical Face Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

LoFT: Local Proxy Fine-tuning For Improving Transferability Of Adversarial Attacks Against Large Language Model.
CoRR, 2023

Prompting Audios Using Acoustic Properties For Emotion Representation.
CoRR, 2023

Completing Visual Objects via Bridging Generation and Segmentation.
CoRR, 2023

Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech.
CoRR, 2023

Rethinking Audiovisual Segmentation with Semantic Quantization and Decomposition.
CoRR, 2023

Importance of negative sampling in weak label learning.
CoRR, 2023

Training Audio Captioning Models without Audio.
CoRR, 2023

The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features.
CoRR, 2023

BASS: Block-wise Adaptation for Speech Summarization.
CoRR, 2023

Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations.
CoRR, 2023

GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content.
CoRR, 2023

PaintSeg: Painting Pixels for Training-free Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Pengi: An Audio Language Model for Audio Tasks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Rethinking Voice-Face Correlation: A Geometry View.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Pairwise Similarity Learning is SimPLE.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Noise-Tolerant Speech-Referring Video Object Segmentation: Bridging Speech and Text.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Token Prediction as Implicit Classification to Identify LLM-Generated Text.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Espnet-Summ: Introducing a Novel Large Dataset, Toolkit, and a Cross-Corpora Evaluation of Speech Summarization Systems.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Describing emotions with acoustic property prompts for speech emotion recognition.
CoRR, 2022

Unifying the Discrete and Continuous Emotion labels for Speech Emotion Recognition.
CoRR, 2022

Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction.
CoRR, 2022

On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice.
CoRR, 2022

Positional Encoding for Capturing Modality Specific Cadence for Emotion Detection.
Proceedings of the Interspeech 2022, 2022

SphereFace2: Binary Classification is All You Need for Deep Face Recognition.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
An Overview of Techniques for Biomarker Discovery in Voice Signal.
CoRR, 2021

Detection and Evaluation of Human and Machine Generated Speech in Spoofing Attacks on Automatic Speaker Verification Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Masked Proxy Loss for Text-Independent Speaker Verification.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Generalized Spoofing Detection Inspired from Audio Generation Artifacts.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Improving Weakly Supervised Sound Event Detection with Self-Supervised Auxiliary Tasks.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Self-Supervised 3D Face Reconstruction via Conditional Estimation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Detection of Covid-19 Through the Analysis of Vocal Fold Oscillations.
Proceedings of the IEEE International Conference on Acoustics, 2021

Interpreting Glottal Flow Dynamics for Detecting Covid-19 From Voice.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Mask Proxy Loss for Text-Independent Speaker Recognition.
CoRR, 2020

Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection.
CoRR, 2020

Controlled AutoEncoders to Generate Faces from Voices.
Proceedings of the Advances in Visual Computing - 15th International Symposium, 2020

Hide and Speak: Towards Deep Neural Networks for Speech Steganography.
Proceedings of the Interspeech 2020, 2020

The Phonetic Bases of Vocal Expressed Emotion: Natural versus Acted.
Proceedings of the Interspeech 2020, 2020

Hierarchical Routing Mixture of Experts.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Artificial Creative Intelligence: Breaking the Imitation Barrier.
Proceedings of the Eleventh International Conference on Computational Creativity, 2020

Speech-Based Parameter Estimation of an Asymmetric Vocal Fold Oscillation Model and its Application in Discriminating Vocal Fold Pathologies.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Detecting gender differences in perception of emotion in crowdsourced data.
CoRR, 2019

Non-Determinism in Neural Networks for Adversarial Robustness.
CoRR, 2019

Reconstructing faces from voices.
CoRR, 2019

Hide and Speak: Deep Neural Networks for Speech Steganography.
CoRR, 2019

Face Reconstruction from Voice using Generative Adversarial Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Regression Trees.
Proceedings of the International Joint Conference on Neural Networks, 2019

Disjoint Mapping Network for Cross-modal Matching of Voices and Faces.
Proceedings of the 7th International Conference on Learning Representations, 2019

Human Behaviour Recognition Using Wifi Channel State Information.
Proceedings of the IEEE International Conference on Acoustics, 2019

Optimizing Neural Network Embeddings Using a Pair-Wise Loss for Text-Independent Speaker Verification.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Optimal Strategies for Matching and Retrieval Problems by Comparing Covariates.
CoRR, 2018

A Corrective Learning Approach for Text-Independent Speaker Verification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Voice Impersonation Using Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Voice disguise by mimicry: deriving statistical articulometric evidence to evaluate claimed impersonation.
IET Biom., 2017

Speaker identification from the sound of the human breath.
CoRR, 2017

Deducing the severity of psychiatric symptoms from the human voice.
CoRR, 2017

Supervised monaural source separation based on autoencoders.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Minimizing Free Energy of Stochastic Functions of Markov Chains.
Proceedings of the Recent Advances in Nonlinear Speech Processing, 2016

Content-based Video Indexing and Retrieval Using Corr-LDA.
CoRR, 2016

Mereological algebras as mechanisms for reasoning about sounds.
Proceedings of the 26th IEEE International Workshop on Machine Learning for Signal Processing, 2016

Estimating multiple physical parameters from speech data.
Proceedings of the 26th IEEE International Workshop on Machine Learning for Signal Processing, 2016

Forensic anthropometry from voice: An articulatory-phonetic approach.
Proceedings of the 39th International Convention on Information and Communication Technology, 2016

Short-term analysis for estimating physical parameters of speakers.
Proceedings of the 4th International Conference on Biometrics and Forensics, 2016

Formant manipulations in voice disguise by mimicry.
Proceedings of the 4th International Conference on Biometrics and Forensics, 2016

Estimation of Children's Physical Characteristics from Their Voices.
Proceedings of the Interspeech 2016, 2016

The relationship of voice onset time and Voice Offset Time to physical age.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Complex recurrent neural networks for denoising speech signals.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015


Keyword spotting in multi-player voice driven games for children.
Proceedings of the INTERSPEECH 2015, 2015

Free energy for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Online word-spotting in continuous speech with recurrent neural networks.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Audio Classification with Thermodynamic Criteria.
Proceedings of the 2014 IEEE International Conference on Cloud Engineering, 2014

Detecting sound objects in audio recordings.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013

Discriminatively trained dependency language modeling for conversational speech recognition.
Proceedings of the INTERSPEECH 2013, 2013

Doppler based speed estimation of vehicles using passive sensor.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

Joint constrained maximum likelihood regression for overlapping speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Event detection in short duration audio using Gaussian Mixture Model and Random Forest Classifier.
Proceedings of the 21st European Signal Processing Conference, 2013

2012

A signal-separation-based array postfilter for distant speech recognition.
Proceedings of the INTERSPEECH 2012, 2012

Language identification using spectro-temporal patch features.
Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Microphone Array Post-filter based on Spatially-Correlated Noise Measurements for Distant Speech Recognition.
Proceedings of the INTERSPEECH 2012, 2012

Plagiarism Detection in Polyphonic Music using Monaural Signal Separation.
Proceedings of the INTERSPEECH 2012, 2012

Exploiting Temporal Sequence Structure for Semantic Analysis of Multimedia.
Proceedings of the INTERSPEECH 2012, 2012

Compensating for denoising artifacts.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Audio event detection from acoustic unit occurrence patterns.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Spectrographic seam patterns for discriminative word spotting.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Microphone array processing for distant speech recognition: Towards real-world deployment.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Introduction.
Proceedings of the Techniques for Noise Robustness in Automatic Speech Recognition, 2012

The Basics of Automatic Speech Recognition.
Proceedings of the Techniques for Noise Robustness in Automatic Speech Recognition, 2012

The Problem of Robustness in Automatic Speech Recognition.
Proceedings of the Techniques for Noise Robustness in Automatic Speech Recognition, 2012

2011
Phoneme-Dependent NMF for Speech Enhancement in Monaural Mixtures.
Proceedings of the INTERSPEECH 2011, 2011

A paired test for recognizer selection with untranscribed data.
Proceedings of the IEEE International Conference on Acoustics, 2011

Gammatone sub-band magnitude-domain dereverberation for ASR.
Proceedings of the IEEE International Conference on Acoustics, 2011

An iterative least-squares technique for dereverberation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Reconstructing Noise-Corrupted Spectrographic Components for Robust Speech Recognition.
Proceedings of the Robust Speech Recognition of Uncertain or Missing Data, 2011

2010
The use of sense in unsupervised training of acoustic models for ASR systems.
Proceedings of the INTERSPEECH 2010, 2010

Non-negative matrix factorization based compensation of music for automatic speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Creating a linguistic plausibility dataset with non-expert annotators.
Proceedings of the INTERSPEECH 2010, 2010

Latent-variable decomposition based dereverberation of monaural and multi-channel signals.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
A joint decoding algorithm for multiple-example-based addition of words to a pronunciation lexicon.
Proceedings of the IEEE International Conference on Acoustics, 2009

2007
Probabilistic deduction of symbol mappings for extension of lexicons.
Proceedings of the INTERSPEECH 2007, 2007

Bandwidth Expansion with a Pólya Urn Model.
Proceedings of the IEEE International Conference on Acoustics, 2007

2005
Voice driven applications in non-stationary and chaotic environment.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2005

Recognizing speech from simultaneous speakers.
Proceedings of the INTERSPEECH 2005, 2005

Feature compensation with secondary sensor measurements for robust speech recognition.
Proceedings of the 13th European Signal Processing Conference, 2005

2004
Classification in Likelihood Spaces.
Technometrics, 2004

Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions.
Proceedings of the INTERSPEECH 2004, 2004

On tracking noise with linear dynamical system models.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Classifier-based non-linear projection for adaptive endpointing of continuous speech.
Comput. Speech Lang., 2003

Classification with free energy at raised temperatures.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Design of the CMU sphinx-4 decoder.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Tracking noise via dynamical systems with a continuum of states.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Automatic generation of subword units for speech recognition systems.
IEEE Trans. Speech Audio Process., 2002

Combining search spaces of heterogeneous recognizers for improved speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Rapid development of speech-to-speech translation systems.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
Speech in Noisy Environments: robust automatic segmentation, feature extraction, and hypothesis combination.
Proceedings of the IEEE International Conference on Acoustics, 2001

Tandem acoustic modeling in large-vocabulary recognition.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Structured redefinition of sound units by merging and splitting for improved speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Task and domain specific modelling in the Carnegie Mellon communicator system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic subword unit refinement for spontaneous speech recognition via phone splitting.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic generation of phone sets and lexical transcriptions.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Domain adduced state tying for cross-domain acoustic modelling.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Automatic clustering and generation of contextual questions for tied states in hidden Markov models.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Inference of missing spectrographic features for robust speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

