Piotr Zelasko

IEEE ACM Trans. Audio Speech Lang. Process., 2024

2023

Delay-Penalized Transducer for Low-Latency Streaming ASR.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Fast and Parallel Decoding for Transducer.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Why Aren't We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Unsupervised Speech Segmentation and Variable Rate Representation Learning Using Segmental Contrastive Predictive Coding.

Saurabhchand Bhati

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Discovering phonetic inventories with crosslingual automatic speech recognition.

Siyuan Feng

Ali Abavisani

Saurabhchand Bhati

Mark Hasegawa-Johnson

Comput. Speech Lang., 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser.

CoRR, 2022

Vsameter: Evaluation of a New Open-Source Tool to Measure Vowel Space Area and Related Metrics.

Tianyu Cao

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition System using Adversarial Fine-tuning with Denoiser.

Proceedings of the Interspeech 2022, 2022

Non-contrastive self-supervised learning of utterance-level speech representations.

Jaejin Cho

Proceedings of the Interspeech 2022, 2022

2021

Study of Pre-Processing Defenses Against Adversarial Attacks on State-of-the-Art Speaker Recognition Systems.

Sonal Joshi

IEEE Trans. Inf. Forensics Secur., 2021

What Helps Transformers Recognize Conversational Structure? Importance of Context, Punctuation, and Labels in Dialog Act Recognition.

Trans. Assoc. Comput. Linguistics, 2021

Non-Autoregressive Transformer for Speech Recognition.

IEEE Signal Process. Lett., 2021

Lhotse: a speech data representation library for the modern deep learning ecosystem.

CoRR, 2021

Adversarial Attacks and Defenses for Speech Recognition Systems.

CoRR, 2021

Adversarial Attacks and Defenses for Speaker Identification Systems.

Sonal Joshi

CoRR, 2021

Representation Learning to Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Spine2Net: SpineNet with Res2Net and Time-Squeeze-and-Excitation Blocks for Speaker Recognition.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Earnings-21: A Practical Benchmark for ASR in the Wild.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Automatic Detection and Assessment of Alzheimer Disease Using Speech and Language Technologies in Low-Resource Scenarios.

Jaejin Cho

Sonal Joshi

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Deep Feature CycleGANs: Speaker Identity Preserving Non-Parallel Microphone-Telephone Domain Adaptation for Speaker Verification.

Saurabh Kataria

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Unsupervised Acoustic Unit Discovery by Leveraging a Language-Independent Subword Discriminative Feature Representation.

Siyuan Feng

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Align-Denoise: Single-Pass Non-Autoregressive Speech Recognition.

Nanxin Chen

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation.

Saurabhchand Bhati

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

CopyPaste: An Augmentation Method for Speech Emotion Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

How Phonotactics Affect Multilingual and Zero-Shot ASR Performance.

Siyuan Feng

Ali Abavisani

Mark Hasegawa-Johnson

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Improving Reconstruction Loss Based Speaker Embedding in Unsupervised and Semi-Supervised Scenarios.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Focus on the Present: A Regularization Method for the ASR Source-Target Attention Layer.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Beyond Isolated Utterances: Conversational Emotion Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Joint Prediction of Truecasing and Punctuation for Conversational Speech in Low-Resource Scenarios.

Agnieszka Mikolajczyk

Piotr Pezik

Aswin Shanmugam Subramanian

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge.

Ashish Arora

Desh Raj

CoRR, 2020

That Sounds Familiar: An Analysis of Phonetic Representations Transfer Across Languages.

Mark Hasegawa-Johnson

Proceedings of the Interspeech 2020, 2020

Learning Speaker Embedding from Text-to-Speech.

Proceedings of the Interspeech 2020, 2020

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery.

Proceedings of the Interspeech 2020, 2020

Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?

Proceedings of the Interspeech 2020, 2020

WER we are and WER we think we are.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019

Towards Better Understanding of Spontaneous Conversations: Overcoming Automatic Speech Recognition Errors With Intent Recognition.

CoRR, 2019

Avaya Conversational Intelligence: A Real-Time System for Spoken Language Understanding in Human-Human Call Center Conversations.

Proceedings of the Interspeech 2019, 2019

Hierarchical Transformers for Long Document Classification.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

An Application for Building a Polish Telephone Speech Corpus.

Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Expanding Abbreviations in a Strongly Inflected Language: Are Morphosyntactic Tags Sufficient?

Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Punctuation Prediction Model for Conversational Speech.

Proceedings of the Interspeech 2018, 2018

2017

LSTM Network for Inflected Abbreviation Expansion.

CoRR, 2017

Audio Replay Attack Detection Using High-Frequency Features.

Proceedings of the Interspeech 2017, 2017

2016

AGH corpus of Polish speech.

Lang. Resour. Evaluation, 2016

Structure of pauses in speech in the context of speaker verification and classification of speech type.

EURASIP J. Audio Speech Music. Process., 2016

2015

SARMATA 2.0 automatic Polish language speech recognition system.

Proceedings of the INTERSPEECH 2015, 2015

Linguistically motivated tied-state triphones for Polish speech recognition.

Proceedings of the 2nd IEEE International Conference on Cybernetics, 2015

2014

HMM-based Breath and Filled Pauses Elimination in ASR.
