Kevin W. Wilson

Affiliations:

Google

According to our database, Kevin W. Wilson authored at least 41 papers between 2001 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Recomposer: Event-roll-guided generative audio editing.

CoRR, September, 2025

2024

Unsupervised Multi-Channel Separation And Adaptation.

Proceedings of the IEEE International Conference on Acoustics, 2024

2022

Distance-Based Sound Separation.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

End-To-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Unsupervised Sound Separation Using Mixtures of Mixtures.

CoRR, 2020

Unsupervised Sound Separation Using Mixture Invariant Training.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Alternating Between Spectral and Spatial Estimation for Speech Separation and Enhancement.

CoRR, 2019

Universal Sound Separation.

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Differentiable Consistency Constraints for Improved Deep Speech Enhancement.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Exploring Tradeoffs in Models for Low-Latency Speech Enhancement.

Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Acoustic Modeling for Google Home.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

CNN architectures for large-scale audio classification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Raw Multichannel Processing Using Deep Neural Networks.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning., 2017

2016

AutoMOS: Learning a non-intrusive assessor of naturalness-of-speech.

Brian Patton

Yannis Agiomyrgiannakis

CoRR, 2016

Reducing the Computational Complexity of Multimicrophone Acoustic Models with Integrated Feature Extraction.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Factored spatial and spectral multichannel raw waveform CLDNNs.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Learning the speech front-end with raw waveform CLDNNs.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Speech acoustic modeling from raw multichannel waveforms.

Yedid Hoshen

Ron J. Weiss

Kevin W. Wilson

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Speaker location and microphone spacing invariant acoustic modeling from raw multichannel waveforms.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2010

Ungrounded independent non-negative factor analysis.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Spectrogram dimensionality reduction with independence constraints.

Kevin W. Wilson

Bhiksha Raj

Proceedings of the IEEE International Conference on Acoustics, 2010

2008

Regularized non-negative matrix factorization with temporal dependencies for speech denoising.

Kevin W. Wilson

Bhiksha Raj

Paris Smaragdis

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speech denoising using nonnegative matrix factorization with priors.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

An SVM Framework for Genre-Independent Scene Change Detection.

Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

2006

Estimating uncertainty models for speech source localization in real-world environments.

Kevin W. Wilson

PhD thesis, 2006

Learning a Precedence Effect-Like Weighting Function for the Generalized Cross-Correlation Framework.

Kevin W. Wilson

Trevor Darrell

IEEE Trans. Speech Audio Process., 2006

2005

Visual Speech Recognition with Loosely Synchronized Feature Streams.

Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

Improving audio source localization by learning the precedence effect.

Kevin W. Wilson

Trevor Darrell

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Real-time audio-visual tracking for meeting analysis.

Proceedings of the 6th International Conference on Multimodal Interfaces, 2004

Multiple person and speaker activity tracking with a particle filter.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

A multi-modal approach for determining speaker location and focus.

Michael Siracusa

Louis-Philippe Morency

Kevin W. Wilson

John W. Fisher III

Trevor Darrell

Proceedings of the 5th International Conference on Multimodal Interfaces, 2003

A Probabilistic Framework for Multi-modal Multi-Person Tracking.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2003

2002

Audiovisual Arrays for Untethered Spoken Interfaces.

Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002

Audio-video array source localization for intelligent environments.

Kevin W. Wilson

Trevor Darrell

Proceedings of the IEEE International Conference on Acoustics, 2002

2001

Audio-video array source separation for perceptual user interfaces.

Proceedings of the 2001 workshop on Perceptive user interfaces, 2001
