Malcolm Slaney

Marco Tagliasacchi

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Disentangling speech from surroundings in a neural audio codec.

Ahmed Omran

Neil Zeghidour

Zalán Borsos

Félix de Chaumont Quitry

Marco Tagliasacchi

CoRR, 2022

Neural Architecture Search for Energy Efficient Always-on Audio Models.

CoRR, 2022

Multi-Channel Speech Denoising for Machine Ears.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

VHP: Vibrotactile Haptics Platform for On-body Applications.

Proceedings of the UIST '21: The 34th Annual ACM Symposium on User Interface Software and Technology, 2021

2020

Deep Canonical Correlation Analysis For Decoding The Auditory Brain.

Jaswanth Reddy Katthi

Sriram Ganapathy

Sandeep Kothinti

Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

2018

Decoding the auditory brain with canonical component analysis.

Alain de Cheveigné

Daniel D. E. Wong

Giovanni M. Di Liberto

Jens Hjortkjær

Nathaniel-Georg S. Gutierrez

Edmund C. Lalor

NeuroImage, 2018

Using audio-visual information to understand speaker activity: Tracking active speakers on and off screen.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Towards mobile gaze-directed beamforming: a novel neuro-technology for hearing loss.

Markham H. Anderson

Britt W. Yazel

Matthew P. F. Stickle

Fernando D. Espinosa

Sanjay S. Joshi

Lee M. Miller

Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

2017

Putting a Face to the Voice: Fusing Audio and Visual Signals Across a Video to Determine Speakers.

CoRR, 2017

CNN architectures for large-scale audio classification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2015

A Study of Multimodal Addressee Detection in Human-Human-Computer Interaction.

T. J. Tsai

IEEE Trans. Multim., 2015

Multimodal addressee detection in multiparty dialogue systems.

T. J. Tsai

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Probabilistic features for connecting eye gaze to spoken language understanding.

Anna Prokofieva

Dilek Hakkani-Tür

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Artificial neural network features for speaker diarization.

Sree Harsha Yella

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Eye gaze for understanding conversational speech.

Anna Prokofieva

Dilek Hakkani-Tür

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

The influence of pitch and noise on the discriminability of filterbank features.

Michael L. Seltzer

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Towards better performance with heterogeneous training data in acoustic modeling using deep neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

The Relation of Eye Gaze and Face Pose: Potential Impact on Speech Recognition.

Dilek Hakkani-Tür

Proceedings of the 16th International Conference on Multimodal Interaction, 2014

Eye Gaze for Spoken Language Understanding in Multi-modal Conversational Interactions.

Proceedings of the 16th International Conference on Multimodal Interaction, 2014

Gaze-enhanced speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Introduction to the special section on the 20th anniversary of the ACM international conference on multimedia.

Klara Nahrstedt

ACM Trans. Multim. Comput. Commun. Appl., 2013

Micro Stories and Mega Stories.

Ramesh C. Jain

IEEE Multim., 2013

Data driven suppression rule for speech enhancement.

Ivan Tashev

Proceedings of the 2013 Information Theory and Applications Workshop, 2013

QBT-Extended: An Annotated Dataset of Melodically Contoured Tapped Queries.

Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

Pitch-gesture modeling using subband autocorrelation change detection.

Elizabeth Shriberg

Jui-Ting Huang

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Characteristic contours of syllabic-level units in laughter.

Jieun Oh

Eunjoon Cho

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Optimal Parameters for Locality-Sensitive Hashing.

Yury Lifshits

Junfeng He

Proc. IEEE, 2012

Web-Scale Multimedia Processing and Applications [Scanning the Issue].

Edward Y. Chang

Shih-Fu Chang

Alexander G. Hauptmann

Thomas S. Huang

Proc. IEEE, 2012

Don't Click Here.

David A. Shamma

IEEE Multim., 2012

Tell Me a Story.

Aisling Kelliher

IEEE Multim., 2012

Coulda, woulda, shoulda: 20 years of multimedia opportunities.

Klara Nahrstedt

Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Learning Sparse Feature Representations for Music Annotation and Retrieval.

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

A model of attention-driven scene analysis.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Audio and Acoustic Signal Processing [In the Spotlight].

Patrick A. Naylor

IEEE Signal Process. Mag., 2011

Academia Meets Industry at the Multimedia Grand Challenge.

Cees G. M. Snoek

IEEE Multim., 2011

Precision-Recall Is Wrong for Multimedia.

IEEE Multim., 2011

Web-Scale Multimedia Analysis: Does Content Matter?

IEEE Multim., 2011

Identifying authoritative sources of multimedia content: mining specificity and expertise from large-scale multimedia databases.

Lyndon Kennedy

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

A Classification-Based Polyphonic Piano Transcription Approach Using Learned Feature Representations.

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Recommender Systems, Missing Data and Statistical Model Estimation.

Proceedings of the IJCAI 2011, 2011

Using gaze patterns to study and predict reading struggles due to distraction.

Vidhya Navalpakkam

Justin Rao

Proceedings of the International Conference on Human Factors in Computing Systems, 2011

2010

Solving Demodulation as an Optimization Problem.

Gregory Sell

IEEE Trans. Speech Audio Process., 2010

Scalable Audio-Content Analysis.

EURASIP J. Audio Speech Music. Process., 2010

Processing web-scale multimedia data.

Edward Y. Chang

Proceedings of the 18th International Conference on Multimedia 2010, 2010

Image classification using the web graph.

Dhruv Kumar Mahajan

Proceedings of the 18th International Conference on Multimedia 2010, 2010

Multimodal retrieval and ranking: more than waveforms.

Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

The information content of demodulated speech.

Gregory Sell

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Unsupervised image ranking.

Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining, 2009

Periodicity Detection and Localization using Spike Timing from the AER EAR.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Reconciliation of human and machine speech recognition performance.

Misha Pavel

Hynek Hermansky

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Acoustic Chord Transcription and Key Extraction From Audio Using Key-Dependent HMMs Trained on Synthesized Audio.

Kyogu Lee

IEEE Trans. Speech Audio Process., 2008

Analysis of Minimum Distances in High-Dimensional Musical Spaces.

Christophe Rhodes

IEEE Trans. Speech Audio Process., 2008

Locality-Sensitive Hashing for Finding Nearest Neighbors [Lecture Notes].

IEEE Signal Process. Mag., 2008

Content-Based Music Information Retrieval: Current Directions and Future Challenges.

Proc. IEEE, 2008

Resolving tag ambiguity.

Kilian Q. Weinberger

Roelof van Zwol

Proceedings of the 16th International Conference on Multimedia 2008, 2008

Learning a Metric for Music Similarity.

Kilian Q. Weinberger

William White

Proceedings of the ISMIR 2008, 2008

Comparing Local Feature Descriptors in pLSA-Based Image Models.

Proceedings of the Pattern Recognition, 2008

Continuous visual vocabulary models for pLSA-based scene recognition.

Eva Hörster

Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

2007

Collaborative Filtering and the Missing at Random Assumption.

Proceedings of the UAI 2007, 2007

Similarity Based on Rating Data.

William White

Proceedings of the 8th International Conference on Music Information Retrieval, 2007

A Unified System for Chord Transcription and Key Extraction Using Hidden Markov Models.

Kyogu Lee

Proceedings of the 8th International Conference on Music Information Retrieval, 2007

PLSA on Large Scale Image Databases.

Proceedings of the IEEE International Conference on Acoustics, 2007

Fast Recognition of Remixed Music Audio.

Proceedings of the IEEE International Conference on Acoustics, 2007

Varying Time Constants and Gain Adaptation in Feature Extraction for Speech Processing.

David V. Anderson

Sourabh Ravindran

Proceedings of the IEEE International Conference on Acoustics, 2007

Image retrieval on large-scale image databases.

Eva Hörster

Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

2006

Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations.

Nima Mesgarani

Shihab A. Shamma

IEEE Trans. Speech Audio Process., 2006

Automatic Chord Recognition from Audio Using a HMM with Supervised Learning.

Kyogu Lee

Proceedings of the ISMIR 2006, 2006

Song Intersection by Approximate Nearest Neighbor Search.

Proceedings of the ISMIR 2006, 2006

A statistical model of timbre perception.

Hiroko Terasawa

Jonathan Berger

Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

Improving the noise-robustness of mel-frequency cepstral coefficients for speech processing.

Sourabh Ravindran

David V. Anderson

Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

The Importance of Sequences in Musical Similarity.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Being Literate with Large Document Collections: Observational Studies and Cost Structure Tradeoffs.

Proceedings of the 39th Hawaii International Conference on Systems Science (HICSS-39 2006), 2006

2005

A timbre space for speech.

Hiroko Terasawa

Jonathan Berger

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Analytic Worksheets: A Framework to Support Human Analysis of Large Streaming Data Volumes.

Proceedings of the Human-Computer Interaction, 2005

Measuring Information Understanding in Large Document Collections.

Daniel M. Russell

Proceedings of the 38th Hawaii International Conference on System Sciences (HICSS-38 2005), 2005

The History and Future of CASA.

Proceedings of the Speech Separation by Humans and Machines, 2005

2004

Low-power audio classification for ubiquitous sensor networks.

Sourabh Ravindran

David V. Anderson

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Speech discrimination based on multiscale spectro-temporal modulations.

Nima Mesgarani

Shihab A. Shamma

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

BabyEars: A recognition system for affective vocalizations.

Gerald McRoberts

Speech Commun., 2003

Modeling Multitasking Users.

Jayashree Subrahmonia

Paul P. Maglio

Proceedings of the User Modeling 2003, 2003

2002

Mixtures of probability experts for audio retrieval and indexing.

Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

Semantic-audio retrieval.

Proceedings of the IEEE International Conference on Acoustics, 2002

2001

Multimedia edges: finding hierarchy in all dimensions.

Dulce B. Ponceleon

James H. Kaufman

Proceedings of the 9th ACM International Conference on Multimedia 2001, Ottawa, Ontario, Canada, September 30, 2001

Hierarchical segmentation using latent semantic indexing in scale space.

Dulce B. Ponceleon

Proceedings of the IEEE International Conference on Acoustics, 2001

FastMPEG: time-scale modification of bit-compressed audio information.

Art Rothstein

Proceedings of the IEEE International Conference on Acoustics, 2001

Temporal Events in All Dimensions and Scales.

Dulce B. Ponceleon

James H. Kaufman

Proceedings of the IEEE Workshop on Detection and Recognition of Events in Video, 2001

Principles of computerized tomographic imaging.

Avinash C. Kak

Classics in applied mathematics 33, SIAM, ISBN: 978-0-89871-494-4, 2001

2000

FaceSync: A Linear Operator for Measuring Synchronization of Video Facial Images and Audio Tracks.

Proceedings of the Advances in Neural Information Processing Systems 13, 2000

1998

Baby Ears: a recognition system for affective vocalizations.

Gerald McRoberts

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

MACH1: nonuniform time-scale modification of speech.

Margaret Withgott

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997

Video Rewrite: driving visual speech with audio.

Christoph Bregler

Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, 1997

Construction and evaluation of a robust multifeature speech/music discriminator.

Eric D. Scheirer

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Video rewrite: visual speech synthesis from video.

Christoph Bregler

Proceedings of the ESCA Workshop on Audio-Visual Speech Processing, 1997

1996

Automatic audio morphing.

Bud Lassiter

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1994

Pattern Playback in the 90s.

Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Auditory model inversion for sound separation.

Daniel Naar

Richard F. Lyon

Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1990

A perceptual pitch detector.

Richard F. Lyon

Proceedings of the 1990 International Conference on Acoustics, 1990

Speaker-independent vowel recognition: spectrograms versus cochleagrams.

Yeshwant K. Muthusamy

Ronald A. Cole