Xavier Anguera Miró

Orcid: 0000-0001-8659-3991

According to our database, Xavier Anguera Miró authored at least 97 papers between 1996 and 2023.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2023

ELSA Speech Analyzer: English Communication Assessment of Spontaneous Speech.

,

,

Kristina Gulordava

,

Balázs Tarján

,

Nicholas Parslow

,

Vladimir Dobrovolskii

,

Francisco Valente

,

Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

2019

Teaching American English pronunciation using a TTS service.

,

Ganna Raboshchuk

,

,

Paula Lopez-Otero

,

Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

2017

The zero resource speech challenge 2017.

,

,

,

Julien Karadayi

,

Mathieu Bernard

,

Laurent Besacier

,

,

Emmanuel Dupoux

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

The Zero Resource Speech Challenge 2015: Proposed Approaches and Results.

Maarten Versteegh

,

,

,

Emmanuel Dupoux

Proceedings of the SLTU-2016, 2016

Zero-Cost Speech Recognition Task at Mediaeval 2016.

,

Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

English Language Speech Assistant.

,

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Fast Single- and Cross-Show Speaker Diarization Using Binary Key Speaker Modeling.

Héctor Delgado

,

,

Corinne Fredouille

,

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Automatic Extraction of the Passing Strategies of Soccer Teams.

László Gyarmati

,

CoRR, 2015

Query by Example Search on Speech at Mediaeval 2015.

,

Luis Javier Rodríguez-Fuentes

,

,

,

,

,

,

Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

The zero resource speech challenge 2015.

Maarten Versteegh

,

Roland Thiollière

,

,

,

,

,

Emmanuel Dupoux

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Effect of gender and call duration on customer satisfaction in call center big data.

,

,

,

Zoraida Hidalgo

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Novel clustering selection criterion for fast binary key speaker diarization.

Héctor Delgado

,

,

Corinne Fredouille

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multimodal read-aloud ebooks for language learning.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

An information-theoretic metric of fingerprint effectiveness.

,

Gerald Friedland

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

MASK+: Data-driven regions selection for acoustic fingerprinting.

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

QUESST2014: Evaluating Query-by-Example Speech Search in a zero-resource setting with real-life queries.

,

Luis Javier Rodríguez-Fuentes

,

,

,

,

Mikel Peñagarikano

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improved binary key speaker diarization system.

Héctor Delgado

,

,

Corinne Fredouille

,

Proceedings of the 23rd European Signal Processing Conference, 2015

2014

Language independent search in MediaEval's Spoken Web Search task.

,

,

Etienne Barnard

,

Marelie H. Davel

,

Guillaume Gravier

Comput. Speech Lang., 2014

Query-by-example spoken term detection evaluation on low-resource languages.

,

Luis Javier Rodríguez-Fuentes

,

,

,

,

Mikel Peñagarikano

Proceedings of the 4th Workshop on Spoken Language Technologies for Under-resourced Languages, 2014

Query by Example Search on Speech at Mediaeval 2014.

,

Luis Javier Rodríguez-Fuentes

,

,

,

Working Notes Proceedings of the MediaEval 2014 Workshop, 2014

Query-by-example spoken term detection on multilingual unconstrained speech.

,

Luis Javier Rodríguez-Fuentes

,

,

,

,

Mikel Peñagarikano

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Audio-to-text alignment for speech recognition with very limited resources.

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Inferring social relationships in a phone call from a single party's speech.

Sree Harsha Yella

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

Sentiment retrieval on web reviews using spontaneous natural speech.

José Costa Pereira

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

Phoneme-Lattice to Phoneme-Sequence Matching Algorithm Based on Dynamic Programming.

,

,

,

Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

Flexible Stand-Alone Keyword Recognition Application Using Dynamic Time Warping.

Miquel Ferrarons

,

,

Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

Global Speaker Clustering towards Optimal Stopping Criterion in Binary Key Speaker Diarization.

Héctor Delgado

,

,

Corinne Fredouille

,

Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

On the modeling of natural vocal emotion expressions through binary key.

,

Proceedings of the 22nd European Signal Processing Conference, 2014

Combining temporal and spectral information for Query-by-Example Spoken Term Detection.

,

,

Proceedings of the 22nd European Signal Processing Conference, 2014

2013

Query-by-Example Spoken Term Detection ALBAYZIN 2012 evaluation: overview, systems, results, and discussion.

,

Doroteo T. Toledano

,

,

,

Lluís F. Hurtado

,

,

EURASIP J. Audio Speech Music. Process., 2013

The CMTECH Spoken Web Search System for MediaEval 2013.

,

,

Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

The Telefonica Research Spoken Web Search System for MediaEval 2013.

,

Miroslav Skácel

,

,

Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

The Spoken Web Search Task.

,

,

,

,

Luis Javier Rodríguez-Fuentes

Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

Information retrieval-based dynamic time warping.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Two-Level Clustering towards Unsupervised Discovery of Acoustic Classes.

Ciro Gracia Pons

,

,

Proceedings of the 12th International Conference on Machine Learning and Applications, 2013

A Riemannian Stopping Criterion for Unsupervised Phonetic Segmentation.

Ciro Gracia Pons

,

,

Proceedings of the 12th International Conference on Machine Learning and Applications, 2013

Memory efficient subsequence DTW for Query-by-Example Spoken Term Detection.

,

Miquel Ferrarons

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

The spoken web search task at MediaEval 2012.

,

,

Etienne Barnard

,

Marelie H. Davel

,

Guillaume Gravier

Proceedings of the IEEE International Conference on Acoustics, 2013

Speed improvements to Information Retrieval-based dynamic time warping using hierarchical K-Means clustering.

Gautam Varma Mantena

,

Proceedings of the IEEE International Conference on Acoustics, 2013

Perceptually inspired features for speaker likability classification.

,

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Speaker Diarization: A Review of Recent Research.

Xavier Anguera Miró

,

,

Nicholas W. D. Evans

,

Corinne Fredouille

,

Gerald Friedland

,

IEEE Trans. Speech Audio Process., 2012

The ICSI RT-09 Speaker Diarization System.

Gerald Friedland

,

,

,

Xavier Anguera Miró

,

Luke R. Gottlieb

,

Marijn Huijbregts

,

,

IEEE Trans. Speech Audio Process., 2012

The Spoken Web Search Task.

,

Etienne Barnard

,

Marelie H. Davel

,

Charl Johannes van Heerden

,

,

Guillaume Gravier

,

Nitendra Rajput

Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

Telefonica Research System for the Spoken Web Search task at Mediaeval 2012.

Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

MASK: Robust Local Features for Audio Fingerprinting.

,

,

Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Expert Talk for Time Machine Session: Dynamic Time Warping New Youth.

Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

The Spoken Web Search Task at MediaEval 2011.

,

Nitendra Rajput

,

,

Marelie H. Davel

,

Guillaume Gravier

,

Charl Johannes van Heerden

,

Gautam Varma Mantena

,

Armando Muscariello

,

Kishore Prahallad

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Speaker independent discriminant feature extraction for acoustic pattern-matching.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Combining Features at Search Time: PRISMA at Video Copy Detection Task.

Juan Manuel Barrios

,

Benjamin Bustos

,

Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Telefonica Research at TRECVID 2011 Content-Based Copy Detection.

,

,

,

Juan Manuel Barrios

Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Multimodal fusion for video copy detection.

,

Juan Manuel Barrios

,

,

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Telefonica System for the Spoken Web Search Task at Mediaeval 2011.

Working Notes Proceedings of the MediaEval 2011 Workshop, 2011

Speaker Modeling Using Local Binary Decisions.

Jean-François Bonastre

,

Xavier Anguera Miró

,

Gabriel Hernández Sierra

,

Pierre-Michel Bousquet

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Real-time synchronisation of multimedia streams in a mobile device.

,

Joachim Neumann

,

,

,

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Automatic synchronization of electronic and audio books via TTS alignment and silence filtering.

,

,

,

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Closed-form expressions vs. BIC: A comparison for speaker clustering.

Themos Stafylakis

,

Xavier Anguera Miró

,

Vassilis Katsouros

,

George Carayannis

Proceedings of the IEEE International Conference on Acoustics, 2011

Fast speaker diarization based on binary keys.

Xavier Anguera Miró

,

Jean-François Bonastre

Proceedings of the IEEE International Conference on Acoustics, 2011

Discriminant binary data representation for speaker recognition.

Jean-François Bonastre

,

Pierre-Michel Bousquet

,

,

Xavier Anguera Miró

Proceedings of the IEEE International Conference on Acoustics, 2011

Spoken WordCloud: Clustering recurrent patterns in speech.

,

,

Proceedings of the 9th International Workshop on Content-Based Multimedia Indexing, 2011

2010

Telefonica Research at TRECVID 2010 Content-Based Copy Detection.

Ehsan Younessian

,

,

,

Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

Improvements to the equal-parameter BIC for speaker diarization.

Themos Stafylakis

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

System output combination for improved speaker diarization.

,

Nicholas W. D. Evans

,

,

,

Gerald Friedland

,

Corinne Fredouille

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A novel speaker binary key derived from anchor models.

,

Jean-François Bonastre

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Enriching music mood annotation by semantic association reasoning.

,

,

,

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

MuViSync: Realtime music video alignment.

,

,

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Partial sequence matching using an Unbounded Dynamic Time Warping algorithm.

,

,

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Telefonica Research Content-Based Copy Detection TRECVID Submission.

,

,

,

,

Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

The role of tags and image aesthetics in social image search.

,

,

Rodrigo de Oliveira

,

Proceedings of the first SIGMM workshop on Social media, 2009

Multimodal video copy detection applied to social media.

,

,

Proceedings of the first SIGMM workshop on Social media, 2009

Text versus speech: a comparison of tagging input modalities for camera phones.

Mauro Cherubini

,

,

,

Rodrigo de Oliveira

Proceedings of the 11th Conference on Human-Computer Interaction with Mobile Devices and Services, 2009

Minivectors: an improved GMM-SVM approach for speaker verification.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Audio-based automatic management of TV commercials.

,

,

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Multimodal photo annotation and retrieval on a mobile phone.

,

,

Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

MAMI: multimodal annotations on a camera phone.

,

Proceedings of the 10th Conference on Human-Computer Interaction with Mobile Devices and Services, 2008

TV Advertisements Detection and Clustering Based on Acoustic Information.

,

Proceedings of the 2008 International Conferences on Computational Intelligence for Modelling, 2008

2007

Speaker Diarization For Multiple-Distant-Microphone Meetings Using Several Sources of Information.

,

,

IEEE Trans. Computers, 2007

Acoustic Beamforming for Speaker Diarization of Meetings.

,

,

Javier Hernando

IEEE Trans. Speech Audio Process., 2007

Automatic Weighting for the Combination of TDOA and Acoustic Features in Speaker Diarization for Meetings.

Xavier Anguera Miró

,

,

José Manuel Pardo

,

Javier Hernando

Proceedings of the IEEE International Conference on Acoustics, 2007

Model Complexity Selection and Cross-Validation EM Training for Robust Speaker Diarization.

Xavier Anguera Miró

,

Takahiro Shinozaki

,

,

Javier Hernando

Proceedings of the IEEE International Conference on Acoustics, 2007

The SRI-ICSI Spring 2007 Meeting and Lecture Recognition System.

Andreas Stolcke

,

,

,

,

,

Mathew Magimai-Doss

,

,

Proceedings of the Multimodal Technologies for Perception of Humans, 2007

Speaker Diarization for Conference Room: The UPC RT07s Evaluation System.

,

,

,

Javier Hernando

Proceedings of the Multimodal Technologies for Perception of Humans, 2007

2006

Robust speaker diarization for meetings.

Xavier Anguera Miró

PhD thesis, 2006

Hybrid Speech/non-speech detector applied to Speaker Diarization of Meetings.

,

,

,

,

Javier Hernando

Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006

Speaker Diarization for Multi-microphone Meetings Using Only Between-Channel Differences.

,

,

Proceedings of the Machine Learning for Multimodal Interaction, 2006

The ICSI-SRI Spring 2006 Meeting Recognition System.

,

Andreas Stolcke

,

,

,

,

,

Proceedings of the Machine Learning for Multimodal Interaction, 2006

Robust Speaker Diarization for Meetings: ICSI RT06S Meetings Evaluation System.

,

,

Proceedings of the Machine Learning for Multimodal Interaction, 2006

Automatic Cluster Complexity and Quantity Selection: Towards Robust Speaker Diarization.

,

,

Javier Hernando

Proceedings of the Machine Learning for Multimodal Interaction, 2006

Speaker diarization for multiple distant microphone meetings: mixing acoustic features and inter-channel time differences.

,

,

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Multi-stream speaker diarization systems for the meetings domain.

Ascensión Gallardo-Antolín

,

,

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Robust speaker diarization for meetings: ICSI RT06s evaluation system.

,

,

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Friends and enemies: a novel initialization for speaker diarization.

,

,

Javier Hernando

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Purity Algorithms for Speaker Diarization of Meetings Data.

,

,

Javier Hernando

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System.

Andreas Stolcke

,

,

,

,

Frantisek Grézl

,

,

,

,

,

Proceedings of the Machine Learning for Multimodal Interaction, 2005

Robust Speaker Segmentation for Meetings: The ICSI-SRI Spring 2005 Diarization System.

,

,

,

Proceedings of the Machine Learning for Multimodal Interaction, 2005

2004

Evolutive speaker segmentation using a repository system.

Xavier Anguera Miró

,

Javier Hernando Pericas

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

1998

A VQ based speaker recognition system based in histogram distances. text independent and for noisy environments.

,

,

Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1996

Text independent speaker identification on noisy environments by means of self organizing maps.

,

Javier Hernando Pericas

,

,

Proceedings of the 4th International Conference on Spoken Language Processing, 1996
