Jian Xue

Jinyu Li

CoRR, June, 2025

Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation.

CoRR, February, 2025

Multi-agent reinforcement learning with weak ties.

Inf. Fusion, 2025

Length Aware Speech Translation for Video Dubbing.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation.

Sreyan Ghosh

Mohammad Sadegh Rasooli

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Evaluating the Snow Cover Service Value on the Qinghai-Tibet Plateau.

Remote. Sens., July, 2024

Pathogenicity classification of missense mutations based on deep generative model.

Comput. Biol. Medicine, March, 2024

AU-aware Algorithm for 3D Facial Reconstruction.

Int. J. Softw. Informatics, 2024

MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection.

CoRR, 2024

Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages.

CoRR, 2024

Towards Unified Facial Action Unit Recognition Framework by Large Language Models.

CoRR, 2024

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

ETAU: Towards Emotional Talking Head Generation Via Facial Action Unit.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

LLMNDC: A Novel Approach for Network Device Configuration based on Fine-tuned Large Language Models.

Proceedings of the 5th International Conference on Computer Engineering and Intelligent Control, 2024

Diarist: Streaming Speech Translation with Speaker Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2024

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

LAMASSU: A Streaming Language-Agnostic Multilingual Speech Recognition and Translation Model Using Neural Transducers.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Target HRRP Length Estimation Based on Significant Energy Aggregation.

Meiyan Pan

Xingyu Cai

Proceedings of the IEEE International Conference on Signal Processing, 2023

Fast and Accurate Factorized Neural Transducer for Text Adaption of End-to-End Speech Recognition Models.

Proceedings of the IEEE International Conference on Acoustics, 2023

A Weakly-Supervised Streaming Multilingual Speech Model with Truly Zero-Shot Capability.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Building High-Accuracy Multilingual ASR With Gated Language Experts and Curriculum Training.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition.

CoRR, 2022

LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers.

CoRR, 2022

Streaming, Fast and Accurate on-Device Inverse Text Normalization for Automatic Speech Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

A Local-Global Metric Learning Method for Facial Expression Animation.

Proceedings of the 5th IEEE International Conference on Multimedia Information Processing and Retrieval, 2022

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Insulator Surface Breakage Recognition Based on Multiscale Residual Neural Network.

IEEE Trans. Instrum. Meas., 2021

The Influence of Substituting Prices, Product Returns, and Service Quality on Repurchase Intention.

Complex., 2021

Improving Multilingual Transformer Transducer Models by Reducing Language Confusions.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

On Addressing Practical Challenges for RNN-Transducer.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2019

A High-Gain Approach to Event-Triggered Control With Applications to Motor Systems.

IEEE Trans. Ind. Electron., 2019

Modelling of gene signal attribute reduction based on neighbourhood granulation and rough approximation.

Int. J. Model. Identif. Control., 2019

A Markerless Body Motion Capture System for Character Animation Based on Multi-view Cameras.

Proceedings of the IEEE International Conference on Acoustics, 2019

2017

The Opensesame NIST 2016 Speaker Recognition Evaluation System.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

On stability of sampled-data extended state observer for networked systems.

Proceedings of the 11th Asian Control Conference, 2017

2015

Volume visualization for out-of-core 3D images based on semi-adaptive partitioning.

Ke Lii

Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Harmonizing model with transfer tax on water pollution across regional boundaries in a China's lake basin.

Eur. J. Oper. Res., 2013

The IBM speech-to-speech translation system for smartphone: Improvements for resource-constrained tasks.

Comput. Speech Lang., 2013

Restructuring of deep neural network acoustic models with singular value decomposition.

Jinyu Li

Yifan Gong

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Investigations on hessian-free optimization for cross-entropy training of deep neural networks.

Simon Wiesler

Jinyu Li

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages.

IEEE Trans. Speech Audio Process., 2012

Model of transfer tax on transboundary water pollution in China's river basin.

Oper. Res. Lett., 2012

Research based on improved fuzzy immune PID algorithm optimized copper electrolysis rectifier system.

He Zhu

Proceedings of the 2nd IEEE International Conference on Cloud Computing and Intelligence Systems, 2012

2011

Towards High Performance LVCSR in Speech-to-Speech Translation System on Smart Phones.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Acoustic Modeling with Bootstrap and Restructuring Based on Full Covariance.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Clustering of bootstrapped acoustic model with full covariance.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Acoustic modeling with bootstrap and restructuring for low-resourced languages.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Role assignment for Soccer Robot using fuzzy inference system.

Proceedings of the Seventh International Conference on Fuzzy Systems and Knowledge Discovery, 2010

2009

A study of bootstrapping with multiple acoustic features for improved automatic speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Semi-tied covariance matrices for acoustic models based on random forests of phonetic decision trees.

L. Che

Proceedings of the IEEE International Conference on Acoustics, 2009

Improving online incremental speaker adaptation with eigen feature space MLLR.

Xiaodong Cui

Bowen Zhou

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

Random Forests of Phonetic Decision Trees for Acoustic Modeling in Conversational Speech Recognition.

IEEE Trans. Speech Audio Process., 2008

High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Random-forests-based phonetic decision trees for conversational speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Bioinformatics and its Applications in Agriculture.

Proceedings of the Computer And Computing Technologies In Agriculture, 2007

Novel Lookahead Decision Tree State Tying for Acoustic Modeling.

Yuxin Zhao

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

New improvements in decoding speed and latency for automatic captioning.

Rusheng Hu

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

An Automatic Captioning System for Telemedicine.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Random Forests-Based Confidence Annotation Using Novel Features from Confusion Network.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applications.

Rusheng Hu

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Improved Confusion Network Algorithm and Shortest Path Search from Word Lattice.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

1997

A novel approach to the optimal biorthogonal analysis window sequence of the discrete Gabor expansion.
