Yangyang Shi

ORCID: 0000-0001-5297-4155

According to our database, Yangyang Shi authored at least 83 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Self-Calibration Method of Displacement Sensor in AMB-Rotor System Based on Magnetic Bearing Current Control.
IEEE Trans. Ind. Electron., May, 2024

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases.
CoRR, 2024

Not All Weights Are Created Equal: Enhancing Energy Efficiency in On-Device Streaming Speech Recognition.
CoRR, 2024

FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation.
CoRR, 2024

2023
Model Reference Adaptive Compensation and Robust Controller for Magnetic Bearing Systems With Strong Persistent Disturbances.
IEEE Trans. Ind. Electron., November, 2023

Characterizing the Survival-Associated Interactions Between Tumor-Infiltrating Lymphocytes and Tumors From Pathological Images and Multi-Omics Data.
IEEE Trans. Medical Imaging, October, 2023

On The Open Prompt Challenge In Conditional Audio Generation.
CoRR, 2023

In-Context Prompt Editing For Conditional Audio Generation.
CoRR, 2023

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023

FoleyGen: Visually-Guided Audio Generation.
CoRR, 2023

Stack-and-Delay: a new codebook pattern for music generation.
CoRR, 2023

Enhance audio generation controllability through representation similarity regularization.
CoRR, 2023

Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition.
CoRR, 2023

DISGO: Automatic End-to-End Evaluation for Scene Text OCR.
CoRR, 2023

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models.
CoRR, 2023

Multi-Head State Space Model for Speech Recognition.
CoRR, 2023

SCA: Streaming Cross-Attention Alignment For Echo Cancellation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving fast-slow Encoder based Transducer with Streaming Deliberation.
Proceedings of the IEEE International Conference on Acoustics, 2023

TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for Pytorch.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Towards Zero-Shot Multilingual Transfer for Code-Switched Responses.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Binary and Ternary Natural Language Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Revisiting Sample Size Determination in Natural Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Position Extraction of Ultralow-Speed Gimbal Servo System With Linear Hall Sensors.
IEEE Trans. Ind. Electron., 2022

Synergistic Digital Twin and Holographic Augmented-Reality-Guided Percutaneous Puncture of Respiratory Liver Tumor.
IEEE Trans. Hum. Mach. Syst., 2022

LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting.
CoRR, 2022

Biased Self-supervised learning for ASR.
CoRR, 2022

SCA: Streaming Cross-attention Alignment for Echo Cancellation.
CoRR, 2022

Learning a Dual-Mode Speech Recognition Model VIA Self-Pruning.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Streaming parallel transducer beam search with fast slow cascaded encoders.
Proceedings of the Interspeech 2022, 2022

Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution.
Proceedings of the IEEE International Conference on Acoustics, 2022

Gadgets Splicing: Dynamic Binary Transformation for Precise Rewriting.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

2021
TorchAudio: Building Blocks for Audio and Speech Processing.
CoRR, 2021

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study.
CoRR, 2021

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models.
CoRR, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.
CoRR, 2021

A multiple-relaxation-time collision model by Hermite expansion.
CoRR, 2021

Versatile multi-constrained planning for thermal ablation of large liver tumors.
Comput. Medical Imaging Graph., 2021

Streaming Attention-Based Models with Augmented Memory for End-To-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Internal Motion Estimation during Free-Breathing via External/Internal Correlation Model.
Proceedings of the IEEE International Conference on Real-time Computing and Robotics, 2021

Transformer-Based Acoustic Modeling for Streaming Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Collaborative Training of Acoustic Encoders for Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Transformer in Action: A Comparative Study of Transformer-Based Acoustic Models for Large Scale Speech Recognition Applications.
Proceedings of the IEEE International Conference on Acoustics, 2021

Emformer: Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

On Lattice-Free Boosted MMI Training of HMM and CTC-Based Full-Context ASR Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition.
CoRR, 2020

Incorporating Android Code Smells into Java Static Code Metrics for Security Risk Prediction of Android Applications.
Proceedings of the 20th IEEE International Conference on Software Quality, 2020

Functional code clone detection with syntax and semantics fusion learning.
Proceedings of the ISSTA '20: 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 2020

Streaming Transformer-Based Acoustic Models Using Self-Attention with Augmented Memory.
Proceedings of the Interspeech 2020, 2020

Weak-Attention Suppression for Transformer Based Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Mining Effective Negative Training Samples for Keyword Spotting.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Region Proposal Network Based Small-Footprint Keyword Spotting.
IEEE Signal Process. Lett., 2019

Knowledge Distillation for Recurrent Neural Network Language Modeling with Trust Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-end Speech Recognition Using a High Rank LSTM-CTC Based Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
A review of "linear programming computation" by Ping-Qi Pan.
Eur. J. Oper. Res., 2018

Robust Control for a Magnetically Suspended Control Moment Gyro with Strong Gyroscopic Effects.
Proceedings of the IECON 2018, 2018

2017
Ultra-lightweight Block Cipher Algorithm (PFP) Based on Feistel Structure.
计算机科学 (Computer Science), 2017

2016
Deep LSTM based Feature Mapping for Query Classification.
Proceedings of the NAACL HLT 2016, 2016

Recurrent Support Vector Machines For Slot Tagging In Spoken Language Understanding.
Proceedings of the NAACL HLT 2016, 2016

2015
Integrating meta-information into recurrent neural network language models.
Speech Commun., 2015

Recurrent neural network language model adaptation with curriculum learning.
Comput. Speech Lang., 2015

RNN-based labeled data generation for spoken language understanding.
Proceedings of the INTERSPEECH 2015, 2015

Contextual spoken language understanding using recurrent neural networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

A factorization network based method for multi-lingual domain classification.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Semi-supervised slot tagging in spoken language understanding using recurrent transductive support vector machines.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Spoken language understanding using long short-term memory neural networks.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Cluster based Chinese abbreviation modeling.
Proceedings of the INTERSPEECH 2014, 2014

2013
Classifying the socio-situational settings of transcripts of spoken discourses.
Speech Commun., 2013

K-Component Adaptive Recurrent Neural Network Language Models.
Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Recurrent neural networks for language understanding.
Proceedings of the INTERSPEECH 2013, 2013

Exploiting the succeeding words in recurrent neural network language models.
Proceedings of the INTERSPEECH 2013, 2013

Speed up of recurrent neural network language models with sentence independent subsampling stochastic gradient descent.
Proceedings of the INTERSPEECH 2013, 2013

K-component recurrent neural network language models using curriculum learning.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Adaptive Language Modeling with a Set of Domain Dependent Models.
Proceedings of the Text, Speech and Dialogue - 15th International Conference, 2012

TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization with one-vs-all classifiers.
Proceedings of the Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

MediaEval 2012 Tagging Task: Prediction based on One Best List and Confusion Networks.
Proceedings of the Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

Towards Recurrent Neural Networks Language Models with Linguistic and Contextual Features.
Proceedings of the INTERSPEECH 2012, 2012

Dynamic Bayesian socio-situational setting classification.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Combining Topic Specific Language Models.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

Socio-situational setting classification based on language use.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

