Jin Xu

Orcid: 0000-0002-1409-6731

Affiliations:
  • Tsinghua University, Beijing, China


According to our database, Jin Xu authored at least 36 papers between 2004 and 2026.

Bibliography

2026
Semantic-Aware Interruption Detection in Spoken Dialogue Systems: Benchmark, Metric, and Model.
CoRR, March, 2026

The Interspeech 2026 Audio Reasoning Challenge: Evaluating Reasoning Process Quality for Audio Reasoning Models and Agents.
CoRR, February, 2026

LLM-ForcedAligner: A Non-Autoregressive and Accurate LLM-Based Forced Aligner for Multilingual and Long-Form Speech.
CoRR, January, 2026

2025
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception.
CoRR, October, 2025

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs.
CoRR, October, 2025

ContextASR-Bench: A Massive Contextual Speech Recognition Benchmark.
CoRR, July, 2025

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators.
CoRR, May, 2025

WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models.
CoRR, February, 2025

Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Analyzing and Mitigating Inconsistency in Discrete Speech Tokens for Neural Codec Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
WavChat: A Survey of Spoken Dialogue Models.
CoRR, 2024

Analyzing and Mitigating Inconsistency in Discrete Audio Tokens for Neural Codec Language Models.
CoRR, 2024

Qwen2-Audio Technical Report.
CoRR, 2024

Qwen2 Technical Report.
CoRR, 2024

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models.
CoRR, 2023

LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT.
CoRR, 2023

Qwen Technical Report.
CoRR, 2023

AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

2022
Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Analyzing and Mitigating Interference in Neural Architecture Search.
Proceedings of the International Conference on Machine Learning, 2022

AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Residual Learning of Neural Text Generation with n-gram Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition.
CoRR, 2021

Graph Symbiosis Learning.
CoRR, 2021

Improving Long-Tailed Classification from Instance Level.
CoRR, 2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Speech-T: Transducer for Text to Speech and Beyond.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

MixSpeech: Data Augmentation for Low-Resource Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

FastCorrect 2: Fast Error Correction on Multiple Candidates for Automatic Speech Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2020
LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

MultiSpeech: Multi-Speaker Text to Speech with Transformer.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Adaptive Master-Slave Regularized Model for Unexpected Revenue Prediction Enhanced with Alternative Data.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

2019
A Collaborative Learning Framework to Tag Refinement for Points of Interest.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

2004
NLPR at TREC 2004: Robust Experiments.
Proceedings of the Thirteenth Text REtrieval Conference, 2004

