Shanbo Cheng

ORCID: 0000-0002-6115-9483

According to our database, Shanbo Cheng authored at least 30 papers between 2016 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice.
CoRR, July, 2025

Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters.
CoRR, July, 2025

From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition.
CoRR, May, 2025

TRANS-ZERO: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions.
Trans. Assoc. Comput. Linguistics, 2024

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent.
CoRR, 2024

MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Speech Translation with Large Language Models: An Industrial Practice.
CoRR, 2023

Only 5% Attention Is All You Need: Efficient Long-range Document-level Neural Machine Translation.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Visual Information Matters for ASR Error Correction.
Proceedings of the IEEE International Conference on Acoustics, 2023

Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Controlling Styles in Neural Machine Translation with Activation Prompt.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation.
CoRR, 2022

Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation.
CoRR, 2022

Zero-shot Domain Adaptation for Neural Machine Translation with Retrieved Phrase-level Prompts.
CoRR, 2022

switch-GLAT: Multilingual Parallel Machine Translation Via Code-Switch Decoder.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Unified Multimodal Punctuation Restoration Framework for Mixed-Modality Corpus.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21.
Proceedings of the Sixth Conference on Machine Translation, 2021

Learning Kernel-Smoothed Machine Translation with Retrieved Examples.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Language Tags Matter for Zero-Shot Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
AR: Auto-Repair the Synthetic Data for Neural Machine Translation.
CoRR, 2020

Language-aware Interlingua for Multilingual Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Acquiring Knowledge from Pre-Trained Model to Neural Machine Translation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2018
Alibaba's Neural Machine Translation Systems for WMT18.
Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018

2017
Sogou Neural Machine Translation Systems for WMT17.
Proceedings of the Second Conference on Machine Translation, 2017

2016
PRIMT: A Pick-Revise Framework for Interactive Machine Translation.
Proceedings of the NAACL HLT 2016, 2016
