Shimin Tao

ORCID: 0000-0002-2795-6921

According to our database, Shimin Tao authored at least 121 papers between 2015 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Unlocking Fine-Grained Translation Quality Estimation in LRMs through Synergistically Evolving Implicit and Explicit Reasoning.

CoRR, May, 2026

Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection.

CoRR, May, 2026

C-Mining: Unsupervised Discovery of Seeds for Cultural Data Synthesis via Geometric Misalignment.

CoRR, April, 2026

Cross-Preference Learning for Sentence-Level and Context-Aware Machine Translation.

CoRR, March, 2026

Chart Specification: Structural Representations for Incentivizing VLM Reasoning in Chart-to-Code Generation.

CoRR, February, 2026

The GaoYao Benchmark: A Comprehensive Framework for Evaluating Multilingual and Multicultural Abilities of Large Language Models.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

DeReA: Improving Idiom Translation with Detect-Retrieve-Arbitrate Reasoning.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

The "Knowledge-Behavior Gap" in Cultural Taboo Safety of Large Language Models.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

ELSPR: Evaluator LLM Training Data Self-Purification on Non-Transitive Preferences via Tournament Graph Reconstruction.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

MIDB: Multilingual Instruction Data Booster for Enhancing Cultural Equality in Multilingual Instruction Synthesis.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Measuring the Unmeasurable: Unveiling Latent Cognitive Capabilities of LLM.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

LogEval: A comprehensive benchmark suite for LLMs in log analysis.

Empir. Softw. Eng., December, 2025

Do Large Language Models Truly Understand Cross-cultural Differences?

CoRR, December, 2025

R-Log: Incentivizing Log Analysis Capability in LLMs via Reasoning-based Reinforcement Learning.

CoRR, September, 2025

CultureScope: A Dimensional Lens for Probing Cultural Understanding in LLMs.

CoRR, September, 2025

A method for improving multilingual quality and diversity of instruction fine-tuning datasets.

CoRR, September, 2025

RationAnomaly: Log Anomaly Detection with Rationality via Chain-of-Thought and Reinforcement Learning.

CoRR, September, 2025

ELSPR: Evaluator LLM Training Data Self-Purification on Non-Transitive Preferences via Tournament Graph Reconstruction.

CoRR, May, 2025

MIDB: Multilingual Instruction Data Booster for Enhancing Multilingual Instruction Synthesis.

CoRR, May, 2025

Improving LLM-based Document-level Machine Translation with Multi-Knowledge Fusion.

CoRR, March, 2025

R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning.

CoRR, February, 2025

Improving Matching Models With Contextual Attention for Multi-Turn Response Selection in Retrieval-Based Chatbots.

IEEE Trans. Netw. Sci. Eng., 2025

Degradation-Aware Prompted Transformer for Unified Medical Image Restoration.

IEEE Trans. Image Process., 2025

Improving LLM-Based Document-Level MT with Multi-Knowledge Fusion.

Proceedings of the Natural Language Processing and Chinese Computing, 2025

Rethinking Diffusion Bridge Model with Dual Alignments for Medical Image Synthesis.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

SuperFC: Selective Data Utilization for a Sustainable and Effective Function-Calling Agent.

Proceedings of the International Joint Conference on Neural Networks, 2025

LogLM: From Task-Based to Instruction-Based Automated Log Analysis.

Proceedings of the 47th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2025

Taming Text-to-Image Synthesis for Novices: User-centric Prompt Generation via Multi-turn Guidance.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Adapting Large Language Models to Log Analysis with Interpretable Domain Knowledge.

Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025

Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

SRDC: Semantics-based Ransomware Detection and Classification with LLM-assisted Pre-training.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Multi-Source Log Parsing With Pre-Trained Domain Classifier.

IEEE Trans. Netw. Serv. Manag., June, 2024

M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models.

CoRR, 2024

Adapting Large Language Models to Log Analysis with Interpretable Domain Knowledge.

CoRR, 2024

What Do You Want? User-centric Prompt Generation for Text-to-image Synthesis via Multi-turn Guidance.

CoRR, 2024

LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis.

CoRR, 2024

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation.

CoRR, 2024

Interpretable Online Log Analysis Using Large Language Models with Prompt Strategies.

Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, 2024

A Multitask Training Approach to Enhance Whisper with Open-Vocabulary Keyword Spotting.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Using Large Language Model for End-to-End Chinese ASR and NER.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

From Handcrafted Features to LLMs: A Brief Survey for Machine Translation Quality Estimation.

Proceedings of the International Joint Conference on Neural Networks, 2024

LogPrompt: Prompt Engineering Towards Zero-Shot and Interpretable Log Analysis.

Proceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, 2024

CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning.

Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment.

Proceedings of the 26th International Conference on Advanced Communications Technology, 2024

DeMPT: Decoding-enhanced Multi-phase Prompt Tuning for Making LLMs Be Better Context-aware Translators.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Evaluation Dataset for Lexical Translation Consistency in Chinese-to-English Document-level Translation.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Translate Meanings, Not Just Words: IdiomKB's Role in Optimizing Idiomatic Translation with Language Models.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

LogSummary: Unstructured Log Summarization for Software Systems.

IEEE Trans. Netw. Serv. Manag., September, 2023

Exploiting Spatial-Temporal Behavior Patterns for Fraud Detection in Telecom Networks.

IEEE Trans. Dependable Secur. Comput., 2023

P-Transformer: Towards Better Document-to-Document Neural Machine Translation.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Collective Human Opinions in Semantic Textual Similarity.

Trans. Assoc. Comput. Linguistics, 2023

Automatic Instruction Optimization for Open-source LLM Instruction Tuning.

CoRR, 2023

NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task.

CoRR, 2023

LogPrompt: Prompt Engineering Towards Zero-Shot and Interpretable Log Analysis.

CoRR, 2023

Implicit Cross-Lingual Word Embedding Alignment for Reference-Free Machine Translation Evaluation.

IEEE Access, 2023

Weakly Supervised Entity Alignment with Positional Inspiration.

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023

HW-TSC's Participation in the WMT 2023 Automatic Post Editing Shared Task.

Proceedings of the Eighth Conference on Machine Translation, 2023

Empowering a Metric with LLM-assisted Named Entity Annotation: HW-TSC's Submission to the WMT23 Metrics Shared Task.

Proceedings of the Eighth Conference on Machine Translation, 2023

Unify Word-level and Span-level Tasks: NJUNLP's Participation for the WMT2023 Quality Estimation Shared Task.

Proceedings of the Eighth Conference on Machine Translation, 2023

Multi-order Matched Neighborhood Consistent Graph Alignment in a Union Vector Space.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

LogDAPT: Log Data Anomaly Detection with Domain-Adaptive Pretraining (industry track).

Proceedings of the 24th International Middleware Conference Industrial Track, 2023

The HW-TSC's Speech-to-Speech Translation System for IWSLT 2023.

Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Biglog: Unsupervised Large-scale Pre-training for a Unified Log Representation.

Proceedings of the 31st IEEE/ACM International Symposium on Quality of Service, 2023

WhiSLU: End-to-End Spoken Language Understanding with Whisper.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

CONFPILOT: A Pilot for Faster Configuration by Learning from Device Manuals.

Proceedings of the 43rd IEEE International Conference on Distributed Computing Systems, 2023

Zephyr: Zero-Shot Punctuation Restoration.

Proceedings of the IEEE International Conference on Acoustics, 2023

UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction.

Proceedings of the IEEE International Conference on Acoustics, 2023

TeacherSim: Cross-lingual Machine Translation Evaluation with Monolingual Embedding as Teacher.

Proceedings of the 25th International Conference on Advanced Communication Technology, 2023

Chinese ASR and NER Improvement Based on Whisper Fine-Tuning.

Proceedings of the 25th International Conference on Advanced Communication Technology, 2023

SmartSpanNER: Making SpanNER Robust in Low Resource Scenarios.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Improved Pseudo Data for Machine Translation Quality Estimation with Constrained Beam Search.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

DA-Parser: A Pre-trained Domain-aware Parsing Framework for Heterogeneous Log Analysis.

Proceedings of the 47th IEEE Annual Computers, Software, and Applications Conference, 2023

Knowledge Prompt for Whisper: An ASR Entity Correction Approach with Knowledge Base.

Proceedings of the IEEE International Conference on Big Data, 2023

Incorporating Pinyin into Pipeline Named Entity Recognition from Chinese Speech.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Lexical Translation Inconsistency-Aware Document-Level Translation Repair.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Denoising Pre-training for Machine Translation Quality Estimation with Curriculum Learning.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

LogStamp: Automatic Online Log Parsing Based on Sequence Labelling.

SIGMETRICS Perform. Evaluation Rev., 2022

CrossQE: HW-TSC 2022 Submission for the Quality Estimation Shared Task.

Proceedings of the Seventh Conference on Machine Translation, 2022

HW-TSC's Submission for the WMT22 Efficiency Task.

Proceedings of the Seventh Conference on Machine Translation, 2022

Partial Could Be Better than Whole. HW-TSC 2022 Submission for the Metrics Shared Task.

Proceedings of the Seventh Conference on Machine Translation, 2022

NJUNLP's Participation for the WMT2022 Quality Estimation Shared Task.

Proceedings of the Seventh Conference on Machine Translation, 2022

Exploring Robustness of Machine Translation Metrics: A Study of Twenty-Two Automatic Metrics in the WMT22 Metric Task.

Proceedings of the Seventh Conference on Machine Translation, 2022

HW-TSC at SemEval-2022 Task 7: Ensemble Model Based on Pretrained Models for Identifying Plausible Clarifications.

Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

WRS: Workflow Retrieval System for Cloud Automatic Remediation.

Hongyi Huang, Wenfei Wu, Shimin Tao

Proceedings of the 2022 IEEE/IFIP Network Operations and Management Symposium, 2022

Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

CCDC: A Chinese-Centric Cross Domain Contrastive Learning Framework.

Proceedings of the Knowledge Science, Engineering and Management, 2022

The HW-TSC's Offline Speech Translation System for IWSLT 2022 Evaluation.

Proceedings of the 19th International Conference on Spoken Language Translation, 2022

The HW-TSC's Simultaneous Speech Translation System for IWSLT 2022 Evaluation.

Proceedings of the 19th International Conference on Spoken Language Translation, 2022

The HW-TSC's Speech to Speech Translation System for IWSLT 2022 Evaluation.

Proceedings of the 19th International Conference on Spoken Language Translation, 2022

Part Represents Whole: Improving the Evaluation of Machine Translation System Using Entropy Enhanced Metrics.

Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

Modeling Consistency Preference via Lexical Chains for Document-level Neural Machine Translation.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Diformer: Directional Transformer for Neural Machine Translation.

Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, 2022

Target-Side Language Model for Reference-Free Machine Translation Evaluation.

Proceedings of the Machine Translation - 18th China Conference, 2022

PEACook: Post-editing Advancement Cookbook.

Proceedings of the Machine Translation - 18th China Conference, 2022

CCMT 2022 Translation Quality Estimation Task.

Proceedings of the Machine Translation - 18th China Conference, 2022

Incorporating Multilingual Knowledge Distillation into Machine Translation Evaluation.

Proceedings of the Knowledge Graph and Semantic Computing: Knowledge Graph Empowers the Digital Economy, 2022

EntityRank: Unsupervised Mining of Bilingual Named Entity Pairs from Parallel Corpora for Neural Machine Translation.

Proceedings of the IEEE International Conference on Big Data, 2022

HwTscSU's Submissions on WAT 2022 Shared Task.

Proceedings of the 9th Workshop on Asian Translation, 2022

Capture Human Disagreement Distributions by Calibrated Networks for Natural Language Inference.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

Deep graph alignment network.

Neurocomputing, 2021

Joint-training on Symbiosis Networks for Deep Nueral Machine Translation models.

CoRR, 2021

Self-Distillation Mixup Training for Non-autoregressive Neural Machine Translation.

CoRR, 2021

UniLog: Deploy One Model and Specialize it for All Log Analysis Tasks.

CoRR, 2021

The HW-TSC's Offline Speech Translation Systems for IWSLT 2021 Evaluation.

CoRR, 2021

HW-TSC's Participation in the WMT 2021 Efficiency Shared Task.

Proceedings of the Sixth Conference on Machine Translation, 2021

HW-TSC's Participation at WMT 2021 Quality Estimation Shared Task.

Proceedings of the Sixth Conference on Machine Translation, 2021

Make the Blind Translator See The World: A Novel Transfer Learning Solution for Multimodal Machine Translation.

Proceedings of the 18th Biennial Machine Translation Summit - Volume 1: Research Track, 2021

HI-CMLM: Improve CMLM with Hybrid Decoder Input.

Proceedings of the 14th International Conference on Natural Language Generation, 2021

Prefix-Graph: A Versatile Log Parsing Approach Merging Prefix Tree with Probabilistic Graph.

Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Incorporating Complete Syntactical Knowledge for Spoken Language Understanding.

Proceedings of the Knowledge Graph and Semantic Computing: Knowledge Graph Empowers New Infrastructure Construction, 2021

How Length Prediction Influence the Performance of Non-Autoregressive Translation?

Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2021

2020

Summarizing Unstructured Logs in Online Services.

CoRR, 2020

HW-TSC's Participation at WMT 2020 Automatic Post Editing Shared Task.

Proceedings of the Fifth Conference on Machine Translation, 2020

HW-TSC's Participation at WMT 2020 Quality Estimation Shared Task.

Proceedings of the Fifth Conference on Machine Translation, 2020

LogParse: Making Log Parsing Adaptive through Word Classification.

Proceedings of the 29th International Conference on Computer Communications and Networks, 2020

2019

LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018

FUNNEL: Assessing Software Changes in Web-Based Services.

IEEE Trans. Serv. Comput., 2018

2017

Segmentation of Time Series Based on Kinetic Characteristics for Storage Consumption Prediction.

Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

2015

Rapid and robust impact assessment of software changes in large internet-based services.

Proceedings of the 11th ACM Conference on Emerging Networking Experiments and Technologies, 2015
