Chenghao Xiao

ORCID: 0000-0001-7623-8232

According to our database, Chenghao Xiao authored at least 26 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning.
CoRR, July, 2025

Adversarial Defence without Adversarial Defence: Enhancing Language Model Robustness via Instance-level Principal Component Removal.
CoRR, July, 2025

ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning.
CoRR, June, 2025

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning.
CoRR, June, 2025

Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts.
CoRR, April, 2025

MIEB: Massive Image Embedding Benchmark.
CoRR, April, 2025

MMTEB: Massive Multilingual Text Embedding Benchmark.
CoRR, February, 2025

CAST: Corpus-Aware Self-similarity Enhanced Topic modelling.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Everything is a Video: Unifying Modalities through Next-Frame Prediction.
CoRR, 2024

BioMNER: A Dataset for Biomedical Method Entity Recognition.
CoRR, 2024

SimsChat: A Customisable Persona-Driven Role-Playing Agent.
CoRR, 2024

The Power of Next-Frame Prediction for Learning Physical Laws.
CoRR, 2024

RAR-b: Reasoning as Retrieval Benchmark.
CoRR, 2024

Pixel Sentence Representation Learning.
CoRR, 2024

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

On the Rigour of Scientific Writing: Criteria, Analysis, and Insights.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Effective Distillation of Table-based Reasoning Ability from LLMs.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Audio Contrastive based Fine-tuning.
CoRR, 2023

Can Text Encoders be Deceived by Length Attack?
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Length is a Curse and a Blessing for Document-level Semantics.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

On Isotropy, Contextualization and Learning Dynamics of Contrastive-based Sentence Representation Learning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
On Isotropy and Learning Dynamics of Contrastive-based Sentence Representation Learning.
CoRR, 2022

Fine-grained Main Ideas Extraction and Clustering of Online Course Reviews.
Proceedings of the Artificial Intelligence in Education - 23rd International Conference, 2022

SimStu-Transformer: A Transformer-Based Approach to Simulating Student Behaviour.
Proceedings of the Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners' and Doctoral Consortium, 2022

