Tianyi Bai
Orcid: 0009-0009-5057-7100
According to our database, Tianyi Bai authored at least 27 papers between 2022 and 2026.
Bibliography
2026
CoRR, April, 2026
CoRR, February, 2026
Synthesizing Multimodal Geometry Datasets from Scratch and Enabling Visual Alignment via Plotting Code.
CoRR, February, 2026
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks.
CoRR, February, 2026
From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning.
CoRR, January, 2026
2025
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI.
CoRR, December, 2025
CoRR, December, 2025
CoRR, November, 2025
CoRR, November, 2025
VADE: Variance-Aware Dynamic Sampling via Online Sample-Level Difficulty Estimation for Multimodal RL.
CoRR, November, 2025
CoRR, October, 2025
CoRR, June, 2025
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning.
CoRR, June, 2025
TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network.
CoRR, June, 2025
CoRR, February, 2025
CoRR, January, 2025
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.
CoRR, 2024
CoRR, 2024
2023
2022
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022