Yaya Shi
ORCID: 0000-0003-0465-6712
According to our database, Yaya Shi authored at least 19 papers between 2018 and 2025.
Bibliography
2025
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types.
CoRR, February, 2025
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
2024
ACM Trans. Multim. Comput. Commun. Appl., August, 2024
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
2023
ACM Trans. Multim. Comput. Commun. Appl., 2023
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks.
CoRR, 2023
CoRR, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the International Conference on Machine Learning, 2023
2022
A Simple and Strong Baseline for Universal Targeted Attacks on Siamese Visual Tracking.
IEEE Trans. Circuits Syst. Video Technol., 2022
EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
VATEX Captioning Challenge 2019: Multi-modal Information Fusion and Multi-stage Training Strategy for Video Captioning.
CoRR, 2019
2018
Permafrost Presence/Absence Mapping of the Qinghai-Tibet Plateau Based on Multi-Source Remote Sensing Data.
Remote. Sens., 2018