Soravit Changpinyo

ORCID: 0000-0002-4013-1190

According to our database, Soravit Changpinyo authored at least 33 papers between 2013 and 2023.

Bibliography

2023
PaLI-X: On Scaling up a Multilingual Vision and Language Model.
CoRR, 2023

What You See is What You Read? Improving Text-Image Alignment Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PaLI: A Jointly-Scaled Multilingual Language-Image Model.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

PreSTU: Pre-Training for Scene-Text Understanding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

MaXM: Towards Multilingual Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Connecting Vision and Language with Video Localized Narratives.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MetaCLUE: Towards Comprehensive Visual Metaphors Research.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
2.5D visual relationship detection.
Comput. Vis. Image Underst., 2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model.
CoRR, 2022

Towards Multi-Lingual Visual Question Answering.
CoRR, 2022

All You May Need for VQA are Image Captions.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks.
Proceedings of the Computer Vision - ECCV 2022, 2022

Denoising Large-Scale Image Captioning from Alt-text Data Using Content Selection Models.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.
CoRR, 2021

Telling the What while Pointing the Where: Fine-grained Mouse Trace and Language Supervision for Improved Image Retrieval.
CoRR, 2021

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Robust Visual Reasoning via Language Guided Neural Module Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Telling the What while Pointing to the Where: Multimodal Queries for Image Retrieval.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
A Game-theoretic Approach to Data Interaction.
ACM Trans. Database Syst., 2020

Classifier and Exemplar Synthesis for Zero-Shot Learning.
Int. J. Comput. Vis., 2020

Weakly Supervised Content Selection for Improved Image Captioning.
CoRR, 2020

Connecting Vision and Language with Localized Narratives.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Multi-Task Learning for Sequence Tagging: An Empirical Study.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

2017
The Power of Sparsity in Convolutional Neural Networks.
CoRR, 2017

Predicting Visual Exemplars of Unseen Classes for Zero-Shot Learning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild.
Proceedings of the Computer Vision - ECCV 2016, 2016

Synthesized Classifiers for Zero-Shot Learning.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2013
Similarity Component Analysis.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

