Saining Xie
According to our database, Saining Xie authored at least 77 papers between 2012 and 2025.
Bibliography
2025
CoRR, June, 2025
CoRR, June, 2025
Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs.
CoRR, May, 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis.
CoRR, May, 2025
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset.
CoRR, May, 2025
CoRR, April, 2025
PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop.
CoRR, March, 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training.
CoRR, January, 2025
CoRR, January, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
2024
CoRR, 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning.
CoRR, 2024
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
What Does a Visual Formal Analysis of the World's 500 Most Famous Paintings Tell Us About Multimodal LLMs?
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
SiT: Exploring Flow and Diffusion-Based Generative Models with Scalable Interpolant Transformers.
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Sample-Efficient Neural Architecture Search by Learning Actions for Monte Carlo Tree Search.
IEEE Trans. Pattern Anal. Mach. Intell., 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision.
CoRR, 2021
On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
Proceedings of the 37th International Conference on Machine Learning, 2020
Proceedings of the 8th International Conference on Learning Representations, 2020
Proceedings of the Computer Vision - ECCV 2020, 2020
Proceedings of the Computer Vision - ECCV 2020, 2020
FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification.
Proceedings of the Computer Vision - ECCV 2018, 2018
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
2017
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
2016
Proceedings of the Computer Vision - ECCV 2016, 2016
2015
Hyper-class augmented and regularized deep learning for fine-grained image classification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015
2014
Neural Networks, 2014
Semi-supervised non-negative matrix factorization for image clustering with graph Laplacian.
Multim. Tools Appl., 2014
2013
Proceedings of the British Machine Vision Conference, 2013
2012
Proceedings of the 21st International Conference on Pattern Recognition, 2012