Xiangru Jian

Orcid: 0009-0004-7138-7078

According to our database, Xiangru Jian authored at least 26 papers between 2023 and 2026.

Bibliography

2026
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios.
CoRR, April, 2026

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents.
CoRR, March, 2026

Spatio-temporal traffic accidents detection via graph based generative adversarial network.
Eng. Appl. Artif. Intell., 2026

The Duality between Large Language Models and Data Management System.
IEEE Data Eng. Bull., 2026

2025
Grounding Computer Use Agents on Human Demonstrations.
CoRR, November, 2025

InteracSPARQL: An Interactive System for SPARQL Query Refinement Using Natural Language Explanations.
CoRR, November, 2025

Scope: Selective Cross-modal Orchestration of Visual Perception Experts.
CoRR, October, 2025

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers.
CoRR, May, 2025

LazyVLM: Neuro-Symbolic Approach to Video Analytics.
CoRR, May, 2025

GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks.
CoRR, April, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction.
CoRR, March, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding.
CoRR, February, 2025

Rethinking Spectral Augmentation for Contrast-based Graph Self-Supervised Learning.
Trans. Mach. Learn. Res., 2025

Graph convolutional network for traffic incidents duration classification.
Eng. Appl. Artif. Intell., 2025

The Underappreciated Power of Vision Models for Graph Structural Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

DREAM: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks.
CoRR, 2024

Do spectral cues matter in contrast-based graph self-supervised learning?
CoRR, 2024

HaVTR: Improving Video-Text Retrieval Through Augmentation Using Large Foundation Models.
CoRR, 2024

2023
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

InvGC: Robust Cross-Modal Retrieval by Inverse Graph Convolution.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Communication-Efficient Decentralized Online Continuous DR-Submodular Maximization.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

Roughness Index for Loss Landscapes of Neural Network Models of Partial Differential Equations.
Proceedings of the IEEE International Conference on Big Data, 2023

