Xiangru Jian

ORCID: 0009-0004-7138-7078

According to our database, Xiangru Jian authored at least 26 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios.

CoRR, April, 2026

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents.

Co-authors include Kevin Qinghong Lin and Patrice Béchard.

CoRR, March, 2026

Spatio-temporal traffic accidents detection via graph based generative adversarial network.

Eng. Appl. Artif. Intell., 2026

The Duality between Large Language Models and Data Management System.

Co-authors include Kerem Akillioglu.

IEEE Data Eng. Bull., 2026

2025

Grounding Computer Use Agents on Human Demonstrations.

Co-authors include Kevin Qinghong Lin, Johan Obando-Ceron, Juan A. Rodríguez, Nicolas Chapados, Adriana Romero-Soriano, Reihaneh Rabbany, Perouz Taslakian, and Christopher Pal.

CoRR, November, 2025

InteracSPARQL: An Interactive System for SPARQL Query Refinement Using Natural Language Explanations.

CoRR, November, 2025

Scope: Selective Cross-modal Orchestration of Visual Perception Experts.

Co-authors include Juan A. Rodríguez and Perouz Taslakian.

CoRR, October, 2025

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers.

Co-authors include Kevin Qinghong Lin.

CoRR, May, 2025

LazyVLM: Neuro-Symbolic Approach to Video Analytics.

CoRR, May, 2025

GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks.

CoRR, April, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction.

Co-authors include Kevin Qinghong Lin, Juan A. Rodríguez, Nicolas Chapados, Aishwarya Agrawal, Christopher Pal, and Perouz Taslakian.

CoRR, March, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding.

CoRR, February, 2025

Rethinking Spectral Augmentation for Contrast-based Graph Self-Supervised Learning.

Trans. Mach. Learn. Res., 2025

Graph convolutional network for traffic incidents duration classification.

Eng. Appl. Artif. Intell., 2025

The Underappreciated Power of Vision Models for Graph Structural Understanding.

Co-authors include Xiaozhuang Song.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

DREAM: Improving Video-Text Retrieval Through Relevance-Based Augmentation Using Large Foundation Models.

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction.

Co-authors include Kevin Qinghong Lin, Juan A. Rodríguez, Nicolas Chapados, Aishwarya Agrawal, Christopher Pal, and Perouz Taslakian.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks.

CoRR, 2024

Do spectral cues matter in contrast-based graph self-supervised learning?

CoRR, 2024

HaVTR: Improving Video-Text Retrieval Through Augmentation Using Large Foundation Models.

CoRR, 2024

2023

Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

InvGC: Robust Cross-Modal Retrieval by Inverse Graph Convolution.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Communication-Efficient Decentralized Online Continuous DR-Submodular Maximization.

Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

Roughness Index for Loss Landscapes of Neural Network Models of Partial Differential Equations.

Proceedings of the IEEE International Conference on Big Data, 2023
