Jing Bi
ORCID: 0009-0006-8235-2158
Affiliations:
- University of Rochester, Rochester, NY, USA
According to our database, Jing Bi authored at least 32 papers between 2018 and 2026.
Online presence:
- on linkedin.com
- on orcid.org
Bibliography
2026
CoRR, March, 2026
IEEE Trans. Circuits Syst. Video Technol., February, 2026
Omni-Judge: Can Omni-LLMs Serve as Human-Aligned Judges for Text-Conditioned Audio-Video Generation?
CoRR, February, 2026
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CoRR, November, 2025
CoRR, October, 2025
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models.
CoRR, October, 2025
CoRR, July, 2025
I²G: Generating Instructional Illustrations via Text-Conditioned Diffusion.
CoRR, May, 2025
CoRR, April, 2025
VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity.
CoRR, March, 2025
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025
2024
CoRR, 2024
AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue.
CoRR, 2024
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
2023
2021
Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2018