Hanrong Ye

Orcid: 0000-0002-7986-6143

According to our database¹, Hanrong Ye authored at least 33 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search.

CoRR, May, 2026

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing.

CoRR, March, 2026

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception.

CoRR, January, 2026

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025

GSPN-2: Efficient Parallel Sequence Modeling.

CoRR, December, 2025

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration.

CoRR, November, 2025

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models.

Yonggan Fu

Xin Dong

Shizhe Diao

Matthijs Van Keirsbilck

CoRR, November, 2025

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM.

CoRR, October, 2025

UALM: Unified Audio Language Model for Understanding, Generation and Reasoning.

CoRR, October, 2025

QeRL: Beyond Efficiency - Quantization-enhanced Reinforcement Learning for LLMs.

CoRR, October, 2025

Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations.

CoRR, August, 2025

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models.

Yonggan Fu

Xin Dong

Shizhe Diao

Matthijs Van Keirsbilck

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Scaling RL to Long Videos.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Multi-Task Label Discovery via Hierarchical Task Tokens for Partially Annotated Dense Predictions.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs.

Yusu Qian

Hanrong Ye

Jean-Philippe Fauconnier

Peter Grasch

Yinfei Yang

Zhe Gan

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding.

Hanrong Ye

Dan Xu

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

MM-Ego: Towards Building Egocentric Multimodal LLMs.

CoRR, 2024

X-VILA: Cross-Modality Alignment for Large Language Model.

CoRR, 2024

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis.

Proceedings of the Computer Vision - ECCV 2024, 2024

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data.

Hanrong Ye

Dan Xu

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation.

Hanrong Ye

Dan Xu

CoRR, 2023

TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding.

Hanrong Ye

Dan Xu

Proceedings of the Eleventh International Conference on Learning Representations, 2023

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts.

Hanrong Ye

Dan Xu

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Contrastive Multi-Task Dense Prediction.

Siwei Yang

Hanrong Ye

Dan Xu

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Improving Model Training with Multi-fidelity Hyperparameter Evaluation.

Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding.

Hanrong Ye

Dan Xu

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Bi-Directional Exponential Angular Triplet Loss for RGB-Infrared Person Re-Identification.

IEEE Trans. Image Process., 2021

Modality-aware Style Adaptation for RGB-Infrared Person Re-Identification.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

2020

Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person Re-Identification.

CoRR, 2020

Video Logo Retrieval Based on Local Features.

Proceedings of the IEEE International Conference on Image Processing, 2020

2019

Self-Refining Deep Symmetry Enhanced Network for Rain Removal.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019
