Rong Xiao

ORCID: 0000-0003-2207-5698

Affiliations:

Intellifusion Inc., Shenzhen, Guangdong, China
Ping An Property & Casualty Insurance Company
Microsoft Research Asia, Beijing, China (former)

According to our database, Rong Xiao authored at least 49 papers between 2006 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

HCAttention: Extreme KV Cache Compression via Heterogeneous Attention Computing for LLMs.

CoRR, July, 2025

UVLM: Benchmarking Video Language Model for Underwater World Understanding.

CoRR, July, 2025

Not All Attention Heads Are What You Need: Refining CLIP's Image Representation with Attention Ablation.

CoRR, July, 2025

Language Embedding Meets Dynamic Graph: A New Exploration for Neural Architecture Representation Learning.

CoRR, June, 2025

MiniMax-Remover: Taming Bad Noise Helps Video Object Removal.

CoRR, May, 2025

From Word to Sentence: A Large-Scale Multi-Instance Dataset for Open-Set Aerial Detection.

CoRR, May, 2025

Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists.

CoRR, February, 2025

Neural Normalized Cut: A differential and generalizable approach for spectral clustering.

Pattern Recognit., 2025

BiTA: Bi-directional tuning for lossless acceleration in large language models.

Expert Syst. Appl., 2025

Elucidating the design space of language models for image generation.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Taming Transformer Without Using Learning Rate Warmup.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Exploring a Principled Framework for Deep Subspace Clustering.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Graph Cut-guided Maximal Coding Rate Reduction for Learning Image Embedding and Clustering.

CoRR, 2024

OVA-DETR: Open Vocabulary Aerial Object Detection Using Image-Text Alignment and Fusion.

CoRR, 2024

Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding.

Findings of the Association for Computational Linguistics, 2024

Graph Cut-Guided Maximal Coding Rate Reduction for Learning Image Embedding and Clustering.

Proceedings of the Computer Vision - ACCV 2024, 2024

2023

NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022

EMU: Effective Multi-Hot Encoding Net for Lightweight Scene Text Recognition With a Large Character Set.

IEEE Trans. Circuits Syst. Video Technol., 2022

Learning graph normalization for graph neural networks.

Neurocomputing, 2022

2021

MASTER: Multi-aspect non-local network for scene text recognition.

Pattern Recognit., 2021

1st Place Solution for ICDAR 2021 Competition on Mathematical Formula Detection.

CoRR, 2021

PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Literature Parsing Task B: Table Recognition to HTML.

CoRR, 2021

PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Table Image Recognition to Latex.

CoRR, 2021

2020

Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition.

CoRR, 2020

Neural Mesh Refiner for 6-DoF Pose Estimation.

CoRR, 2020

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019

MASTER: Multi-Aspect Non-local Network for Scene Text Recognition.

CoRR, 2019

Learning to Count Objects with Few Exemplar Annotations.

CoRR, 2019

A Novel Joint Character Categorization and Localization Approach for Character-Level Scene Text Recognition.

Proceedings of the Second International Workshop on Machine Learning, 2019

High Frequency Residual Learning for Multi-Scale Image Classification.

Proceedings of the 30th British Machine Vision Conference 2019, 2019

2018

Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition.

CoRR, 2018

2014

Pairwise Rotation Invariant Co-Occurrence Local Binary Pattern.

IEEE Trans. Pattern Anal. Mach. Intell., 2014

2012

A rapid flower/leaf recognition system.

Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Pairwise Rotation Invariant Co-occurrence Local Binary Pattern.

Proceedings of the Computer Vision - ECCV 2012, 2012

2011

Location relevance classification for travelogue digests.

Proceedings of the 20th International Conference on World Wide Web, 2011

On theme location discovery for travelogue services.

Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Rank-SIFT: Learning to rank repeatable local interest points.

Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

2010

Equip tourists with knowledge mined from travelogues.

Proceedings of the 19th International Conference on World Wide Web, 2010

An efficient location extraction algorithm by leveraging web contextual information.

Proceedings of the 18th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, 2010

2009

TravelScope: standing on the shoulders of dedicated travelers.

Proceedings of the 17th International Conference on Multimedia 2009, 2009

2008

3D Face Recognition by Local Shape Difference Boosting.

Proceedings of the Computer Vision, 2008

Face Alignment Via Component-Based Discriminative Search.

Proceedings of the Computer Vision, 2008

2007

Dynamic Cascades for Face Detection.

Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Linear Laplacian Discrimination for Feature Extraction.

Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

A Face Annotation Framework with Partial Clustering and Interactive Labeling.

Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

EasyAlbum: an interactive photo annotation system based on face clustering and re-ranking.

Proceedings of the 2007 Conference on Human Factors in Computing Systems, 2007

2006

Joint Boosting Feature Selection for Robust Face Recognition.

Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006
