Xiangtai Li

Orcid: 0000-0002-0550-8247

According to our database, Xiangtai Li authored at least 97 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm.

Int. J. Comput. Vis., September, 2024

Change Detection Methods for Remote Sensing in the Last Decade: A Comprehensive Review.

Remote. Sens., July, 2024

Multi-Task Learning With Multi-Query Transformer for Dense Prediction.

IEEE Trans. Circuits Syst. Video Technol., February, 2024

SFNet: Faster and Accurate Semantic Segmentation via Semantic Flow.

Int. J. Comput. Vis., February, 2024

Toward Robust Referring Image Segmentation.

IEEE Trans. Image Process., 2024

Towards Open Vocabulary Learning: A Survey.

IEEE Trans. Pattern Anal. Mach. Intell., 2024

OV-VG: A benchmark for open-vocabulary visual grounding.

Neurocomputing, 2024

ModelNet-O: A large-scale synthetic dataset for occlusion-aware point cloud classification.

Comput. Vis. Image Underst., 2024

You Can't Ignore Either: Unifying Structure and Feature Denoising for Robust Graph Learning.

CoRR, 2024

LLAVADI: What Matters For Multimodal Large Language Models Distillation.

CoRR, 2024

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language.

CoRR, 2024

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding.

CoRR, 2024

Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model.

CoRR, 2024

MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning.

CoRR, 2024

MotionBooth: Motion-Aware Customized Text-to-Video Generation.

CoRR, 2024

Towards Semantic Equivalence of Tokenization in Multimodal LLM.

CoRR, 2024

BACON: Bayesian Optimal Condensation Framework for Dataset Distillation.

CoRR, 2024

SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow.

CoRR, 2024

Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model.

CoRR, 2024

Adversarial Attacks on Both Face Recognition and Face Anti-spoofing Models.

CoRR, 2024

CPT-Interp: Continuous sPatial and Temporal Motion Modeling for 4D Medical Image Interpolation.

CoRR, 2024

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control.

CoRR, 2024

Point-In-Context: Understanding Point Cloud via In-Context Learning.

CoRR, 2024

Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark.

CoRR, 2024

DGMamba: Domain Generalization via Generalized State Space Model.

CoRR, 2024

MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection.

CoRR, 2024

DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries.

CoRR, 2024

GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning.

CoRR, 2024

Explore In-Context Segmentation via Latent Diffusion Models.

CoRR, 2024

Point Cloud Mamba: Point Cloud Learning via State Space Model.

CoRR, 2024

Generalizable Entity Grounding via Assistance of Large Language Model.

CoRR, 2024

OMG-Seg: Is One Model Good Enough For All Segmentation?

CoRR, 2024

RAP-SAM: Towards Real-Time All-Purpose Segment Anything.

CoRR, 2024

Towards Language-Driven Video Inpainting via Multimodal Large Language Models.

CoRR, 2024

Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively.

CoRR, 2024

An Open and Comprehensive Pipeline for Unified Object Grounding and Detection.

CoRR, 2024

BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model.

CoRR, 2024

A Generalist FaceX via Learning Unified Facial Representation.

CoRR, 2024

VG4D: Vision-Language Model Goes 4D Video Recognition.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Exploring Self-Supervised Learning for Multi-Modal Remote Sensing Pre-Training via Asymmetric Attention Fusion.

Remote. Sens., December, 2023

Convolution-Enhanced Evolving Attention Networks.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

TransVOD: End-to-End Video Object Detection With Spatial-Temporal Transformers.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Improving Video Instance Segmentation via Temporal Pyramid Routing.

IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation.

CoRR, 2023

Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection.

CoRR, 2023

EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM.

CoRR, 2023

Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning.

CoRR, 2023

Effective Adapter for Face Recognition in the Wild.

CoRR, 2023

Rethinking Evaluation Metrics of Open-Vocabulary Segmentaion.

CoRR, 2023

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection.

CoRR, 2023

MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation.

CoRR, 2023

Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants.

CoRR, 2023

Pair then Relation: Pair-Net for Panoptic Scene Graph Generation.

CoRR, 2023

Change Detection Methods for Remote Sensing in the Last Decade: A Comprehensive Review.

CoRR, 2023

Transformer-Based Visual Segmentation: A Survey.

CoRR, 2023

Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation.

CoRR, 2023

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation.

CoRR, 2023

Rethinking Mobile Block for Efficient Neural Models.

CoRR, 2023

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation.

CoRR, 2023

4D Panoptic Scene Graph Generation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Explore In-Context Learning for 3D Point Cloud Understanding.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Rethinking Mobile Block for Efficient Attention-based Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Panoptic Video Scene Graph Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Towards Robust Referring Image Segmentation.

CoRR, 2022

SFNet: Faster, Accurate, and Domain Agnostic Semantic Segmentation via Semantic Flow.

CoRR, 2022

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm.

CoRR, 2022

Multi-Task Learning with Multi-query Transformer for Dense Prediction.

CoRR, 2022

Do We Really Need a Learnable Classifier at the End of Deep Neural Network?

CoRR, 2022

Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network?

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Query Learning of Both Thing and Stuff for Panoptic Segmentation.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

PolyphonicFormer: Unified Query Learning for Depth-Aware Video Panoptic Segmentation.

Proceedings of the Computer Vision - ECCV 2022, 2022

Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition.

Proceedings of the Computer Vision - ECCV 2022, 2022

Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation.

Proceedings of the Computer Vision - ECCV 2022, 2022

Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Global Aggregation Then Local Distribution for Scene Parsing.

IEEE Trans. Image Process., 2021

Towards Efficient Scene Understanding via Squeeze Reasoning.

IEEE Trans. Image Process., 2021

Improving Video Instance Segmentation via Temporal Pyramid Routing.

CoRR, 2021

BoundarySqueeze: Image Segmentation as Boundary Squeezing.

CoRR, 2021

End-to-End Video Object Detection with Spatial-Temporal Transformers.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Fast and Accurate Scene Parsing via Bi-Direction Alignment Networks.

Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Dynamic Dual Sampling Module For Fine-Grained Semantic Segmentation.

Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Enhanced Boundary Learning for Glass-like Object Segmentation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Involution: Inverting the Inherence of Convolution for Visual Recognition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Semantic Flow for Fast and Accurate Scene Parsing.

CoRR, 2020

Semantic Flow for Fast and Accurate Scene Parsing.

Proceedings of the Computer Vision - ECCV 2020, 2020

Improving Semantic Segmentation via Decoupled Body and Edge Supervision.

Proceedings of the Computer Vision - ECCV 2020, 2020

Gated Fully Fusion for Semantic Segmentation.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

GFF: Gated Fully Fusion for Semantic Segmentation.

CoRR, 2019

Flow2Seg: Motion-Aided Semantic Segmentation.

Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2019: Image Processing, 2019

Dual Graph Convolutional Network for Semantic Segmentation.

Proceedings of the 30th British Machine Vision Conference 2019, 2019

Global Aggregation then Local Distribution in Fully Convolutional Networks.

Proceedings of the 30th British Machine Vision Conference 2019, 2019
