Xu Tang

Orcid: 0009-0003-5220-2026

Affiliations:

Xiaohongshu Inc.

According to our database, Xu Tang authored at least 40 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Cross-Scenario Unified Modeling of User Interests at Billion Scale.

CoRR, October, 2025

HyMiRec: A Hybrid Multi-interest Learning Framework for LLM-based Sequential Recommendation.

CoRR, October, 2025

InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention.

CoRR, September, 2025

FireRedChat: A Pluggable, Full-Duplex Voice Interaction System with Cascaded and Semi-Cascaded Implementations.

CoRR, September, 2025

FireRedTTS-2: Towards Long Conversational Speech Generation for Podcast and Chatbot.

CoRR, September, 2025

Flux-Sculptor: Text-Driven Rich-Attribute Portrait Editing through Decomposed Spatial Flow Control.

CoRR, July, 2025

FireRedTTS-1S: An Upgraded Streamable Foundation Text-to-Speech System.

CoRR, March, 2025

CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection.

CoRR, March, 2025

FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration.

CoRR, January, 2025

DynamicFace: High-Quality and Consistent Video Face Swapping using Composable 3D Facial Priors.

CoRR, January, 2025

2024

OV-VIS: Open-Vocabulary Video Instance Segmentation.

Int. J. Comput. Vis., November, 2024

OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition.

Int. J. Comput. Vis., November, 2024

PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation.

Neurocomputing, 2024

Single Trajectory Distillation for Accelerating Image and Video Style Transfer.

CoRR, 2024

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance.

CoRR, 2024

Mining Open Semantics from CLIP: A Relation Transition Perspective for Few-Shot Learning.

CoRR, 2024

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model.

CoRR, 2024

GeoFormer: Learning Point Cloud Completion with Tri-Plane Integrated Transformer.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ZONE: Zero-Shot Instruction-Guided Local Editing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Controllable Mind Visual Diffusion Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

ZONE: Zero-Shot Instruction-Guided Local Editing.

CoRR, 2023

IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts.

CoRR, 2023

PiClick: Picking the desired mask in click-based interactive segmentation.

CoRR, 2023

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation.

CoRR, 2023

Towards Open-Vocabulary Video Instance Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OvarNet: Towards Open-Vocabulary Object Attribute Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

End-to-End Temporal Action Detection With Transformer.

IEEE Trans. Image Process., 2022

SVIP: Sequence VerIfication for Procedures in Videos.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

End-to-end Temporal Action Detection with Transformer.

CoRR, 2021

Pyramid Self-attention for Semantic Segmentation.

Proceedings of the Pattern Recognition and Computer Vision - 4th Chinese Conference, 2021

Decoupled IoU Regression for Object Detection.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020

Progressively Refined Face Detection Through Semantics-Enriched Representation Learning.

IEEE Trans. Inf. Forensics Secur., 2020

Learning Global Structure Consistency for Robust Object Tracking.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

HAMBox: Delving Into Mining High-Quality Anchors on Face Detection.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

BFBox: Searching Face-Appropriate Backbone and Feature Pyramid Network for Face Detector.

Yang Liu

Xu Tang

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces.

CoRR, 2019

PyramidBox++: High Performance Detector for Finding Tiny Face.

CoRR, 2019

2018

Face Aging With Identity-Preserved Conditional Generative Adversarial Networks.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
