Xu Tang

Orcid: 0009-0003-5220-2026

Affiliations:
  • Xiaohongshu Inc.


According to our database, Xu Tang authored at least 34 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
InstanceAssemble: Layout-Aware Image Generation via Instance Assembling Attention.
CoRR, September, 2025

FireRedChat: A Pluggable, Full-Duplex Voice Interaction System with Cascaded and Semi-Cascaded Implementations.
CoRR, September, 2025

FireRedTTS-2: Towards Long Conversational Speech Generation for Podcast and Chatbot.
CoRR, September, 2025

Flux-Sculptor: Text-Driven Rich-Attribute Portrait Editing through Decomposed Spatial Flow Control.
CoRR, July, 2025

FireRedTTS-1S: An Upgraded Streamable Foundation Text-to-Speech System.
CoRR, March, 2025

CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection.
CoRR, March, 2025

FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration.
CoRR, January, 2025

DynamicFace: High-Quality and Consistent Video Face Swapping using Composable 3D Facial Priors.
CoRR, January, 2025

2024
OV-VIS: Open-Vocabulary Video Instance Segmentation.
Int. J. Comput. Vis., November, 2024

OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition.
Int. J. Comput. Vis., November, 2024

PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation.
Neurocomputing, 2024

Single Trajectory Distillation for Accelerating Image and Video Style Transfer.
CoRR, 2024

Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance.
CoRR, 2024

Mining Open Semantics from CLIP: A Relation Transition Perspective for Few-Shot Learning.
CoRR, 2024

Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model.
CoRR, 2024

GeoFormer: Learning Point Cloud Completion with Tri-Plane Integrated Transformer.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ZONE: Zero-Shot Instruction-Guided Local Editing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Controllable Mind Visual Diffusion Model.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
ZONE: Zero-Shot Instruction-Guided Local Editing.
CoRR, 2023

IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts.
CoRR, 2023

PiClick: Picking the desired mask in click-based interactive segmentation.
CoRR, 2023

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation.
CoRR, 2023

Towards Open-Vocabulary Video Instance Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OvarNet: Towards Open-Vocabulary Object Attribute Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
End-to-End Temporal Action Detection With Transformer.
IEEE Trans. Image Process., 2022

SVIP: Sequence VerIfication for Procedures in Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
End-to-end Temporal Action Detection with Transformer.
CoRR, 2021

Pyramid Self-attention for Semantic Segmentation.
Proceedings of the Pattern Recognition and Computer Vision - 4th Chinese Conference, 2021

Decoupled IoU Regression for Object Detection.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
Learning Global Structure Consistency for Robust Object Tracking.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019
HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces.
CoRR, 2019

2018
Face Aging With Identity-Preserved Conditional Generative Adversarial Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

