Shancheng Fang

Orcid: 0000-0002-3100-3664

Affiliations:
  • Shenzhen University, Research Institute for Future Media Computing, Shenzhen, China
  • YuanShi Technology, China
  • ByteDance Ltd, Beijing, China
  • University of Science and Technology of China, Hefei, China
  • Chinese Academy of Sciences, Institute of Information Engineering, Beijing, China (PhD 2020)


According to our database, Shancheng Fang authored at least 38 papers between 2017 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement.
CoRR, March, 2026

FineRef: Fine-Grained Error Reflection and Correction for Long-Form Generation with Citations.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
IGD: Instructional Graphic Design with Multimodal Layer Generation.
CoRR, July, 2025

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation.
CoRR, March, 2025

IterMeme: Expert-Guided Multimodal LLM for Interactive Meme Creation with Layout-Aware Generation.
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, 2025

IGD: Instructional Graphic Design with Multimodal Layer Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Sentiment-Oriented Transformer-Based Variational Autoencoder Network for Live Video Commenting.
ACM Trans. Multim. Comput. Commun. Appl., April, 2024

CDistNet: Perceiving Multi-domain Character Distance for Robust Text Recognition.
Int. J. Comput. Vis., February, 2024

A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions.
CoRR, 2024

DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DreamIdentity: Enhanced Editability for Efficient Face-Identity Preserved Image Generation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Learning Pixel Affinity Pyramid for Arbitrary-Shaped Text Detection.
ACM Trans. Multim. Comput. Commun. Appl., February, 2023

ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene Text Detection.
IEEE Trans. Multim., 2023

DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation.
CoRR, 2023

Design Booster: A Text-Guided Diffusion Model for Image Translation with Spatial Layout Preservation.
CoRR, 2023

Crossing the Gap: Domain Generalization for Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Boundary-Aware Arbitrary-Shaped Scene Text Detector With Learnable Embedding Network.
IEEE Trans. Multim., 2022

PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition.
IEEE Trans. Image Process., 2022

Semantically Similarity-Wise Dual-Branch Network for Scene Graph Generation.
IEEE Trans. Circuits Syst. Video Technol., 2022

Semi-Supervised Text Detection With Accurate Pseudo-Labels.
IEEE Signal Process. Lett., 2022

Fine-tuning with Multi-modal Entity Prompts for News Image Captioning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Background Layout Generation and Object Knowledge Transfer for Text-to-Image Generation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

ER-SAN: Enhanced-Adaptive Relation Self-Attention Network for Image Captioning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

2021
A Simple and Strong Baseline: Progressively Region-based Scene Text Removal Networks.
CoRR, 2021

TDI TextSpotter: Taking Data Imbalance into Account in Scene Text Spotting.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

End-to-end Boundary Exploration for Weakly-supervised Semantic Segmentation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
CRNet: A Center-aware Representation for Detecting Text of Arbitrary Shapes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019
Convolutional Attention Networks for Scene Text Recognition.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Learning to Draw Text in Natural Images with Conditional Adversarial Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

MLTS: A Multi-Language Scene Text Spotter.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

2018
Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.
Neuroinformatics, 2018

Attention and Language Ensemble for Scene Text Recognition with Convolutional Sequence Modeling.
Proceedings of the 2018 ACM Multimedia Conference, 2018

Deep Convolutional Nets for Pulmonary Nodule Detection and Classification.
Proceedings of the Knowledge Science, Engineering and Management, 2018

2017
Detecting Uyghur text in complex background images with convolutional neural network.
Multim. Tools Appl., 2017

