Jinfeng Bai

Orcid: 0000-0001-8940-480X

According to our database, Jinfeng Bai authored at least 65 papers between 2012 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2026

2025

MDIQA: Unified Image Quality Assessment for Multi-dimensional Evaluation and Restoration.

CoRR, August, 2025

SOLIDGEO: Measuring Multimodal Spatial Math Reasoning in Solid Geometry.

CoRR, May, 2025

Decoupled Visual Interpretation and Linguistic Reasoning for Math Problem Solving.

CoRR, May, 2025

Real face foundation representation learning for generalized deepfake detection.

Pattern Recognit., 2025

Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

Enhancing Multimodal Continual Instruction Tuning with BranchLoRA.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Explicit Relational Reasoning Network for Scene Text Detection.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Leveraging BERT to Improve Spoken Language Identification of Code-Switching Speech.

Int. J. Asian Lang. Process., March, 2024

CMMaTH: A Chinese Multi-modal Math Skill Evaluation Benchmark for Foundation Models.

CoRR, 2024

Two Optimizers Are Better Than One: LLM Catalyst for Enhancing Gradient-Based Optimization.

CoRR, 2024

MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation.

CoRR, 2024

MuMath: Multi-perspective Data Augmentation for Mathematical Reasoning in Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Collaborative Domain Alignment for Multi-source Domain Adaptation.

Proceedings of the Pattern Recognition - 27th International Conference, 2024

DPA-2D: Depth Propagation and Alignment with 2D Observations Guidance for Human Mesh Recovery.

Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, 2024

MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation.

Proceedings of the Computer Vision - ECCV 2024, 2024

HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CK12: A Rounded K12 Knowledge Graph Based Benchmark for Chinese Holistic Cognition Evaluation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Leveraging Local Variance for Pseudo-Label Selection in Semi-supervised Learning.

Zeping Min, Jinfeng Bai, Chengfei Li

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Decoupled Textual Embeddings for Customized Image Generation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Dual Contrastive Prediction for Incomplete Multi-View Representation Learning.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Robust Multi-View Clustering With Incomplete Information.

IEEE Trans. Pattern Anal. Mach. Intell., 2023

GPT Can Solve Mathematical Problems Without a Calculator.

CoRR, 2023

Patch Is Not All You Need.

CoRR, 2023

DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

CCLAP: Controllable Chinese Landscape Painting Generation Via Latent Diffusion Model.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Decoupling Visual-Semantic Features Learning with Dual Masked Autoencoder for Self-Supervised Scene Text Recognition.

Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ViSA: Visual and Semantic Alignment for Robust Scene Text Recognition.

Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition.

Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ICDAR 2023 Competition on Recognition of Multi-line Handwritten Mathematical Expressions.

Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

A Synthetic Corpus Generation Method for Neural Vocoder Training.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

DSPGAN: A GAN-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Unveiling the Implicit Toxicity in Large Language Models.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Texts as Images in Prompt Tuning for Multi-Label Image Recognition.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ReCoT: Regularized Co-Training for Facial Action Unit Recognition with Noisy Labels.

Proceedings of the 34th British Machine Vision Conference 2023, 2023

Hybrid Syllable and Character Representations for Mandarin ASR.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Unsupervised Neural Rendering for Image Hazing.

IEEE Trans. Image Process., 2022

1st Place Solution for YouTubeVOS Challenge 2022: Referring Video Object Segmentation.

CoRR, 2022

Position-Aware Contrastive Alignment for Referring Image Segmentation.

CoRR, 2022

1st Place Solutions for UG2+ Challenge 2022 ATMOSPHERIC TURBULENCE MITIGATION.

CoRR, 2022

1st Place Solutions for the UVO Challenge 2022.

CoRR, 2022

BERT-LID: Leveraging BERT to Improve Spoken Language Identification.

CoRR, 2022

Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Improving Speech Separation with Knowledge Distilled from Self-supervised Pre-trained Models.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

BERT-LID: Leveraging BERT to Improve Spoken Language Identification.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Summary On The ISCSLP 2022 Chinese-English Code-Switching ASR Challenge.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

TALCS: An open-source Mandarin-English code-switching corpus and a speech recognition baseline.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Vision Transformer Based Scene Text Recognizer with Multi-grained Encoding and Decoding.

Proceedings of the Frontiers in Handwriting Recognition - 18th International Conference, 2022

Time-Domain Audio-Visual Speech Separation on Low Quality Videos.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition.

Proceedings of the Computer Vision - ECCV 2022, 2022

2014

Anchor Shot Detection with Deep Neural Network.

Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

CeleLabel: an interactive system for annotating celebrities in web videos.

Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Image character recognition using deep convolutional neural network learned from different languages.

Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Chinese Image Text Recognition on grayscale pixels.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Chinese Image Character Recognition Using DNN and Machine Simulated Training Samples.

Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2014, 2014

2013

Camera based cross devices manipulating with augmented reality.

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

Binarization of natural scene text based on L1-Norm PCA.

Jinfeng Bai, Bailan Feng, Bo Xu

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

2012

Multi-modal information fusion for news story segmentation in broadcast video.

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012
