Jianshu Zhang

ORCID: 0000-0002-2713-2535

Affiliations:
  • iFLYTEK AI Research, China
  • University of Science and Technology of China, Department of Electronic Information Engineering, Hefei, China


According to our database, Jianshu Zhang authored at least 54 papers between 2016 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration.
CoRR, April, 2025

MMC: Iterative Refinement of VLM Reasoning via MCTS-based Multimodal Critique.
CoRR, April, 2025

PRM-BAS: Enhancing Multimodal Reasoning through PRM-guided Beam Annealing Search.
CoRR, April, 2025

Skeleton and Font Generation Network for Zero-shot Chinese Character Generation.
CoRR, January, 2025

Count, decompose and correct: A new approach to handwritten Chinese character error correction.
Pattern Recognit., 2025

DocMamba: Efficient Document Pre-training with State Space Model.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
SEMv2: Table separation line detection based on instance segmentation.
Pattern Recognit., 2024

See then Tell: Enhancing Key Information Extraction with Vision Grounding.
CoRR, 2024

SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Maths: Multimodal Transformer-Based Human-Readable Solver.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Multimodal Pre-Training Based on Graph Attention Network for Document Understanding.
IEEE Trans. Multim., 2023

A Tree-Structure Analysis Network on Handwritten Chinese Character Error Correction.
IEEE Trans. Multim., 2023

Count, Decode and Fetch: A New Approach to Handwritten Chinese Character Error Correction.
CoRR, 2023

SEMv2: Table Separation Line Detection Based on Conditional Convolution.
CoRR, 2023

Group, Contrast and Recognize: A Self-supervised Method for Chinese Character Recognition.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Enhancing Math Word Problem Solving Through Salient Clue Prioritization: A Joint Token-Phrase-Level Feature Integration Approach.
Proceedings of the International Conference on Asian Language Processing, 2023

USTC-iFLYTEK at DocILE: A Multi-modal Approach Using Domain-specific GraphDoc.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

HRDoc: Dataset and Baseline Method toward Hierarchical Reconstruction of Document Structures.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Split, Embed and Merge: An accurate table structure recognizer.
Pattern Recognit., 2022

Tree-based data augmentation and mutual learning for offline handwritten mathematical expression recognition.
Pattern Recognit., 2022

A multimodal attention fusion network with a dynamic vocabulary for TextVQA.
Pattern Recognit., 2022

Scene Text Recognition with Self-supervised Contrastive Predictive Coding.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

Multimodal Tree Decoder for Table of Contents Extraction in Document Images.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

Learning Contextually Fused Audio-Visual Representations For Audio-Visual Speech Recognition.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

Improving Isolated Glyph Classification Task for Palm Leaf Manuscripts.
Proceedings of the Frontiers in Handwriting Recognition - 18th International Conference, 2022

TDv2: A Novel Tree-Structured Decoder for Offline Mathematical Expression Recognition.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
SRD: A Tree Structure Based Decoder for Online Handwritten Mathematical Expression Recognition.
IEEE Trans. Multim., 2021

Stroke constrained attention network for online handwritten mathematical expression recognition.
Pattern Recognit., 2021

Split, embed and merge: An accurate table structure recognizer.
CoRR, 2021

Radical Composition Network for Chinese Character Generation.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

MRD: A Memory Relation Decoder for Online Handwritten Mathematical Expression Recognition.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

2020
Radical analysis network for learning hierarchies of Chinese characters.
Pattern Recognit., 2020

Stroke Constrained Attention Network for Online Handwritten Mathematical Expression Recognition.
CoRR, 2020

Semi-Supervised End-to-End ASR via Teacher-Student Learning with Conditional Posterior Distribution.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Transformer-based Radical Analysis Network for Chinese Character Recognition.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Stroke Based Posterior Attention for Online Handwritten Mathematical Expression Recognition.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Radical Counter Network for Robust Chinese Character Recognition.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

A Tree-Structured Decoder for Image-to-Markup Generation.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Track, Attend, and Parse (TAP): An End-to-End Framework for Online Handwritten Mathematical Expression Recognition.
IEEE Trans. Multim., 2019

Joint Spatial and Radical Analysis Network For Distorted Chinese Character Recognition.
Proceedings of the Second International Workshop on Machine Learning, 2019

Multi-modal Attention Network for Handwritten Mathematical Expression Recognition.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Episodic Training for Domain Generalization.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018
Attention Based Fully Convolutional Network for Speech Emotion Recognition.
CoRR, 2018

Trajectory-based Radical Analysis Network for Online Handwritten Chinese Character Recognition.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Radical Analysis Network for Zero-Shot Learning in Printed Chinese Character Recognition.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

DenseRAN for Offline Handwritten Chinese Character Recognition.
Proceedings of the 16th International Conference on Frontiers in Handwriting Recognition, 2018

Attention Based Fully Convolutional Network for Speech Emotion Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition.
Pattern Recognit., 2017

RAN: Radical analysis networks for zero-shot learning of Chinese characters.
CoRR, 2017

A GRU-Based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Dual Learning of the Generator and Recognizer for Chinese Characters.
Proceedings of the 4th IAPR Asian Conference on Pattern Recognition, 2017

2016
RNN-BLSTM Based Multi-Pitch Estimation.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

