Dingkang Liang

ORCID: 0000-0003-3035-1373

According to our database, Dingkang Liang authored at least 54 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle.
CoRR, August, 2025

Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching.
CoRR, July, 2025

Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving.
CoRR, May, 2025

DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment.
CoRR, April, 2025

An Empirical Study of Ground Segmentation for 3-D Object Detection.
IEEE Trans. Intell. Transp. Syst., March, 2025

ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation.
CoRR, March, 2025

Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception.
CoRR, March, 2025

The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey.
CoRR, February, 2025

HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation.
CoRR, January, 2025

Layerlink: Bridging remote sensing object detection and large vision models with efficient fine-tuning.
Pattern Recognit., 2025

AVS-Net: Point sampling with adaptive voxel size for 3D scene understanding.
Neurocomputing, 2025

Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MINIMA: Modality Invariant Image Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

A Unified Image-Dense Annotation Generation Model for Underwater Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
A Discrepancy Aware Framework for Robust Anomaly Detection.
IEEE Trans. Ind. Informatics, March, 2024

LATFormer: Locality-Aware Point-View Fusion Transformer for 3D shape recognition.
Pattern Recognit., 2024

Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning.
CoRR, 2024

Mini-Monkey: Multi-Scale Adaptive Cropping for Multimodal Large Language Models.
CoRR, 2024

SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection.
CoRR, 2024

Anomaly Detection by Adapting a pre-trained Vision Language Model.
CoRR, 2024

AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Point Cloud Analysis.
CoRR, 2024

SAM3D: zero-shot 3D object detection via the segment anything model.
Sci. China Inf. Sci., 2024

Not All Texts Are the Same: Dynamically Querying Texts for Scene Text Detection.
Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

PointMamba: A Simple State Space Model for Point Cloud Analysis.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

A Unified Framework for 3D Scene Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Make Your ViT-Based Multi-view 3D Detectors Faster via Token Compression.
Proceedings of the Computer Vision - ECCV 2024, 2024

Well Begun is Half Done: The Importance of Initialization in Dataset Distillation.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
You Only Look Bottom-Up for Monocular 3D Object Detection.
IEEE Robotics Autom. Lett., November, 2023

Focal Inverse Distance Transform Maps for Crowd Localization.
IEEE Trans. Multim., 2023

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution.
CoRR, 2023

Diffusion-Based 3D Object Detection with Random Boxes.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Query-based Temporal Fusion with Explicit Motion for 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DDS3D: Dense Pseudo-Labels with Dynamic Threshold for Semi-Supervised 3D Object Detection.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Visual Information Extraction in the Wild: Practical Dataset and End-to-End Solution.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Super-Resolution Information Enhancement for Crowd Counting.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SOOD: Towards Semi-Supervised Oriented Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Cell Localization and Counting Using Direction Field Map.
IEEE J. Biomed. Health Informatics, 2022

AutoScale: Learning to Scale for Crowd Counting.
Int. J. Comput. Vis., 2022

TransCrowd: weakly-supervised crowd counting with transformers.
Sci. China Inf. Sci., 2022

Comprehensive benchmark datasets for Amharic scene text detection and recognition.
Sci. China Inf. Sci., 2022

An End-to-End Transformer Model for Crowd Localization.
Proceedings of the Computer Vision - ECCV 2022, 2022

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Fault Diagnosis of Main Pump in Converter Station Based on Deep Neural Network.
Symmetry, 2021

Dilated-Scale-Aware Category-Attention ConvNet for Multi-Class Object Counting.
IEEE Signal Process. Lett., 2021

TransCrowd: Weakly-Supervised Crowd Counting with Transformer.
CoRR, 2021

Reciprocal Distance Transform Maps for Crowd Counting and People Localization in Dense Crowd.
CoRR, 2021


2020
Dilated-Scale-Aware Attention ConvNet For Multi-Class Object Counting.
CoRR, 2020


