Zhen Li
Orcid: 0000-0002-7669-2686
Affiliations:
- Chinese University of Hong Kong, Shenzhen Research Institute of Big Data, Shenzhen, China
According to our database, Zhen Li authored at least 173 papers between 2016 and 2026.
Bibliography
2026
Medical Image Anal., 2026
2025
Int. J. Comput. Vis., August, 2025
MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams.
CoRR, August, 2025
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems.
CoRR, August, 2025
CoRR, July, 2025
BleedOrigin: Dynamic Bleeding Source Localization in Endoscopic Submucosal Dissection via Dual-Stage Detection and Tracking.
CoRR, July, 2025
CoRR, July, 2025
CoRR, July, 2025
Scene-R1: Video-Grounded Large Language Models for 3D Scene Reasoning without 3D Annotations.
CoRR, June, 2025
CoRR, June, 2025
IEEE Trans. Intell. Transp. Syst., May, 2025
CoRR, April, 2025
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models.
CoRR, April, 2025
CoRR, March, 2025
CoRR, March, 2025
PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models.
CoRR, March, 2025
CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments.
CoRR, March, 2025
IEEE J. Biomed. Health Informatics, February, 2025
OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation.
CoRR, February, 2025
Advancing Dense Endoscopic Reconstruction with Gaussian Splatting-driven Surface Normal-aware Tracking and Mapping.
CoRR, January, 2025
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models.
CoRR, January, 2025
V²-SfMLearner: Learning Monocular Depth and Ego-Motion for Multimodal Wireless Capsule Endoscopy.
IEEE Trans Autom. Sci. Eng., 2025
Pattern Recognit., 2025
Boost Protein Language Model with Injected Structure Information Through Parameter Efficient Fine-tuning.
Comput. Biol. Medicine, 2025
Position: Prospective of Autonomous Driving - Multimodal LLMs, World Models, Embodied Intelligence, AI Alignment, and Mamba.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025
Proceedings of the 22nd IEEE International Symposium on Biomedical Imaging, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation.
IEEE Trans. Vis. Comput. Graph., December, 2024
ACM Trans. Multim. Comput. Commun. Appl., December, 2024
A Wearable, Reconfigurable, and Modular Magnetic Tracking System for Wireless Capsule Robots.
IEEE Trans. Ind. Informatics, December, 2024
IEEE Trans. Neural Networks Learn. Syst., September, 2024
ECC-PolypDet: Enhanced CenterNet With Contrastive Learning for Automatic Polyp Detection.
IEEE J. Biomed. Health Informatics, August, 2024
Int. J. Comput. Vis., July, 2024
IEEE Trans. Pattern Anal. Mach. Intell., January, 2024
Pattern Recognit., 2024
V²-SfMLearner: Learning Monocular Depth and Ego-motion for Multimodal Wireless Capsule Endoscopy.
CoRR, 2024
CoRR, 2024
An Efficient Occupancy World Model via Decoupled Dynamic Flow and Image-assisted Training.
CoRR, 2024
ETSM: Automating Dissection Trajectory Suggestion and Confidence Map-Based Safety Margin Prediction for Robot-assisted Endoscopic Submucosal Dissection.
CoRR, 2024
PDZSeg: Adapting the Foundation Model for Dissection Zone Segmentation with Visual Prompts in Robot-assisted Endoscopic Submucosal Dissection.
CoRR, 2024
CoRR, 2024
Privacy-Preserving Federated Foundation Model for Generalist Ultrasound Artificial Intelligence.
CoRR, 2024
CoPESD: A Multi-Level Surgical Motion Dataset for Training Large Vision-Language Models to Co-Pilot Endoscopic Submucosal Dissection.
CoRR, 2024
SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain.
CoRR, 2024
CoRR, 2024
CoRR, 2024
GauU-Scene V2: Assessing the Reliability of Image-Based Metrics with Expansive Lidar Image Dataset Using 3DGS and NeRF.
CoRR, 2024
Generating, Reconstructing, and Representing Discrete and Continuous Data: Generalized Diffusion with Learnable Encoding-Decoding.
CoRR, 2024
GauU-Scene: A Scene Reconstruction Benchmark on Large Scale 3D Reconstruction Dataset Using Gaussian Splatting.
CoRR, 2024
Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities.
CoRR, 2024
Atom-ProteinQA: Atom-level protein model quality assessment through fine-grained joint learning.
Comput. Methods Programs Biomed., 2024
Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024
EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2024, 2024
Enrichment, Borrowing, And Mining: A Data-Driven Approach To Colonoscopic Lesion Classification.
Proceedings of the IEEE International Symposium on Biomedical Imaging, 2024
ColonCLIP: An Adaptable Prompt-Driven Multi-Modal Strategy for Colonoscopy Image Diagnosis.
Proceedings of the IEEE International Symposium on Biomedical Imaging, 2024
Generalize Polyp Segmentation Via Inpainting Across Diverse Backgrounds and Pseudo-Mask Refinement.
Proceedings of the IEEE International Symposium on Biomedical Imaging, 2024
Magnetic-Guided Flexible Origami Robot toward Long-Term Phototherapy of H. pylori in the Stomach.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
Chained Flexible Capsule Endoscope: Unraveling the Conundrum of Size Limitations and Functional Integration for Gastrointestinal Transitivity.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Let Video Teaches You More: Video-to-Image Knowledge Distillation using Detection TRansformer for Medical Video Lesion Detection.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2024
MixPolyp: Integrating Mask, Box and Scribble Supervision for Enhanced Polyp Segmentation.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2024
RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
WeakPCSOD: Overcoming the Bias of Box Annotations for Weakly Supervised Point Cloud Salient Object Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
CrossBind: Collaborative Cross-Modal Identification of Protein Nucleic-Acid-Binding Residues.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
SAVAnet: Surgical Action-Driven Visual Attention Network for Autonomous Endoscope Control.
IEEE Trans Autom. Sci. Eng., October, 2023
Comput. Medical Imaging Graph., September, 2023
Medical Image Anal., May, 2023
CoRR, 2023
GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance.
CoRR, 2023
CPU: Codebook Lookup Transformer with Knowledge Distillation for Point Cloud Upsampling.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency.
Proceedings of the Medical Imaging with Deep Learning, 2023
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023
YONA: You Only Need One Adjacent Reference-Frame for Accurate and Fast Video Polyp Detection.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023
ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023
EasyGaze3D: Towards Effective and Flexible 3D Gaze Estimation from a Single RGB Camera.
IROS, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
ScribblePolyp: Scribble-Supervised Polyp Segmentation through Dual Consistency Alignment.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2023
CowClip: Reducing CTR Prediction Model Training Time from 12 Hours to 10 Minutes on 1 GPU.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Trans. Medical Imaging, 2022
PointSite: A Point Cloud Segmentation Tool for Identification of Protein Ligand Binding Atoms.
J. Chem. Inf. Model., 2022
CoRR, 2022
Composable Text Control Operations in Latent Space with Ordinary Differential Equations.
CoRR, 2022
CoRR, 2022
Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency.
CoRR, 2022
CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU.
CoRR, 2022
Transformer based tooth classification from cone-beam computed tomography for dental charting.
Comput. Biol. Medicine, 2022
Prior knowledge facilitates low homologous protein secondary structure prediction with DSM distillation.
Bioinform., 2022
Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022
BoxPolyp: Boost Generalized Polyp Segmentation Using Extra Coarse Bounding Box Annotations.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022
Toward Clinically Assisted Colorectal Polyp Recognition via Structured Cross-Modal Representation Consistency.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
Weakly Supervised Object Localization Through Inter-class Feature Similarity and Intra-class Appearance Consistency.
Proceedings of the Computer Vision - ECCV 2022, 2022
Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Proceedings of the Computer Vision - ACCV 2022, 2022
Contact-Distil: Boosting Low Homologous Protein Contact Map Prediction by Self-Supervised Distillation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning.
IEEE Trans. Neural Networks Learn. Syst., 2021
CLEVR3D: Compositional Language and Elementary Visual Reasoning for Question Answering in 3D Real-World Scenes.
CoRR, 2021
Adaptive Residue-wise Profile Fusion for Low Homologous Protein Secondary Structure Prediction Using External Knowledge.
CoRR, 2021
CoRR, 2021
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring.
CoRR, 2021
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021
Colorectal Polyp Classification from White-Light Colonoscopy Images via Domain Alignment.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021
Geometric Morphology Based Irrelevant Vessels Removal For Accurate Coronary Artery Segmentation.
Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021
Multi-Modal Active Learning For Automatic Liver Fibrosis Diagnosis Based On Ultrasound Shear Wave Elastography.
Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Adaptive Residue-wise Profile Fusion for Low Homologous Protein Secondary Structure Prediction Using External Knowledge.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
PSSM-Distil: Protein Secondary Structure Prediction (PSSP) on Low-Quality PSSM by Knowledge Distillation with Contrastive Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training.
CoRR, 2020
JAFPro: Joint Appearance Fusion and Propagation for Human Video Motion Transfer from Multiple Reference Images.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020
Characterizing Label Errors: Confident Learning for Noisy-Labeled Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020
Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020
Progressive Abdominal Segmentation with Adaptively Hard Region Prediction and Feature Enhancement.
Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020
Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation.
Proceedings of the Computer Vision - ECCV 2020, 2020
MetaSelection: Metaheuristic Sub-Structure Selection for Neural Network Pruning Using Evolutionary Algorithm.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Ultrasound Liver Fibrosis Diagnosis Using Multi-indicator Guided Deep Neural Networks.
Proceedings of the Machine Learning in Medical Imaging - 10th International Workshop, 2019
Proceedings of the 2019 Challenge on Segmentation of THoracic Organs at Risk in CT Images, 2019
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Transmembrane Topology Identification by Fusing Evolutionary and Co-evolutionary Information with Cascaded Bidirectional Transformers.
Proceedings of the 10th ACM International Conference on Bioinformatics, 2019
2018
WaveNano: a signal-level nanopore base-caller via simultaneous prediction of nucleotide labels and move labels through bi-directional WaveNets.
Quant. Biol., 2018
2017
PLoS Comput. Biol., 2017
Predicting membrane protein contacts from non-membrane proteins by deep transfer learning.
CoRR, 2017
Proceedings of the Research in Computational Molecular Biology, 2017
High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference.
Proceedings of the IEEE International Conference on Computer Vision, 2017
2016
Protein Secondary Structure Prediction Using Cascaded Convolutional and Recurrent Neural Networks.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016
Proceedings of the Computer Vision - ECCV 2016, 2016