Bin Wang

Orcid: 0000-0002-5625-2966

Affiliations:
  • Shanghai Artificial Intelligence Laboratory, Shanghai, China


According to our database, Bin Wang authored at least 29 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DropQueries: A Simple Way to Discover Comprehensive Segment Representations.
IEEE Trans. Multim., 2024

CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation.
CoRR, 2024

OpenDataLab: Empowering General Artificial Intelligence with Open Datasets.
CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.
CoRR, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.
CoRR, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.
CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
CoRR, 2024

DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data.
CoRR, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites.
CoRR, 2024

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition.
CoRR, 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.
CoRR, 2024

InternLM2 Technical Report.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

Parrot Captions Teach CLIP to Spot Text.
Proceedings of the Computer Vision - ECCV 2024, 2024

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VIGC: Visual Instruction Generation and Correction.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization.
CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

MLLM-DataEngine: An Iterative Refinement Approach for MLLM.
CoRR, 2023

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.
CoRR, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Cycle-Consistent Learning for Weakly Supervised Semantic Segmentation.
HCMA@MM 2022: Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis, 2022

2019
Detection and tracking based tubelet generation for video object detection.
J. Vis. Commun. Image Represent., 2019

Spatiotemporal Breast Mass Detection Network (MD-Net) in 4D DCE-MRI Images.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Boundary Perception Guidance: A Scribble-Supervised Semantic Segmentation Approach.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018
Automated Pulmonary Nodule Detection: High Sensitivity with Few Candidates.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

2012
Staying-alive path planning with energy optimization for mobile robots.
Expert Syst. Appl., 2012

2008
Staying-alive and energy-efficient path planning for mobile robots.
Proceedings of the American Control Conference, 2008

A new feedrate adaptation control NURBS interpolation based on de Boor algorithm in CNC systems.
Proceedings of the American Control Conference, 2008

