Bo Zhang
ORCID: 0000-0001-8052-782X
Affiliations:
- Shanghai AI Laboratory, China
- Fudan University, MoE Key Laboratory for Information Science of Electromagnetic Waves, Shanghai, China (PhD 2022)
According to our database, Bo Zhang authored at least 83 papers between 2016 and 2025.
Bibliography
2025
CoRR, August, 2025
Open-Source LLMs Collaboration Beats Closed-Source LLMs: A Scalable Multi-Agent System.
CoRR, July, 2025
BridgeNet: Comprehensive and Effective Feature Interactions via Bridge Feature for Multi-Task Dense Predictions.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2025
CoRR, May, 2025
Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning.
CoRR, May, 2025
NovelSeek: When Agent Becomes the Scientist - Building Closed-Loop System from Hypothesis to Verification.
CoRR, May, 2025
CoRR, May, 2025
Breaking the Compression Ceiling: Data-Free Pipeline for Ultra-Efficient Delta Compression.
CoRR, May, 2025
GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling.
CoRR, May, 2025
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving.
CoRR, April, 2025
CoRR, March, 2025
CoRR, March, 2025
Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation.
CoRR, March, 2025
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency.
CoRR, February, 2025
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback.
CoRR, January, 2025
IEEE Trans. Image Process., 2025
DF²RQ: Dynamic Feature Fusion via Region-Wise Queries for Semantic Segmentation of Multimodal Remote Sensing Data.
IEEE Trans. Geosci. Remote. Sens., 2025
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2025
DSF2-NAS: Dual-Stage Feature Fusion via Network Architecture Search for Classification of Multimodal Remote Sensing Images.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2025
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
SURVEYFORGE: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
2024
Few-Shot Cross-Domain Object Detection With Instance-Level Prototype-Based Meta-Learning.
IEEE Trans. Circuits Syst. Video Technol., October, 2024
Push-and-Pull: A General Training Framework With Differential Augmentor for Domain Generalized Point Cloud Classification.
IEEE Trans. Circuits Syst. Video Technol., August, 2024
IEEE Trans. Geosci. Remote. Sens., 2024
Multi-View Vision Fusion Network: Can 2D Pre-Trained Model Boost 3D Point Cloud Data-Scarce Learning?
IEEE Trans. Circuits Syst. Video Technol., 2024
TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception.
CoRR, 2024
HyperDet: Generalizable Detection of Synthesized Images by Generating and Merging A Mixture of Hyper LoRAs.
CoRR, 2024
CoRR, 2024
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.
CoRR, 2024
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
CoRR, 2024
CoRR, 2024
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.
CoRR, 2024
OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving.
CoRR, 2024
How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites.
Sci. China Inf. Sci., 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
On Reducing the Execution Latency of Superconducting Quantum Processors via Quantum Job Scheduling.
Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023
Int. J. Comput. Vis., March, 2023
Rethinking Cross-Domain Pedestrian Detection: A Background-Focused Distribution Alignment Framework for Instance-Free One-Stage Detectors.
IEEE Trans. Image Process., 2023
MATNet: A Combining Multi-Attention and Transformer Network for Hyperspectral Image Classification.
IEEE Trans. Geosci. Remote. Sens., 2023
IEEE Geosci. Remote. Sens. Lett., 2023
CoRR, 2023
CoRR, 2023
CoRR, 2023
CoRR, 2023
Multi-view Vision-Prompt Fusion Network: Can 2D Pre-trained Model Boost 3D Point Cloud Data-scarce Learning?
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Joint Distribution Alignment via Adversarial Learning for Domain Adaptive Object Detection.
IEEE Trans. Multim., 2022
IEEE Trans. Image Process., 2022
Curriculum-Style Local-to-Global Adaptation for Cross-Domain Remote Sensing Image Segmentation.
IEEE Trans. Geosci. Remote. Sens., 2022
IEEE Trans. Circuits Syst. Video Technol., 2022
Few-Shot Object Detection With Self-Adaptive Global Similarity and Two-Way Foreground Stimulator in Remote Sensing Images.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2022
ADAS: A Simple Active-and-Adaptive Baseline for Cross-Domain 3D Semantic Segmentation.
CoRR, 2022
CoRR, 2022
Learning Cross-Image Object Semantic Relation in Transformer for Few-Shot Fine-Grained Image Classification.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
2021
Coarse-to-Fine Joint Distribution Alignment for Cross-Domain Hyperspectral Image Classification.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2021
Scale-Aware Anchor-Free Object Detection via Curriculum Learning for Remote Sensing Images.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2021
Neural Comput. Appl., 2021
Object-aware Long-short-range Spatial Alignment for Few-Shot Fine-Grained Image Classification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
2017
Proceedings of the 2017 ACM on Multimedia Conference, 2017
2016
Proceedings of the 11th International Conference on Computer Science & Education, 2016