Zhibin Wang

Orcid: 0000-0001-7618-7973

Affiliations:
  • DAMO Academy, Alibaba Group, Hangzhou, China


According to our database, Zhibin Wang authored at least 55 papers between 2017 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Few-Shot Semantic Segmentation on Remote Sensing Images With Learnable Prototype.
IEEE Trans. Geosci. Remote. Sens., 2025

CR2PQ: Continuous Relative Rotary Positional Query for Dense Visual Representation Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Inter-Class and Inter-Domain Semantic Augmentation for Domain Generalization.
IEEE Trans. Image Process., 2024

PolyRoad: Polyline Transformer for Topological Road-Boundary Detection.
IEEE Trans. Geosci. Remote. Sens., 2024

MeshXL: Neural Coordinate Field for Generative 3D Foundation Models.
CoRR, 2024

Dynamic Token-Pass Transformers for Semantic Segmentation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Paint3D: Paint Anything 3D With Lighting-Less Texture Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Land Use and Land Cover Mapping in China Using Multimodal Fine-Grained Dual Network.
IEEE Trans. Geosci. Remote. Sens., 2023

An M-Nary SAR Image Change Detection Based on GAN Architecture Search.
IEEE Trans. Geosci. Remote. Sens., 2023

ChartLlama: A Multimodal LLM for Chart Understanding and Generation.
CoRR, 2023

InfMLLM: A Unified Framework for Visual-Language Tasks.
CoRR, 2023

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering.
CoRR, 2023

StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data.
CoRR, 2023

ICPC: Instance-Conditioned Prompting with Contrastive Learning for Semantic Segmentation.
CoRR, 2023

Improved Neural Radiance Fields Using Pseudo-depth and Fusion.
CoRR, 2023

ES-MVSNet: Efficient Framework for End-to-end Self-supervised Multi-View Stereo.
CoRR, 2023

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers.
CoRR, 2023

Data Pruning via Moving-one-Sample-out.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Semi-supervised Semantic Segmentation with Mutual Knowledge Distillation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Mixture-of-Experts Learner for Single Long-Tailed Domain Generalization.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

UniNeXt: Exploring A Unified Architecture for Vision Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Patch-level Contrastive Learning via Positional Query for Visual Pre-training.
Proceedings of the International Conference on Machine Learning, 2023

LMSeg: Language-guided Multi-dataset Segmentation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

D²Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Efficient Mask Correction for Click-Based Interactive Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Frequency Domain Disentanglement for Arbitrary Neural Style Transfer.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Point-Teaching: Weakly Semi-supervised Object Detection with Point Annotations.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

SwinRDM: Integrate SwinRNN with Diffusion Model towards High-Resolution and High-Quality Weather Forecasting.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Contrastive Haze-Aware Learning for Dynamic Remote Sensing Image Dehazing.
IEEE Trans. Geosci. Remote. Sens., 2022

Beyond Classifiers: Remote Sensing Change Detection with Metric Learning.
Remote. Sens., 2022

Towards a More Realistic and Detailed Deep-Learning-Based Radar Echo Extrapolation Method.
Remote. Sens., 2022

PolyBuilding: Polygon Transformer for End-to-End Building Extraction.
CoRR, 2022

Hierarchical Normalization for Robust Monocular Depth Estimation.
CoRR, 2022

FAKD: Feature Augmented Knowledge Distillation for Semantic Segmentation.
CoRR, 2022

Implicit Semantic Augmentation for Distance Metric Learning in Domain Generalization.
CoRR, 2022

Point RCNN: An Angle-Free Framework for Rotated Object Detection.
CoRR, 2022

SwinVRNN: A Data-Driven Ensemble Forecasting Model via Learned Distribution Perturbation.
CoRR, 2022

MimCo: Masked Image Modeling Pre-training with Contrastive Teacher.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Poseur: Direct Human Pose Regression with Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation.
CoRR, 2021

TFPose: Direct Human Pose Estimation with Transformers.
CoRR, 2021

Object Detection Made Simpler by Eliminating Heuristic NMS.
CoRR, 2021

Get better 1 pixel PCK: ladder scales correspondence flow networks for remote sensing image matching in higher resolution.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Unsupervised Style Transfer via Dualgan for Cross-Domain Aerial Image Classification.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2020

2017
The Opensesame NIST 2016 Speaker Recognition Evaluation System.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

