Wan-Cyuan Fan

According to our database, Wan-Cyuan Fan authored at least 16 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
In-Depth and In-Breadth: Pre-training Multimodal Language Models Customized for Comprehensive Chart Understanding.
CoRR, July, 2025

TAM-VT: Transformation-Aware Multi-Scale Video Transformer for Segmentation and Tracking.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Response Wide Shut? Surprising Observations in Basic Vision Language Model Capabilities.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks.
CoRR, 2024

Response Wide Shut: Surprising Observations in Basic Vision Language Model Capabilities.
CoRR, 2024

On Pre-training of Multimodal Language Models Customized for Chart Understanding.
CoRR, 2024

The Second Visual Object Tracking Segmentation VOTS2024 Challenge Results.
Proceedings of the Computer Vision - ECCV 2024 Workshops, 2024

2023
M3T: Multi-Scale Memory Matching for Video Object Segmentation and Tracking.
CoRR, 2023

IoU-Aware Multi-Expert Cascade Network Via Dynamic Ensemble for Long-Tailed Object Detection.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Target-Free Text-Guided Image Manipulation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Paraphrasing Is All You Need for Novel Object Captioning.
CoRR, 2022

Paraphrasing Is All You Need for Novel Object Captioning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Scene Graph Expansion for Semantics-Guided Image Outpainting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross-Modal Mutual Learning for Audio-Visual Speech Recognition and Manipulation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
