Hongwei Xue

According to our database, Hongwei Xue authored at least 21 papers between 2020 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models.
CoRR, April, 2026

MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning.
CoRR, March, 2026

Beyond Scattered Acceptance: Fast and Coherent Inference for DLMs via Longest Stable Prefixes.
CoRR, March, 2026

SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs.
CoRR, February, 2026

2025
You Only Forward Once: An Efficient Compositional Judging Paradigm.
CoRR, November, 2025

CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms.
CoRR, May, 2025

2024
Multi-Modal Generative Embedding Model.
CoRR, 2024

Visual Perception by Large Language Model's Weights.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Stare at What You See: Masked Image Modeling without Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
A coarse-to-fine and automatic algorithm for breast diagnosis on multi-series MRI images.
Frontiers Comput. Sci., 2022

CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment.
CoRR, 2022

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Tri-axial Motion Sensing with Mechanomagnetic Effect for Human-Machine Interface.
Proceedings of the Intelligent Robotics and Applications - 15th International Conference, 2022

Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training.
CoRR, 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Fine-Grained Motion Embedding for Landscape Animation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Semantic Tag Augmented XlanV Model for Video Captioning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Unifying Multimodal Transformer for Bi-directional Image and Text Generation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

2020
Sed-Net: Detecting Multi-Type Edits Of Images.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

