Haozhe Zhao

Orcid: 0000-0003-0502-4426

According to our database, Haozhe Zhao authored at least 25 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2025
NEP: Autoregressive Image Editing via Next Editing Token Prediction.
CoRR, August, 2025

MENTOR: Efficient Multimodal-Conditioned Tuning for Autoregressive Vision Generation Models.
CoRR, July, 2025

Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning.
CoRR, May, 2025

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think.
CoRR, February, 2025

LongViTU: Instruction Tuning for Long-Form Video Understanding.
CoRR, January, 2025

MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey.
CoRR, 2024

Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance.
CoRR, 2024

Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement.
CoRR, 2024

Rethinking Semantic Parsing for Large Language Models: Enhancing LLM Performance with Semantic Hints.
CoRR, 2024

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks.
CoRR, 2023

Distantly-Supervised Named Entity Recognition with Uncertainty-aware Teacher Learning and Student-student Collaborative Learning.
CoRR, 2023

Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond.
CoRR, 2023

Removing Camouflage and Revealing Collusion: Leveraging Gang-crime Pattern in Fraudster Detection.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Coarse-to-Fine Dual Encoders are Better Frame Identification Learners.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Empowering MultiModal Models' In-Context Learning Ability through Large Language Models.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2021
Traffic Accident Prediction Methods Based on Multi-factor Models.
Proceedings of the Knowledge Science, Engineering and Management, 2021

