Haoyuan Shi

This is a disambiguation page: it lists papers by multiple people with the same or a similar name.

Bibliography

2026
CSHNet: A Novel Information Asymmetric Image Translation Method.
IEEE Trans. Circuits Syst. Video Technol., February, 2026

MSVBench: Towards Human-Level Evaluation of Multi-Shot Video Generation.
CoRR, February, 2026

Efficient Plug-and-Play Weight Refinement for Sparse Large Models.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
EventSTU: Event-Guided Efficient Spatio-Temporal Understanding for Video Large Language Models.
CoRR, November, 2025

Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data.
CoRR, November, 2025

UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE.
CoRR, October, 2025

Johnson-Lindenstrauss Lemma Guided Network for Efficient 3D Medical Segmentation.
CoRR, September, 2025

Does DINOv3 Set a New Medical Vision Standard?
CoRR, September, 2025

AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation.
CoRR, June, 2025

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models.
CoRR, May, 2025

AI Awareness.
CoRR, April, 2025

VideoVista-CulturalLingo: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension.
CoRR, April, 2025

Interpretable Dynamic Directed Graph Convolutional Network for Multi-Relational Prediction of Missense Mutation and Drug Response.
IEEE J. Biomed. Health Informatics, February, 2025

S³OIL: Semi-Supervised SAR-to-Optical Image Translation via Multi-Scale and Cross-Set Matching.
IEEE Trans. Image Process., 2025

DRExplainer: Quantifiable interpretability in drug response prediction with directed graph convolutional network.
Artif. Intell. Medicine, 2025

AniMaker: Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation.
Proceedings of the SIGGRAPH Asia 2025 Conference Papers, 2025

TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

GTGM: Generative Text-Guided 3D Vision-Language Pretraining for Medical Image Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

VideoVista-CulturalLingo: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Reusability report: Uncovering associations in biomedical bipartite networks via a bilinear attention network with domain adaptation.
Nat. Mac. Intell., 2024

Controllable Edge-Type-Specific Interpretation in Multi-Relational Graph Neural Networks for Drug Response Prediction.
CoRR, 2024

DRExplainer: Quantifiable Interpretability in Drug Response Prediction with Directed Graph Convolutional Network.
CoRR, 2024

VideoVista: A Versatile Benchmark for Video Understanding and Reasoning.
CoRR, 2024

TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction.
CoRR, 2024

Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

Separating Spectral and Spatial Feature Aggregation for Demosaicking.
Proceedings of the International Joint Conference on Neural Networks, 2024

VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Tracking Control of Mobile Robots for Moving Targets in Unknown Environments Based on Improved Dynamic Window Approach.
Proceedings of the 5th International Conference on Artificial Intelligence and Computer Engineering, 2024

KIE-STCformer: Key Information Enhanced Spatio-Temporal Correction Transformer for Time Series Forecasting.
Proceedings of the 2024 8th International Conference on Computer Science and Artificial Intelligence, 2024

Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
GRA-GCN: Dense Granule Protein Prediction in Apicomplexa Protozoa Through Graph Convolutional Network.
IEEE ACM Trans. Comput. Biol. Bioinform., 2023

Toward Moiré-Free and Detail-Preserving Demosaicking.
CoRR, 2023

