Yaosi Hu

Chang Wen Chen

IEEE Trans. Image Process., 2026

2025

Quality Assessment of Audiovisual Communication in Videotelephony: New ITU-T P.940 and P Suppl.31.

Rafael G. Sotelo Bovino

IEEE Trans. Consumer Electron., November, 2025

LaMD: Latent Motion Diffusion for Image-Conditional Video Generation.

Int. J. Comput. Vis., July, 2025

LongCaptioning: Unlocking the Power of Long Video Caption Generation in Large Multimodal Models.

CoRR, February, 2025

Remote Sensing Semantic Segmentation Quality Assessment Based on Vision Language Model.

IEEE Trans. Geosci. Remote. Sens., 2025

TIV-Diffusion: Towards Object-Centric Movement for Text-driven Image to Video Generation.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

A Benchmark for Controllable Text-Image-to-Video Generation.

IEEE Trans. Multim., 2024

Memory-guided representation matching for unsupervised video anomaly detection.

Yiran Tao

J. Vis. Commun. Image Represent., 2024

2023

Multiple visual relationship forecasting and arrangement in videos.

Neurocomputing, July, 2023

LaMD: Latent Motion Diffusion for Video Generation.

CoRR, 2023

A Lightweight No-reference Video Quality Assessment Method.

Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

2022

Subjective Evaluation of Visual Quality and Simulator Sickness of Short 360° Videos: ITU-T Rec. P.919.

IEEE Trans. Multim., 2022

Predicate Correlation Learning for Scene Graph Generation.

IEEE Trans. Image Process., 2022

Decomposing style, content, and motion for videos.

J. Vis. Commun. Image Represent., 2022

Learning Human Cognitive Appraisal Through Reinforcement Memory Unit.

CoRR, 2022

Video Quality Assessment based on Quality Aggregation Networks.

Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2022

Make It Move: Controllable Image-to-Video Generation with Text Descriptions.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Subjective Quality Assessment of One-to-One Video-Telephony Services.

Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, 2022

2021

MAPS: Joint Multimodal Attention and POS Sequence Generation for Video Captioning.

Proceedings of the International Conference on Visual Communications and Image Processing, 2021

Learn to Look Around: Deep Reinforcement Learning Agent for Video Saliency Prediction.

Yiran Tao

Proceedings of the International Conference on Visual Communications and Image Processing, 2021

2020

Exploiting the local temporal information for video captioning.

J. Vis. Commun. Image Represent., 2020

A Multimodal Variational Encoder-Decoder Framework for Micro-video Popularity Prediction.

Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

Subjective Study of Perceptual Quality for Micro-Video Applications.

Proceedings of the 3rd IEEE Conference on Multimedia Information Processing and Retrieval, 2020

2019

Hierarchical Global-Local Temporal Modeling for Video Captioning.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Two-Stream Refinement Network for RGB-D Saliency Detection.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

2018

RGB-D Semantic Segmentation: A Review.
