Yushuo Guan

Orcid: 0000-0001-5258-2397

According to our database, Yushuo Guan authored at least 18 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning.
CoRR, May, 2026

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models.
CoRR, February, 2026

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers.
CoRR, February, 2026

2025
GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models.
CoRR, December, 2025

The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss.
CoRR, December, 2025

VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks.
CoRR, June, 2025

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2023
DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search.
IEEE Trans. Neural Networks Learn. Syst., December, 2023

2021
EPASS360: QoE-Aware 360-Degree Video Streaming Over Mobile Devices.
IEEE Trans. Mob. Comput., 2021

2020
PERM: Neural Adaptive Video Streaming with Multi-path Transmission.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020

Preference-Aware Mask for Session-Based Recommendation with Bidirectional Transformer.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A New Perspective for Flexible Feature Gathering in Scene Text Recognition Via Character Anchor Pooling.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Differentiable Feature Aggregation Search for Knowledge Distillation.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
Alchemy: Techniques for Rectification Based Irregular Scene Text Recognition.
CoRR, 2019

Symmetry-Constrained Rectification Network for Scene Text Recognition.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Neural Attack Model for Cracking Passwords in Adversarial Environments.
Proceedings of the 2019 IEEE/CIC International Conference on Communications in China, 2019

