Yushuo Guan

Orcid: 0000-0001-5258-2397

According to our database, Yushuo Guan authored at least 18 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning.

CoRR, May, 2026

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models.

CoRR, February, 2026

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers.

CoRR, February, 2026

2025

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models.

CoRR, December, 2025

The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss.

CoRR, December, 2025

VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks.

CoRR, June, 2025

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2023

DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search.

IEEE Trans. Neural Networks Learn. Syst., December, 2023

2021

EPASS360: QoE-Aware 360-Degree Video Streaming Over Mobile Devices.

IEEE Trans. Mob. Comput., 2021

2020

PERM: Neural Adaptive Video Streaming with Multi-path Transmission.

Proceedings of the 39th IEEE Conference on Computer Communications, 2020

Preference-Aware Mask for Session-Based Recommendation with Bidirectional Transformer.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

A New Perspective for Flexible Feature Gathering in Scene Text Recognition Via Character Anchor Pooling.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Differentiable Feature Aggregation Search for Knowledge Distillation.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

Alchemy: Techniques for Rectification Based Irregular Scene Text Recognition.

CoRR, 2019

Symmetry-Constrained Rectification Network for Scene Text Recognition.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Neural Attack Model for Cracking Passwords in Adversarial Environments.

Proceedings of the 2019 IEEE/CIC International Conference on Communications in China, 2019
