Weiming Ren

ORCID: 0009-0000-1519-6710

According to our database, Weiming Ren authored at least 26 papers between 2022 and 2026.

Bibliography

2026
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation.
CoRR, April, 2026

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time.
CoRR, April, 2026

VecGlypher: Unified Vector Glyph Generation with Language Models.
CoRR, February, 2026

2025
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming.
CoRR, December, 2025

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory.
CoRR, December, 2025

Scaling Zero-Shot Reference-to-Video Generation.
CoRR, December, 2025

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models.
CoRR, December, 2025

Hallucination Score: Towards Mitigating Hallucinations in Generative Image Super-Resolution.
CoRR, July, 2025

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning.
CoRR, May, 2025

VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation.
CoRR, May, 2025

SAMPL: Self-Attention Modelled Patch Learning for Efficient Visual Understanding.
Proceedings of the 16th ACM Multimedia Systems Conference, 2025

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

VAMBA: Understanding Hour-Long Videos with Hybrid Mamba-Transformers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation.
Trans. Mach. Learn. Res., 2024

Video Diffusion Models: A Survey.
Trans. Mach. Learn. Res., 2024

AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks.
Trans. Mach. Learn. Res., 2024

A positional-aware attention PCa detection network on multi-parametric MRI.
Signal Image Video Process., 2024

Bi-level weighted mixed-domain self-attention network for non-contact heart rate estimation.
Knowl. Based Syst., 2024

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision.
CoRR, 2024

AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks.
CoRR, 2024

StructLM: Towards Building Generalist Models for Structured Knowledge Grounding.
CoRR, 2024

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation.
CoRR, 2024

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2022
HiCu: Leveraging Hierarchy for Curriculum Learning in Automated ICD Coding.
Proceedings of the Machine Learning for Healthcare Conference, 2022

