Harry Yang

According to our database, Harry Yang authored at least 33 papers between 2014 and 2025.

Bibliography

2025
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis.
CoRR, August, 2025

Hierarchical Fine-grained Preference Optimization for Physically Plausible Video Generation.
CoRR, August, 2025

Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments.
CoRR, August, 2025

Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control.
CoRR, August, 2025

Enhancing Vector Quantization with Distributional Matching: A Theoretical and Empirical Study.
CoRR, June, 2025

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding.
CoRR, June, 2025

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models.
CoRR, April, 2025

Temporal Regularization Makes Your Video Generator Stronger.
CoRR, March, 2025

Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View.
CoRR, March, 2025

VideoMerge: Towards Training-free Long Video Generation.
CoRR, March, 2025

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization.
CoRR, March, 2025

VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer.
CoRR, February, 2025

Encrypted Large Model Inference: The Equivariant Encryption Paradigm.
CoRR, February, 2025

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Next Patch Prediction for Autoregressive Visual Generation.
CoRR, 2024

VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation.
CoRR, 2024

OmniCreator: Self-Supervised Unified Generation with Universal Editing.
CoRR, 2024

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses.
CoRR, 2024

Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments.
CoRR, 2024

Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference.
CoRR, 2024

Complete Security and Privacy for AI Inference in Decentralized Systems.
CoRR, 2024

AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks.
CoRR, 2024

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation.
CoRR, 2024

2023
Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation.
CoRR, 2023

Make-A-Video: Text-to-Video Generation without Text-Video Data.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out Distribution Robustness.
CoRR, 2022

Using Mixup as a Regularizer Can Surprisingly Improve Accuracy & Out-of-Distribution Robustness.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration.
Proceedings of the Computer Vision - ECCV 2022, 2022

Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Robustness and Generalization via Generative Adversarial Training.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2019
Fine-grained Synthesis of Unrestricted Adversarial Examples.
CoRR, 2019

2014
Low-rank SIFT: An affine invariant feature for place recognition.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

