Zhe Kong

Orcid: 0000-0003-1078-3806

According to our database, Zhe Kong authored at least 24 papers between 2014 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
LongCat-Video-Avatar 1.5 Technical Report.
CoRR, May, 2026

MotionMERGE: A Multi-granular Framework for Human Motion Editing, Reasoning, Generation, and Explanation.
CoRR, May, 2026

Multi-target point path planning algorithm for mobile robot based on probabilistic roadmap.
Intell. Serv. Robotics, January, 2026

2025
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement.
CoRR, November, 2025

InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing.
CoRR, August, 2025

Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation.
CoRR, May, 2025

DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution.
Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

FineMotion: A Dataset and Benchmark with Both Spatial and Temporal Annotation for Fine-Grained Motion Generation and Editing.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MOERL: When Mixture-Of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Scalable Dual Fingerprinting for Hierarchical Attribution of Text-to-Image Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Dual Teacher Knowledge Distillation With Domain Alignment for Face Anti-Spoofing.
IEEE Trans. Circuits Syst. Video Technol., December, 2024

Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing.
IEEE Trans. Neural Networks Learn. Syst., August, 2024

StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos.
CoRR, 2024

Enhancing Generative Generalized Zero Shot Learning via Multi-Space Constraints and Adaptive Integration.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

Enhancing Document-Level Event Extraction via Structure-Aware Heterogeneous Graph with Multi-Granularity Subsentences.
Proceedings of the IEEE International Conference on Acoustics, 2024

OMG: Occlusion-Friendly Personalized Multi-concept Generation in Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

2022
Fingerprint Presentation Attack Detection by Channel-Wise Feature Denoising.
IEEE Trans. Inf. Forensics Secur., 2022

Multi-level Fusion of Multi-modal Semantic Embeddings for Zero Shot Learning.
Proceedings of the International Conference on Multimodal Interaction, 2022

Searching Models with Nested Attention for Blind Super-Resolution.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

A Noise-Aware Framework for Blind Image Super-Resolution.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

2021
Taming Self-Supervised Learning for Presentation Attack Detection: In-Image De-Folding and Out-of-Image De-Mixing.
CoRR, 2021

Entity Extraction of Electrical Equipment Malfunction Text by a Hybrid Natural Language Processing Algorithm.
IEEE Access, 2021

2014
A New Design of Video Streaming Capture and Transmission System.
J. Multim., 2014

