Guanglu Song

ORCID: 0000-0001-5391-5712

According to our database, Guanglu Song authored at least 71 papers between 2017 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation.

CoRR, April, 2026

Improving Joint Audio-Video Generation with Cross-Modal Context Learning.

CoRR, March, 2026

AR-CoPO: Align Autoregressive Video Generation with Contrastive Policy Optimization.

CoRR, March, 2026

2025

Neighbor GRPO: Contrastive ODE Policy Optimization Aligns Flow Models.

CoRR, November, 2025

Towards Seamless Borders: A Method for Mitigating Inconsistencies in Image Inpainting and Outpainting.

CoRR, June, 2025

ADT: Tuning Diffusion Models with Adversarial Supervision.

CoRR, April, 2025

High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning.

CoRR, March, 2025

Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning.

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

See Further When Clear: Curriculum Consistency Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping.

CoRR, 2024

See Further When Clear: Curriculum Consistency Model.

CoRR, 2024

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines.

CoRR, 2024

Phased Consistency Model.

Fu-Yun Wang

Zhaoyang Huang

Alexander William Bergman

CoRR, 2024

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models.

CoRR, 2024

AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning.

CoRR, 2024

AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data.

Proceedings of the SIGGRAPH Asia 2024 Technical Communications, 2024

MoVA: Adapting Mixture of Vision Experts to Multimodal Context.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Phased Consistency Models.

Fu-Yun Wang

Zhaoyang Huang

Alexander William Bergman

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks.

Proceedings of the Computer Vision - ECCV 2024, 2024

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Be-Your-Outpainter: Mastering Video Outpainting Through Input-Specific Adaptation.

Proceedings of the Computer Vision - ECCV 2024, 2024

ZoLA: Zero-Shot Creative Long Animation Generation with Short Video Model.

Proceedings of the Computer Vision - ECCV 2024, 2024

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis.

Proceedings of the Computer Vision - ECCV 2024, 2024

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

LMDrive: Closed-Loop End-to-End Driving with Large Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Teach-DETR: Better Training DETR With Teachers.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

UniFormer: Unifying Convolution and Self-Attention for Visual Recognition.

IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Towards Large-scale Masked Face Recognition.

CoRR, 2023

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising.

CoRR, 2023

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DETRs with Collaborative Hybrid Assignments Training.

Zhuofan Zong

Guanglu Song

Yu Liu

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Masked Autoencoders Are Stronger Knowledge Distillers.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

Large-batch Optimization for Dense Visual Predictions.

CoRR, 2022

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning.

CoRR, 2022

Large-batch Optimization for Dense Visual Predictions: Training Faster R-CNN in 4.2 Minutes.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Self-slimmed Vision Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

Towards Robust Face Recognition with Comprehensive Search.

Proceedings of the Computer Vision - ECCV 2022, 2022

Rethinking Robust Representation Learning Under Fine-Grained Noisy Faces.

Proceedings of the Computer Vision - ECCV 2022, 2022

UniNet: Unified Architecture Search with Convolution, Transformer, and MLP.

Proceedings of the Computer Vision - ECCV 2022, 2022

Unifying Visual Perception by Dispersible Points Learning.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

INTERN: A New Learning Paradigm Towards General Vision.

CoRR, 2021

FNAS: Uncertainty-Aware Fast Neural Architecture Search.

CoRR, 2021

Scale Semantic Flow Preserving Across Image Pyramid.

Zhili Lin

Guanglu Song

Biao Leng

Proceedings of the Neural Information Processing - 28th International Conference, 2021

PCNET: Parallelly Conquer the Large Variance of Person Re-Identification.

Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Rectifying the Data Bias in Knowledge Distillation.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Switchable K-class Hyperplanes for Noise-Robust Representation Learning.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

Weighted triple-sequence loss for video-based person re-identification.

Neurocomputing, 2020

1st place solution for AVA-Kinetics Crossover in AcitivityNet Challenge 2020.

CoRR, 2020

1st Place Solutions for OpenImage2019 - Object Detection and Instance Segmentation.

CoRR, 2020

Top-1 Solution of Multi-Moments in Time Challenge 2019.

CoRR, 2020

Discriminability Distillation in Group Representation Learning.

Proceedings of the Computer Vision - ECCV 2020, 2020

Revisiting the Sibling Head in Object Detector.

Guanglu Song

Yu Liu

Xiaogang Wang

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

KPNet: Towards Minimal Face Detector.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Spatial-Transformed Regional Quality Estimation Network for Large-Variance Person Re-Identification.

IEEE Access, 2019

Scale Pyramid Attention for Single Shot MultiBox Detector.

IEEE Access, 2019

Towards Flops-Constrained Face Recognition.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

2018

Fast Portrait Matting Using Spatial Detail-Preserving Network.

Proceedings of the Neural Information Processing - 25th International Conference, 2018

Transductive Centroid Projection for Semi-supervised Large-Scale Recognition.

Proceedings of the Computer Vision - ECCV 2018, 2018

Beyond Trade-Off: Accelerate FCN-Based Face Detector With Higher Accuracy.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Region-Based Quality Estimation Network for Large-Scale Person Re-Identification.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Spatial Quality Aware Network for Video-Based Person Re-identification.

Yujie Wang

Biao Leng

Guanglu Song

Proceedings of the Neural Information Processing - 24th International Conference, 2017

A Multi-level Weighted Representation for Person Re-identification.

Xianglai Meng

Biao Leng

Guanglu Song

Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2017, 2017
