Dahua Lin

ORCID: 0000-0002-8865-7896

Affiliations:
  • Chinese University of Hong Kong, Department of Information Engineering, CUHK - SenseTime Joint Lab, Hong Kong
  • Toyota Technological Institute at Chicago, IL, USA
  • Massachusetts Institute of Technology, Cambridge, MA, USA (PhD 2012)


According to our database, Dahua Lin authored at least 302 papers between 2005 and 2024.

Bibliography

2024
Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Weakly Supervised 3-D Building Reconstruction From Monocular Remote Sensing Images.
IEEE Trans. Geosci. Remote. Sens., 2024

Are We on the Right Way for Evaluating Large Vision-Language Models?
CoRR, 2024

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition.
CoRR, 2024

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models.
CoRR, 2024

GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image.
CoRR, 2024

DevBench: A Comprehensive Benchmark for Software Development.
CoRR, 2024

3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors.
CoRR, 2024

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset.
CoRR, 2024

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation.
CoRR, 2024

Data-free Weight Compress and Denoise for Large Language Models.
CoRR, 2024

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models.
CoRR, 2024

Balanced Data Sampling for Language Model Training with Clustering.
CoRR, 2024

CriticBench: Evaluating Large Language Models as Critic.
CoRR, 2024

LongWanjuan: Towards Systematic Measurement for Long Text Quality.
CoRR, 2024

Identifying Semantic Induction Heads to Understand In-Context Learning.
CoRR, 2024

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation.
CoRR, 2024

Turn Waste into Worth: Rectifying Top-k Router of MoE.
CoRR, 2024

Mixed Gaussian Flow for Diverse Trajectory Prediction.
CoRR, 2024

SepRep-Net: Multi-source Free Domain Adaptation via Model Separation And Reparameterization.
CoRR, 2024

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning.
CoRR, 2024

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K.
CoRR, 2024

SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models.
CoRR, 2024

Navigating the OverKill in Large Language Models.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

F-Eval: Assessing Fundamental Abilities with Refined Evaluation Methods.
CoRR, 2024

Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora.
CoRR, 2024

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback.
CoRR, 2024

SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting.
CoRR, 2024

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation.
CoRR, 2024

Characterization of Large Language Model Development in the Datacenter.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

SpotServe: Serving Generative Large Language Models on Preemptible Instances.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
SPTS v2: Single-Point Scene Text Spotting.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Parsing-Conditioned Anime Translation: A New Dataset and Method.
ACM Trans. Graph., 2023

A Coarse-to-Fine Framework for Automatic Video Unscreen.
IEEE Trans. Multim., 2023

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI.
CoRR, 2023

Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases.
CoRR, 2023

T-Eval: Evaluating the Tool Utilization Capability Step by Step.
CoRR, 2023

SceneWiz3D: Towards Text-guided 3D Scene Composition.
CoRR, 2023

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image.
CoRR, 2023

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want.
CoRR, 2023

OneLLM: One Framework to Align All Modalities with Language.
CoRR, 2023

Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future.
CoRR, 2023

GPT4Point: A Unified Framework for Point-Language Understanding and Generation.
CoRR, 2023

VideoBooth: Diffusion-based Video Generation with Image Prompts.
CoRR, 2023

Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering.
CoRR, 2023

VBench: Comprehensive Benchmark Suite for Video Generative Models.
CoRR, 2023

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation.
CoRR, 2023

Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation.
CoRR, 2023

Cinematic Behavior Transfer via NeRF-based Differentiable Filming.
CoRR, 2023

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting.
CoRR, 2023

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models.
CoRR, 2023

InterControl: Generate Human Motion Interactions by Controlling Every Joint.
CoRR, 2023

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions.
CoRR, 2023

Flames: Benchmarking Value Alignment of Chinese Large Language Models.
CoRR, 2023

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction.
CoRR, 2023

BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues.
CoRR, 2023

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion.
CoRR, 2023

Scaling Laws of RoPE-based Extrapolation.
CoRR, 2023

Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models.
CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models.
CoRR, 2023

Unified Human-Scene Interaction via Prompted Chain-of-Contacts.
CoRR, 2023

PointLLM: Empowering Large Language Models to Understand Point Clouds.
CoRR, 2023

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.
CoRR, 2023

Learning Referring Video Object Segmentation from Weak Annotation.
CoRR, 2023

DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering.
CoRR, 2023

MMBench: Is Your Multi-modal Model an All-around Player?
CoRR, 2023

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning.
CoRR, 2023

Scene as Occupancy.
CoRR, 2023

Proteus: Simulating the Performance of Distributed DNN Training.
CoRR, 2023

RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars.
CoRR, 2023

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer.
CoRR, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
CoRR, 2023

SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling.
CoRR, 2023

Position-Guided Point Cloud Panoptic Segmentation Transformer.
CoRR, 2023

PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling.
CoRR, 2023

Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production.
CoRR, 2023

VR-NeRF: High-Fidelity Virtualized Walkable Spaces.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production.
Proceedings of the ACM SIGGRAPH 2023 Posters, 2023

RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Human Dynamics in Autonomous Driving Scenarios.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Scene as Occupancy.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Improving Pixel-based MIM by Reducing Wasted Modeling Capability.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

E2EAI: End-to-End Deep Learning Framework for Active Investing.
Proceedings of the 4th ACM International Conference on AI in Finance, 2023

Chimera: An Analytical Optimizing Framework for Effective Compute-intensive Operators Fusion.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

CLEVA: Chinese Language Models EVAluation Platform.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Cali-NCE: Boosting Cross-modal Video Representation Learning with Calibrated Alignment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Grid-guided Neural Radiance Fields for Large Urban Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Controllable Mesh Generation Through Sparse Latent Point Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Multi-Level Logit Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking.
Proceedings of the Conference on Robot Learning, 2023

2022
Force-Aware Interface via Electromyography for Natural VR/AR Interaction.
ACM Trans. Graph., 2022

Jointly Learning the Attributes and Composition of Shots for Boundary Detection in Videos.
IEEE Trans. Multim., 2022

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-Based Perception.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

CARAFE++: Unified Content-Aware ReAssembly of FEatures.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Factor Investing with a Deep Multi-Factor Model.
CoRR, 2022

Rethinking Trajectory Prediction via "Team Game".
CoRR, 2022

Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows.
CoRR, 2022

DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition.
CoRR, 2022

Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe.
CoRR, 2022

Guided Diffusion Model for Adversarial Purification.
CoRR, 2022

Accelerating Diffusion Models via Early Stop of the Diffusion Process.
CoRR, 2022

MINI: Mining Implicit Novel Instances for Few-Shot Object Detection.
CoRR, 2022

Shoot360: Normal View Video Creation from City Panorama Footage.
Proceedings of the SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022

Audio-Driven Co-Speech Gesture Video Generation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Semi-Supervised Semantic Segmentation via Gentle Teaching Assistant.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Transcript to Video: Efficient Clip Sequencing from Texts.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Cycle-Consistent Learning for Weakly Supervised Semantic Segmentation.
Proceedings of the HCMA@MM 2022: Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis, 2022

SPTS: Single-Point Text Spotting.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

PYSKL: Towards Good Practices for Skeleton Action Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

LongTail-Bench: A Benchmark Suite for Domain-Specific Operators in Deep Learning.
Proceedings of the IEEE International Symposium on Workload Characterization, 2022

EasyView: Enabling and Scheduling Tensor Views in Deep Learning Compilers.
Proceedings of the 51st International Conference on Parallel Processing, 2022

A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion.
Proceedings of the Tenth International Conference on Learning Representations, 2022

BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering.
Proceedings of the Computer Vision - ECCV 2022, 2022

Monocular 3D Object Detection with Depth from Motion.
Proceedings of the Computer Vision - ECCV 2022, 2022

Static and Dynamic Concepts for Self-supervised Video Representation Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Mitigating Representation Bias in Action Recognition: Algorithms and Benchmarks.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

OCSampler: Compressing Videos to One Clip with Single-step Sampling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Revisiting Skeleton-based Action Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Towards Diverse and Natural Scene-aware 3D Human Motion Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Learning Diverse Fashion Collocation by Neural Graph Filtering.
IEEE Trans. Multim., 2021

Towards Statistically Provable Geometric 3D Human Pose Recovery.
SIAM J. Imaging Sci., 2021

Distributions.jl: Definition and Modeling of Probability Distributions in the JuliaStats Ecosystem.
J. Stat. Softw., 2021

Towards Balanced Learning for Instance Recognition.
Int. J. Comput. Vis., 2021

SPTS: Single-Point Text Spotting.
CoRR, 2021

CityNeRF: Building NeRF at City Scale.
CoRR, 2021

Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion.
CoRR, 2021

INTERN: A New Learning Paradigm Towards General Vision.
CoRR, 2021

WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection.
CoRR, 2021

Revisiting Skeleton-based Action Recognition.
CoRR, 2021

Welcome back!
Commun. ACM, 2021

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Few-Shot Object Detection via Association and DIscrimination.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Vision Transformer with Progressive Sampling.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

BlockPlanner: City Block Generation with Vectorized Graph Representation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

3D Building Reconstruction from Monocular Remote Sensing Images.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Visually Informed Binaural Audio Generation without Binaural Audios.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Adversarial Robustness Under Long-Tailed Distribution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Seesaw Loss for Long-Tailed Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Scene-Aware Generative Network for Human Motion Synthesis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Towards Evaluating and Training Verifiably Robust Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Probabilistic and Geometric Depth: Detecting Objects in Perspective.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Understanding the wiring evolution in differentiable neural architecture search.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Joint Semantic-geometric Learning for Polygonal Building Segmentation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Temporal ROI Align for Video Object Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Parallel Multi-Environment Shaping Algorithm for Complex Multi-step Task.
Neurocomputing, 2020

Temporal Action Detection with Structured Segment Networks.
Int. J. Comput. Vis., 2020

Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation.
CoRR, 2020

Novel Policy Seeking with Constrained Optimization.
CoRR, 2020

Evolutionary Stochastic Policy Distillation.
CoRR, 2020

Feature Pyramid Grids.
CoRR, 2020

Regularizing Reasons for Outfit Evaluation with Gradient Penalty.
CoRR, 2020

FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-based Point Clouds.
Proceedings of the UIST '20 Adjunct: The 33rd Annual ACM Symposium on User Interface Software and Technology, 2020

Real or Not Real, that is the Question.
Proceedings of the 8th International Conference on Learning Representations, 2020

SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds.
Proceedings of the Computer Vision - ECCV 2020, 2020

Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learn to Propagate Reliably on Noisy Affinity Graphs.
Proceedings of the Computer Vision - ECCV 2020, 2020

Online Multi-modal Person Search in Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

Distribution-Balanced Loss for Multi-label Classification in Long-Tailed Datasets.
Proceedings of the Computer Vision - ECCV 2020, 2020

Side-Aware Boundary Localization for More Precise Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Motion Guided 3D Pose Estimation from Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

A Unified Framework for Shot Type Classification Based on Subject Centric Lens.
Proceedings of the Computer Vision - ECCV 2020, 2020

Placepedia: Comprehensive Place Understanding with Multi-faceted Annotations.
Proceedings of the Computer Vision - ECCV 2020, 2020

Caption-Supervised Face Recognition: Training a State-of-the-Art Face Model Without Manual Annotation.
Proceedings of the Computer Vision - ECCV 2020, 2020

MovieNet: A Holistic Dataset for Movie Understanding.
Proceedings of the Computer Vision - ECCV 2020, 2020

Omni-Sourced Webly-Supervised Learning for Video Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

Self-Supervised Scene De-Occlusion.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Learning to Cluster Faces via Confidence and Connectivity Estimation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Intra- and Inter-Action Understanding via Temporal Action Parsing.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

DSNAS: Direct Neural Architecture Search Without Parameter Retraining.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

When NAS Meets Robustness: In Search of Robust Architectures Against Adversarial Attacks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Prime Sample Attention in Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Open Compound Domain Adaptation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Reconfigurable Voxels: A New Representation for LiDAR-Based Point Clouds.
Proceedings of the 4th Conference on Robot Learning, 2020

Learning a Decision Module by Imitating Driver's Control Behaviors.
Proceedings of the 4th Conference on Robot Learning, 2020

Fastened CROWN: Tightened Neural Network Robustness Certificates.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Temporal Segment Networks for Action Recognition in Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Learning Driving Decisions by Imitating Drivers' Control Behaviors.
CoRR, 2019

Learning to Synthesize Fashion Textures.
CoRR, 2019

Biased Estimates of Advantages over Path Ensembles.
CoRR, 2019

Compound Domain Adaptation in an Open World.
CoRR, 2019

MMDetection: Open MMLab Detection Toolbox and Benchmark.
CoRR, 2019

POPQORN: Quantifying Robustness of Recurrent Neural Networks.
CoRR, 2019

Online Hyper-parameter Learning for Auto-Augmentation Strategy.
CoRR, 2019

WIDER Face and Pedestrian Challenge 2018: Methods and Results.
CoRR, 2019

Policy Continuation with Hindsight Inverse Dynamics.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

POPQORN: Quantifying Robustness of Recurrent Neural Networks.
Proceedings of the 36th International Conference on Machine Learning, 2019

Convolutional Sequence Generation for Skeleton-Based Action Synthesis.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Recursive Visual Sound Separation Using Minus-Plus Net.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Graph-Based Framework to Bridge Movies and Synopses.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

CARAFE: Content-Aware ReAssembly of FEatures.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Online Hyper-Parameter Learning for Auto-Augmentation Strategy.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Adapting Object Detectors via Selective Cross-Domain Alignment.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Self-Supervised Learning via Conditional Motion Propagation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning to Cluster Faces on an Affinity Graph.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Region Proposal by Guided Anchoring.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Libra R-CNN: Towards Balanced Learning for Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning a Unified Classifier Incrementally via Rebalancing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

IRLAS: Inverse Reinforcement Learning for Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Hybrid Task Cascade for Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Monocular 3D Pose Recovery via Nonconvex Sparsity with Theoretical Analysis.
CoRR, 2018

Improving On-policy Learning with Statistical Reward Accumulation.
CoRR, 2018

From Trailers to Storylines: An Efficient Way to Learn from Movies.
CoRR, 2018

Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination.
CoRR, 2018

Trajectory Convolution for Action Recognition.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

A Neural Compositional Paradigm for Image Captioning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Penalizing Top Performers: Conservative Loss for Semantic Segmentation Adaptation.
Proceedings of the Computer Vision - ECCV 2018, 2018

PSANet: Point-wise Spatial Attention Network for Scene Parsing.
Proceedings of the Computer Vision - ECCV 2018, 2018

Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018

Pose Guided Human Video Generation.
Proceedings of the Computer Vision - ECCV 2018, 2018

Move Forward and Tell: A Progressive Generator of Video Descriptions.
Proceedings of the Computer Vision - ECCV 2018, 2018

Find and Focus: Retrieve and Localize Video Events with Natural Language Queries.
Proceedings of the Computer Vision - ECCV 2018, 2018

Person Search in Videos with One Portrait Through Visual and Temporal Links.
Proceedings of the Computer Vision - ECCV 2018, 2018

Lifelong Learning via Progressive Distillation and Retrospection.
Proceedings of the Computer Vision - ECCV 2018, 2018

Rethinking the Form of Latent States in Image Captioning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Recognize Actions by Disentangling Components of Dynamics.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Unsupervised Feature Learning via Non-Parametric Instance Discrimination.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Learning Globally Optimized Object Detector via Policy Gradient.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Low-Latency Video Semantic Segmentation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Unifying Identification and Context Learning for Person Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Optimizing Video Object Detection via a Scale-Time Lattice.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Accelerated Training for Massive Classification via Dynamic Class Selection.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Probabilistic Ensemble of Collaborative Filters.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Generative Adversarial Frontal View to Bird View Synthesis.
Proceedings of the 2018 International Conference on 3D Vision, 2018

2017
Peephole: Predicting Network Performance Before Training.
CoRR, 2017

Learning Sparse Visual Representations with Leaky Capped Norm Regularizers.
CoRR, 2017

A Pursuit of Temporal Accuracy in General Activity Detection.
CoRR, 2017

Contrastive Learning for Image Captioning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Scalable Estimation of Dirichlet Process Mixture Models on Distributed Data.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Integrating Specialized Classifiers Based on Continuous Time Markov Chain.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Be Your Own Prada: Fashion Synthesis with Structural Coherence.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Towards Diverse and Natural Image Descriptions via a Conditional GAN.
Proceedings of the IEEE International Conference on Computer Vision, 2017

PolyNet: A Pursuit of Structural Diversity in Very Deep Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

UntrimmedNets for Weakly Supervised Action Recognition and Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Detecting Visual Relationships with Deep Relational Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Discover and Learn New Objects from Documentaries.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Joint Inference of Objects and Scenes With Efficient Learning of Text-Object-Scene Relations.
IEEE Trans. Multim., 2016

CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2016.
CoRR, 2016

Deep Markov Random Field for Image Modeling.
Proceedings of the Computer Vision - ECCV 2016, 2016

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

2015
Adjustable Bounded Rectifiers: Towards Deep Binary Representations.
CoRR, 2015

Generating Multi-Sentence Lingual Descriptions of Indoor Scenes.
CoRR, 2015

Recognize complex events from static images by fusing deep channels.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Generating Multi-sentence Natural Language Descriptions of Indoor Scenes.
Proceedings of the British Machine Vision Conference 2015, 2015

2014
Mining text snippets for images on the web.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

Visual Semantic Search: Retrieving Videos via Complex Textual Queries.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

What Are You Talking About? Text-to-Image Coreference.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Characterizing Layouts of Outdoor Scenes Using Spatial Topic Processes.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Holistic Scene Understanding for 3D Object Detection with RGBD Cameras.
Proceedings of the IEEE International Conference on Computer Vision, 2013

Hidden Factor Analysis for Age Invariant Face Recognition.
Proceedings of the IEEE International Conference on Computer Vision, 2013

2012
Generative modeling of dynamic visual scenes.
PhD thesis, 2012

Efficient Sampling from Combinatorial Space via Bridging.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

Coupling Nonparametric Mixtures via Latent Dirichlet Processes.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Learning Deformations with Parallel Transport.
Proceedings of the Computer Vision - ECCV 2012, 2012

Low level vision via switchable Markov random fields.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Manifold guided composite of Markov random fields for image modeling.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2010
Construction of Dependent Dirichlet Processes based on Poisson Processes.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Joint People, Event, and Location Recognition in Personal Photo Collections Using Cross-Domain Context.
Proceedings of the Computer Vision - ECCV 2010, 2010

Modeling and estimating persistent motion with geometric flows.
Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, 2010

2009
Nonparametric Discriminant Analysis for Face Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2009

Learning visual flows: A Lie algebraic approach.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

2007
Quality-Driven Face Occlusion Detection and Recovery.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

Discriminant Mutual Subspace Learning for Indoor and Outdoor Face Recognition.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Inter-modality Face Recognition.
Proceedings of the Computer Vision - ECCV 2006, 2006

Conditional Infomax Learning: An Integrated Framework for Feature Extraction and Fusion.
Proceedings of the Computer Vision - ECCV 2006, 2006

Pursuing Informative Projection on Grassmann Manifold.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

Recognize High Resolution Faces: From Macrocosm to Microcosm.
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2006), 2006

2005
Neighbor combination and transformation for hallucinating faces.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Face hallucination through dual associative learning.
Proceedings of the 2005 International Conference on Image Processing, 2005

Comparative study: face recognition on unspecific persons using linear subspace methods.
Proceedings of the 2005 International Conference on Image Processing, 2005

Feedback-based dynamic generalized LDA for face recognition.
Proceedings of the 2005 International Conference on Image Processing, 2005

Tensor-based factor decomposition for relighting.
Proceedings of the 2005 International Conference on Image Processing, 2005

Layered local prediction network with dynamic learning for face super-resolution.
Proceedings of the 2005 International Conference on Image Processing, 2005

Coupled Space Learning for Image Style Transformation.
Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

Hallucinating Faces: TensorPatch Super-Resolution and Coupled Residue Compensation.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005

Nonparametric Subspace Analysis for Face Recognition.
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), 2005

