Wei Xu

Orcid: 0000-0002-3951-7192

  • Horizon Robotics Research, Sunnyvale, CA, USA (since 2018)
  • Baidu Research, Sunnyvale, CA, USA (since 2013)
  • Facebook, Inc., Palo Alto, CA, USA (2009 - 2013)
  • NEC Laboratories America, Cupertino, CA, USA (2001 - 2009)
  • Carnegie Mellon University, School of Computer Science, Pittsburgh, PA, USA

According to our database, Wei Xu authored at least 76 papers between 2000 and 2024.


An Efficient Global Optimal Method for Cardinality Constrained Portfolio Optimization.
INFORMS J. Comput., 2024

A Unified Framework for 3D Scene Understanding.
CoRR, 2024

Solving Motion Planning Tasks with a Scalable Generative Model.
CoRR, 2024

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis.
CoRR, 2024

PointMamba: A Simple State Space Model for Point Cloud Analysis.
CoRR, 2024

VONet: Unsupervised Video Object Learning With Parallel U-Net Attention and Object-wise Sequential VAE.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Focal Inverse Distance Transform Maps for Crowd Localization.
IEEE Trans. Multim., 2023

Efficient Multi-Task and Transfer Reinforcement Learning With Parameter-Compositional Framework.
IEEE Robotics Autom. Lett., 2023

Imitation with Spatial-Temporal Heatmap: 2nd Place Solution for NuPlan Challenge.
CoRR, 2023

Policy Expansion for Bridging Offline-to-Online Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Understanding Natural Gradient in Sobolev Spaces.
CoRR, 2022

Do You Need the Entropy Reward (in Practice)?
CoRR, 2022

TransCrowd: weakly-supervised crowd counting with transformers.
Sci. China Inf. Sci., 2022

Towards Safe Reinforcement Learning with a Safety Editor Policy.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

PaCo: Parameter-Compositional Multi-task Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Distributionally Robust Q-Learning.
Proceedings of the International Conference on Machine Learning, 2022

Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

An End-to-End Transformer Model for Crowd Localization.
Proceedings of the Computer Vision - ECCV 2022, 2022

The Tenth Visual Object Tracking VOT2022 Challenge Results.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

TransCrowd: Weakly-Supervised Crowd Counting with Transformer.
CoRR, 2021

TASAC: Temporally Abstract Soft Actor-Critic for Continuous Control.
CoRR, 2021

MetaView: Few-shot Active Object Recognition.
CoRR, 2021

Reciprocal Distance Transform Maps for Crowd Counting and People Localization in Dense Crowd.
CoRR, 2021

TAAC: Temporally Abstract Actor-Critic for Continuous Control.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Generative Particle Variational Inference via Estimation of Functional Gradients.
Proceedings of the 38th International Conference on Machine Learning, 2021

Hierarchical Reinforcement Learning by Discovering Intrinsic Options.
Proceedings of the 9th International Conference on Learning Representations, 2021

MOOCCubeX: A Large Knowledge-centered Repository for Adaptive Learning in MOOCs.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Optimal switching of switched systems with time delay in discrete time.
Autom., 2020

Implicit Generative Modeling for Efficient Exploration.
Proceedings of the 37th International Conference on Machine Learning, 2020

Learning Good Representation via Continuous Attention.
CoRR, 2019

UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Zero-Shot Transfer VQA Dataset.
CoRR, 2018

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos.
CoRR, 2018

Tracklet Association Tracker: An End-to-End Learning-based Association Approach for Multi-Object Tracking.
CoRR, 2018

DeepTransport: Learning Spatial-Temporal Dependency for Traffic Condition Forecasting.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Interactive Grounded Language Acquisition and Generalization in a 2D World.
Proceedings of the 6th International Conference on Learning Representations, 2018

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

LEGO: Learning Edge With Geometry All at Once by Watching Videos.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

DeLS-3D: Deep Localization and Segmentation With a 3D Semantic Map.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Occlusion Aware Unsupervised Learning of Optical Flow.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

Interactive Language Acquisition with One-shot Visual Concept Learning through a Conversational Game.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Unsupervised Learning of Geometry From Videos With Edge-Aware Depth-Normal Consistency.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

State Tracking Networks for Dialog State Tracking.
Proceedings of the Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Occlusion Aware Unsupervised Learning of Optical Flow.
CoRR, 2017

Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency.
CoRR, 2017

Unsupervised Learning Layers for Video Analysis.
CoRR, 2017

Listen, Interact and Talk: Learning to Speak via Interaction.
CoRR, 2017

A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment.
CoRR, 2017

Dynamic Computational Time for Visual Attention.
CoRR, 2017

Optimal switching for linear quadratic problem of switched systems in discrete time.
Autom., 2017

Dynamic Computational Time for Visual Attention.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation.
Trans. Assoc. Comput. Linguistics, 2016

Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

CNN-RNN: A Unified Framework for Multi-label Image Classification.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Attention to Scale: Scale-Aware Semantic Image Segmentation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN).
Proceedings of the 3rd International Conference on Learning Representations, 2015

Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images.
CoRR, 2015

Bidirectional LSTM-CRF Models for Sequence Tagging.
CoRR, 2015

Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering.
CoRR, 2015

ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering.
CoRR, 2015

Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

End-to-end learning of semantic role labeling using recurrent neural networks.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Explain Images with Multimodal Recurrent Neural Networks.
CoRR, 2014

Fast exact maximum likelihood estimation for mixture of language model.
Inf. Process. Manag., 2008

Fast exact maximum likelihood estimation for mixture of language models.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

A hierarchical naive Bayes mixture model for name disambiguation in author citations.
Proceedings of the 2005 ACM Symposium on Applied Computing (SAC), 2005

Exploration and Exploitation in Adaptive Filtering Based on Bayesian Active Learning.
Proceedings of the Twentieth International Conference on Machine Learning, 2003

Can artificial neural networks learn language models?
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Language modeling for dialog system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
