Peihao Chen

Orcid: 0000-0002-6847-1621

According to our database, Peihao Chen authored at least 23 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
3D-VLA: A 3D Vision-Language-Action Generative World Model.
CoRR, 2024

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World.
CoRR, 2024

2023
A Simple Knowledge Distillation Framework for Open-world Object Detection.
CoRR, 2023

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning.
CoRR, 2023

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding.
CoRR, 2023

A²Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models.
CoRR, 2023

Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition.
CoRR, 2023

Detecting the open-world objects with the help of the Brain.
CoRR, 2023

FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

3D-LLM: Injecting the 3D World into Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Vision-and-Language Navigation from YouTube Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Masked Motion Encoding for Self-Supervised Video Representation Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
M³Video: Masked Motion Modeling for Self-Supervised Video Representation Learning.
CoRR, 2022

Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning Active Camera for Multi-Object Navigation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Relation Attention for Temporal Action Localization.
IEEE Trans. Multim., 2020

Generating Visually Aligned Sound From Videos.
IEEE Trans. Image Process., 2020

Foley Music: Learning to Generate Music from Videos.
Proceedings of the Computer Vision - ECCV 2020, 2020

Dense Regression Network for Video Grounding.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Location-Aware Graph Convolutional Networks for Video Question Answering.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Breaking Winner-Takes-All: Iterative-Winners-Out Networks for Weakly Supervised Temporal Action Localization.
IEEE Trans. Image Process., 2019

Self-Supervised Moving Vehicle Tracking With Stereo Sound.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

