Ruohan Gao

ORCID: 0000-0002-8346-1114

According to our database, Ruohan Gao authored at least 53 papers between 2015 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
AFFORD2ACT: Affordance-Guided Automatic Keypoint Selection for Generalizable and Lightweight Robotic Manipulation.
CoRR, October, 2025

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning.
CoRR, August, 2025

Towards Perception-Informed Latent HRTF Representations.
CoRR, July, 2025

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception.
CoRR, June, 2025

ControlTac: Force- and Position-Controlled Tactile Data Augmentation with a Single Reference Image.
CoRR, May, 2025

Differentiable Room Acoustic Rendering with Multi-View Vision Priors.
CoRR, April, 2025

Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs.
CoRR, March, 2025

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs.
CoRR, January, 2025

Predicting the Distribution of Ailanthus altissima Using Deep Learning-Based Analysis of Satellite Imagery.
Symmetry, 2025

Hearing Anywhere in Any Environment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Learning to Highlight Audio by Watching Movies.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Multisensory Machine Intelligence.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Leveraging Cognitive Conflict and Organizational Unlearning for Digital Mastery in SMEs: Insights From Upper Echelons.
IEEE Trans. Engineering Management, 2024

DiffSound: Differentiable Modal Sound Rendering and Inverse Rendering for Diverse Inference Tasks.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

Spherical World-Locking for Audio-Visual Localization in Egocentric Videos.
Proceedings of the Computer Vision - ECCV 2024, 2024

MEERKAT: Audio-Visual Large Language Model for Grounding in Space and Time.
Proceedings of the Computer Vision - ECCV 2024, 2024

VMFTransformer: An Angle-Preserving and Auto-Scaling Machine for Multi-Horizon Probabilistic Forecasting.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

Hearing Anything Anywhere.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Visually-Guided Audio Spatialization in Video with Geometry-Aware Multi-task Learning.
Int. J. Comput. Vis., October, 2023

Differentiable Physics Simulation of Dynamics-Augmented Neural Objects.
IEEE Robotics Autom. Lett., May, 2023

Learning Object-Centric Neural Scattering Functions for Free-viewpoint Relighting and Scene Composition.
Trans. Mach. Learn. Res., 2023

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects.
CoRR, 2023

An Extensible Multimodal Multi-task Object Dataset with Materials.
CoRR, 2023

SoundCam: A Dataset for Finding Humans Using Room Acoustics.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

An Extensible Multi-modal Multi-task Object Dataset with Materials.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

REALIMPACT: A Dataset of Impact Sound Fields for Real Objects.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities.
Proceedings of the Conference on Robot Learning, 2023

2022
Determining critical lung cancer subtypes from gigapixel multi-scale whole slide H&E stains images.
Proceedings of the 5th International Conference on Data Science and Information Technology, 2022

ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Visual Acoustic Matching.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation.
Proceedings of the Conference on Robot Learning, 2022

2021
Learning to Set Waypoints for Audio-Visual Navigation.
Proceedings of the 9th International Conference on Learning Representations, 2021

VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

DiffImpact: Differentiable Rendering and Identification of Impact Sounds.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021

2020
Audio-Visual Waypoints for Navigation.
CoRR, 2020

VisualEchoes: Spatial Image Representation Learning Through Echolocation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Listen to Look: Action Recognition by Previewing Audio.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Co-Separating Sounds of Visual Objects.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2.5D Visual Sound.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
ShapeCodes: Self-supervised Feature Learning by Lifting Views to Viewgrids.
Proceedings of the Computer Vision - ECCV 2018, 2018

Learning to Separate Object Sounds by Watching Unlabeled Video.
Proceedings of the Computer Vision - ECCV 2018, 2018

Im2Flow: Motion Hallucination From Static Images for Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Unsupervised learning through one-shot image-based shape reconstruction.
CoRR, 2017

On-demand Learning for Deep Image Restoration.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
From One-Trick Ponies to All-Rounders: On-Demand Learning for Image Restoration.
CoRR, 2016

Accelerating graph mining algorithms via uniform random edge sampling.
Proceedings of the 2016 IEEE International Conference on Communications, 2016

Object-Centric Representation Learning from Unlabeled Videos.
Proceedings of the Computer Vision - ACCV 2016, 2016

2015
Graph Property Preservation under Community-Based Sampling.
Proceedings of the 2015 IEEE Global Communications Conference, 2015

