Yuhang He

This is a disambiguation page: it lists papers by several people with the same or a similar name.

Bibliography

2026
ReVision: Scaling Computer-Use Agents via Temporal Visual Redundancy Reduction.
CoRR, May, 2026

Hypergraph and Latent ODE Learning for Multimodal Root Cause Localization in Microservices.
CoRR, May, 2026

Rethinking Token-Level Credit Assignment in RLVR: A Polarity-Entropy Analysis.
CoRR, April, 2026

Training-free Spatially Grounded Geometric Shape Encoding (Technical Report).
CoRR, April, 2026

OS-Marathon: Benchmarking Computer-Use Agents on Long-Horizon Repetitive Tasks.
CoRR, January, 2026

Joint Design for IRS-Assisted Integrated Radar and Communication Systems: Multi-Target Detection and Multi-User Interference Management.
IEEE Trans. Wirel. Commun., 2026

Information-coupled MRI acceleration via multi-modal mapping and progressive masking.
Pattern Recognit., 2026

SAGNet: Enhancing Sparse Power Line Point Cloud Segmentation With EdgeConv and Structure Tensor Gating.
IEEE Access, 2026

2025
User multi-dimensional prior preferences adaptive balancing based next POI recommendation.
Knowl. Inf. Syst., December, 2025

MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis.
CoRR, December, 2025

Space-Frequency Transmit Sequence Design for Dual-Function RadCom System.
IEEE Wirel. Commun. Lett., September, 2025

Prestack Seismic Waveform Classification via Physical Knowledge-Guided Disentangled Representation.
IEEE Trans. Geosci. Remote. Sens., 2025

VSP Wavefield Separation via Physical Prior-Guided Convolutional Autoencoder.
IEEE Geosci. Remote. Sens. Lett., 2025

The Effective Singular Value Ratio: A Novel Criterion for Adaptive Data Selection in DeePC.
IEEE Control. Syst. Lett., 2025

SoundLoc3D: Invisible 3D Sound Source Localization and Classification Using a Multimodal RGB-D Acoustic Camera.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

Wavelet-3DGS: Wavelet Domain Joint Compaction and Compression of 3D Gaussian Splatting.
Proceedings of the Picture Coding Symposium, 2025

SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

DiffRefine: Diffusion-Based Proposal Specific Point Cloud Densification for Cross-Domain Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

SoundTRC: DNN-based Acoustic Target Region Control.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

B2-CTA: Deep Learning Model for Network Intrusion Detection.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2025

RiTTA: Modeling Event Relations in Text-to-Audio Generation.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
Embodied Intelligence in Mining: Leveraging Multi-Modal Large Language Models for Autonomous Driving in Mines.
IEEE Trans. Intell. Veh., May, 2024

Sora for Smart Mining: Towards Sustainability With Imaginative Intelligence and Parallel Intelligence.
IEEE Trans. Intell. Veh., April, 2024

DriveMLLM: A Benchmark for Spatial Understanding with Multimodal Large Language Models in Autonomous Driving.
CoRR, 2024

Overview of AI-Debater 2023: The Challenges of Argument Generation Tasks.
CoRR, 2024

SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field.
CoRR, 2024

Emerging Synergies Between Large Language Models and Machine Learning in Ecommerce Recommendations.
CoRR, 2024

4D Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes.
CoRR, 2024

Adaptive State Feedback Shared Control for Unmanned Surface Vehicle With Fixed-Time Prescribed Performance Control.
IEEE Access, 2024

Sound3DVDet: 3D Sound Source Detection using Multiview Microphone Array and RGB Images.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

Bridge the Modality and Capability Gaps in Vision-Language Model Selection.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards Learning Group-Equivariant Features for Domain Adaptive 3D Detection.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Deep Neural Room Acoustics Primitive.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

BC-Prover: Backward Chaining Prover for Formal Theorem Proving.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Decomposing Argumentative Essay Generation via Dialectical Planning of Complex Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Feature-based POI grouping with transformer for next point of interest recommendation.
Appl. Soft Comput., November, 2023

Backward THz Emission from Two-Color Laser Field-Induced Air Plasma Filament.
Sensors, 2023

Bayesian inference and neural estimation of acoustic wave propagation.
CoRR, 2023

Biased Technological Progress, Factor Price Distribution, and Overcapacity: A Case from China.
Complex., 2023

Metric-Free Exploration for Topological Mapping by Task and Motion Imitation in Feature Space.
Proceedings of the Robotics: Science and Systems XIX, Daegu, 2023

Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SRACas: A Social Role-Aware Graph Neural Network-Based Model for Popularity Prediction of Information Cascades.
Proceedings of the Database Systems for Advanced Applications, 2023

SoundSynp: Sound Source Detection from Raw Waveforms with Multi-Scale Synperiodic Filterbanks.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
An Integrated Entropy-COPRAS Framework for Ningbo-Zhoushan Port Logistics Development from the Perspective of Dual Circulation.
Syst., 2022

DeepAoANet: Learning Angle of Arrival From Software Defined Radios With Deep Neural Networks.
IEEE Access, 2022

SoundDoA: Learn Sound Source Direction of Arrival and Semantics from Sound Raw Waveforms.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Generative Model for End-to-End Argument Mining with Reconstructed Positional Encoding and Constrained Pointer Mechanism.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Learning 3D Semantics From Pose-Noisy 2D Images with Hierarchical Full Attention Network.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

2021
Experimental research of ploughing extrusion forming multi-tooth tool wear and damage mechanism and workpiece defects.
J. Comput. Methods Sci. Eng., 2021

DeepAoANet: Learning Angle of Arrival from Software Defined Radios with Deep Neural Networks.
CoRR, 2021

SoundDet: Polyphonic Sound Event Detection and Localization from Raw Waveform.
CoRR, 2021

SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform.
Proceedings of the 38th International Conference on Machine Learning, 2021

A Novel Contrast Operator for Robust Object Searching.
Proceedings of the 17th International Conference on Computational Intelligence and Security CIS 2021, 2021

2019
Deep Integration: A Multi-Label Architecture for Road Scene Recognition.
IEEE Trans. Image Process., 2019

Real-Time Vehicle Detection from Short-range Aerial Image with Compressed MobileNet.
Proceedings of the International Conference on Robotics and Automation, 2019

Trajectory Tracking Control of a Large Six-DOF Platform Based on Integral-Type Terminal Sliding Mode Control and Bat Algorithm.
Proceedings of the 2019 International Conference on Control, 2019

2018
Dress Fashionably: Learn Fashion Collocation With Deep Mixed-Category Metric Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Transforming a 3-D LiDAR Point Cloud Into a 2-D Dense Depth Map Through a Parameter Self-Adaptive Framework.
IEEE Trans. Intell. Transp. Syst., 2017

Let the robot tell: Describe car image with natural language via LSTM.
Pattern Recognit. Lett., 2017

Real-Time Fashion-Guided Clothing Semantic Parsing: A Lightweight Multi-Scale Inception Neural Network and Benchmark.
Proceedings of the Workshops of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Multi-task Relative Attribute Prediction by Incorporating Local Context and Global Style Information.
Proceedings of the British Machine Vision Conference 2016, 2016

Fast Fashion Guided Clothing Image Retrieval: Delving Deeper into What Feature Makes Fashion.
Proceedings of the Computer Vision - ACCV 2016, 2016

2015
A novel way to organize 3D LiDAR point cloud as 2D depth map, height map and surface normal map.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Robust optimization with credibility factor for graph-based SLAM.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Sparse depth map upsampling with RGB image and anisotropic diffusion tensor.
Proceedings of the 2015 IEEE Intelligent Vehicles Symposium, 2015

2014
Using edit distance and junction feature to detect and recognize arrow road marking.
Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems, 2014

