Hongfei Xue

Orcid: 0000-0001-9691-9668

According to our database, Hongfei Xue authored at least 67 papers between 2016 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

CLLAP: Contrastive Learning-based LiDAR-Augmented Pretraining for Enhanced Radar-Camera Fusion.

CoRR, April, 2026

HumDial-EIBench: A Human-Recorded Multi-Turn Emotional Intelligence Benchmark for Audio Language Models.

CoRR, April, 2026

LiveGesture Streamable Co-Speech Gesture Generation Model.

With Muhammad Usama Saleem, Mayur Jagdishbhai Patel, Ekkasit Pinyoanuntapong, and others.

CoRR, April, 2026

FastTurn: Unifying Acoustic and Streaming Semantic Cues for Low-Latency and Robust Turn Detection.

CoRR, April, 2026

KHMP: Frequency-Domain Kalman Refinement for High-Fidelity Human Motion Prediction.

CoRR, March, 2026

Monocular Models are Strong Learners for Multi-View Human Mesh Recovery.

With Muhammad Usama Saleem and others.

CoRR, March, 2026

OSUM-Pangu: An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs.

CoRR, March, 2026

Send Less, Perceive More: Masked Quantized Point Cloud Communication for Loss-Tolerant Collaborative Perception.

CoRR, February, 2026

WenetSpeech-Wu: Datasets, Benchmarks, and Models for a Unified Chinese Wu Dialect Speech Processing Ecosystem.

CoRR, January, 2026

The ICASSP 2026 HumDial Challenge: Benchmarking Human-like Spoken Dialogue Systems in the LLM Era.

CoRR, January, 2026

Crystal Generation using the Fully Differentiable Pipeline and Latent Space Optimization.

With Osman Goni Ridwan and others.

CoRR, January, 2026

Walk Before You Dance: High-fidelity and Editable Dance Synthesis via Generative Masked Motion Prior.

With Foram Niravbhai Shah, Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, and others.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Lifelong Domain Adaptive 3D Human Pose Estimation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Hearing More with Less: Multi-Modal Retrieval-and-Selection Augmented Conversational LLM-Based ASR.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

WenetSpeech-Yue: A Large-Scale Cantonese Speech Corpus with Multi-dimensional Annotation.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

MonoMSK: Monocular 3D Musculoskeletal Dynamics Estimation.

With Farnoosh Koleini and others.

CoRR, November, 2025

Easy Turn: Integrating Acoustic and Linguistic Modalities for Robust Turn-Taking in Full-Duplex Spoken Dialogue Systems.

CoRR, September, 2025

WenetSpeech-Chuan: A Large-Scale Sichuanese Corpus with Rich Annotation for Dialectal Speech Processing.

CoRR, September, 2025

OSUM-EChat: Enhancing End-to-End Empathetic Spoken Chatbot via Understanding-Driven Spoken Dialogue.

CoRR, August, 2025

The TEA-ASLP System for Multilingual Conversational Speech Recognition and Speech Diarization in MLC-SLM 2025 Challenge.

CoRR, July, 2025

mmHand: Toward Pixel-Level-Accuracy Hand Localization Using a Single Commodity mmWave Device.

IEEE Internet Things J., June, 2025

AI-Assisted Rapid Crystal Structure Generation Towards a Target Local Environment.

With Osman Goni Ridwan, Monish Soundar Raj, and others.

CoRR, June, 2025

DanceMosaic: High-Fidelity Dance Generation with Multimodal Editability.

With Foram Niravbhai Shah, Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, and others.

CoRR, April, 2025

BioPose: Biomechanically-Accurate 3D Pose Estimation from Monocular Videos.

With Farnoosh Koleini, Muhammad Usama Saleem, and others.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

RAM-Hand: Robust Acoustic Multi-Hand Pose Reconstruction Using a Microphone Array.

Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems, 2025

Argus: Multi-View Egocentric Human Mesh Reconstruction Based on Stripped-Down Wearable mmWave Add-on.

Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems, 2025

Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Physical Backdoor Attacks against mmWave-based Human Activity Recognition.

Proceedings of the 45th IEEE International Conference on Distributed Computing Systems, 2025

Frequency-Semantic Enhanced Variational Autoencoder for Zero-Shot Skeleton-Based Action Recognition.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MaskHand: Generative Masked Modeling for Robust Hand Mesh Reconstruction in the Wild.

With Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Mayur Jagdishbhai Patel, and others.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

MaskControl: Spatio-Temporal Control for Masked Motion Synthesis.

With Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Korrawe Karunratanakul, Sergey Tulyakov, and others.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

mmCooper: A Multi-Agent Multi-Stage Communication-Efficient and Collaboration-Robust Cooperative Perception Framework.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

FreqPure: A High-Frequency Preservation Diffusion-Based Purification Method for Protective Perturbation Removal.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

GenHMR: Generative Human Mesh Recovery.

With Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, and others.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Prompt Learning for Multimodal Intent Recognition with Modal Alignment Perception.

Cogn. Comput., November, 2024

Towards Smartphone-based 3D Hand Pose Reconstruction Using Acoustic Signals.

ACM Trans. Sens. Networks, September, 2024

MMHMR: Generative Masked Modeling for Hand Mesh Recovery.

With Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Mayur Jagdishbhai Patel, and others.

CoRR, 2024

ControlMM: Controllable Masked Motion Generation.

With Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Korrawe Karunratanakul, Sergey Tulyakov, and others.

CoRR, 2024

Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text.

CoRR, 2024

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

mmCLIP: Boosting mmWave-based Zero-shot HAR via Signal-Text Alignment.

Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 2024

Malicious Attacks against Multi-Sensor Fusion in Autonomous Driving.

Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, 2024

E-Chat: Emotion-Sensitive Spoken Dialogue System with Large Language Models.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

Breakthrough from Nuance and Inconsistency: Enhancing Multimodal Sarcasm Detection with Context-Aware Self-Attention Fusion and Word Weight Calculation.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Towards Robust mmWave-based Human Activity Recognition using Large Simulated Dataset for Model Pretraining.

Proceedings of the IEEE International Conference on Big Data, 2024

2023

Towards Generalized mmWave-based Human Pose Estimation through Signal Augmentation.

Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, 2023

Macular: A Multi-Task Adversarial Framework for Cross-Lingual Natural Language Understanding.

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

TileMask: A Passive-Reflection-based Attack against mmWave Radar Object Detection in Autonomous Driving.

Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023

BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

M⁴esh: mmWave-Based 3D Human Mesh Construction for Multiple Subjects.

Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems, 2022

Fusing Global and Local Features for Generalized AI-Synthesized Image Detection.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

2021

mmMesh: towards 3D real-time dynamic human mesh construction using millimeter-wave.

Proceedings of the MobiSys '21: The 19th Annual International Conference on Mobile Systems, Applications, and Services, Virtual Event, Wisconsin, USA, 24 June, 2021

2020

DeepMV: Multi-View Deep Learning for Device-Free Human Activity Recognition.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2020

Towards 3D human pose construction using wifi.

With Srinivasan Murali and others.

Proceedings of the MobiCom '20: The 26th Annual International Conference on Mobile Computing and Networking, 2020

2019

DeepFusion: A Deep Learning Framework for the Fusion of Heterogeneous Sensory Data.

Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2019

On the Estimation of Treatment Effect with Text Covariates.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Deep Metric Learning: The Generalization Analysis and an Adaptive Algorithm.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018

Towards Environment Independent Device Free Human Activity Recognition.

With Dimitrios Koutsonikolas and others.

Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, 2018

A novel channel-aware attention framework for multi-channel EEG seizure detection via multi-view deep learning.

Proceedings of the 2018 IEEE EMBS International Conference on Biomedical & Health Informatics, 2018

2016

Risk Factor Analysis Based on Deep Learning Models.

Proceedings of the 7th ACM International Conference on Bioinformatics, 2016
