Jiaming Zhou

This is a disambiguation page: it contains multiple papers by persons with the same or a similar name.

Bibliography

2026

One model, dual tasks: a novel distributionally adaptive learning framework for ECG classification and generation addressing intra- and inter-patient variability.

Expert Syst. Appl., 2026

2025

Zero- and One-Shot Data Augmentation for Sentence-Level Dysarthric Speech Recognition in Constrained Scenarios.

CoRR, October, 2025

SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation.

CoRR, October, 2025

AudioEval: Automatic Dual-Perspective and Multi-Dimensional Evaluation of Text-to-Audio-Generation.

CoRR, October, 2025

WildElder: A Chinese Elderly Speech Dataset from the Wild with Fine-Grained Manual Annotations.

CoRR, October, 2025

EgoTraj-Bench: Towards Robust Trajectory Prediction Under Ego-view Noisy Observations.

CoRR, October, 2025

From Watch to Imagine: Steering Long-horizon Manipulation via Human Demonstration and Future Envisionment.

CoRR, September, 2025

MECap-R1: Emotion-aware Policy with Reinforcement Learning for Multimodal Emotion Captioning.

CoRR, September, 2025

Mind the Gap: Data Rewriting for Stable Off-Policy Supervised Fine-Tuning.

CoRR, September, 2025

GLAD: Global-Local Aware Dynamic Mixture-of-Experts for Multi-Talker ASR.

CoRR, September, 2025

TTA-Bench: A Comprehensive Benchmark for Evaluating Text-to-Audio Models.

CoRR, September, 2025

RAMPGrasp: Retentive Attention-Based Multiscale Perception Grasp Detection Network.

IEEE Trans. Circuits Syst. Video Technol., August, 2025

RealTalk-CN: A Realistic Chinese Speech-Text Dialogue Benchmark With Cross-Modal Interaction Analysis.

CoRR, August, 2025

End-to-End Humanoid Robot Safe and Comfortable Locomotion Policy.

CoRR, August, 2025

DIFFA: Large Language Diffusion Models Can Listen and Understand.

CoRR, July, 2025

Omni-Thinker: Scaling Cross-Domain Generalization in LLMs via Multi-Task RL with Hybrid Rewards.

Mohammad Ali Alomrani

CoRR, July, 2025

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling.

CoRR, June, 2025

A State Space Model for Multiobject Full 3-D Information Estimation From RGB-D Images.

IEEE Trans. Cybern., May, 2025

EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations.

CoRR, May, 2025

Omni-Perception: Omnidirectional Collision Avoidance for Legged Locomotion in Dynamic Environments.

CoRR, May, 2025

Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization.

CoRR, May, 2025

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization.

CoRR, May, 2025

GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation.

CoRR, May, 2025

SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors.

CoRR, March, 2025

Human-Centric Transformer for Domain Adaptive Action Recognition.

Kun-Yu Lin

Jiaming Zhou

Wei-Shi Zheng

IEEE Trans. Pattern Anal. Mach. Intell., February, 2025

CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition.

CoRR, February, 2025

FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching.

CoRR, February, 2025

MusicEval: A Generative Music Corpus with Expert Ratings for Automatic Text-to-Music Evaluation.

CoRR, January, 2025

StreamMel: Real-Time Zero-Shot Text-to-Speech Via Interleaved Continuous Autoregressive Modeling.

IEEE Signal Process. Lett., 2025

A Self-Training Approach for Whisper to Enhance Long Dysarthric Speech Recognition.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Chinese-LiPS: A Chinese Audio-Visual Speech Recognition Dataset with Lip-Reading and Presentation Slides.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Enhancing Multimodal Emotion Recognition through Multi-Granularity Cross-Modal Alignment.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Emotion-Preserving Prosody Anonymization Network for Voice Privacy Protection.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

One-Iteration-per-Update (OIpU) Algorithm Applied to MMPaC (Minimum Motion Planning and Control) of Planar Four-Link Robotic Arm Aided with Zhang Equivalency.

Jiaming Zhou

Zhiwen Yuan

Yunong Zhang

Proceedings of the 28th International Conference on Computer Supported Cooperative Work in Design, 2025

ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

PoseDiffusion: A Coarse-to-Fine Framework for Unseen Object 6-DoF Pose Estimation.

IEEE Trans. Ind. Informatics, September, 2024

A Depth Adaptive Feature Extraction and Dense Prediction Network for 6-D Pose Estimation in Robotic Grasping.

IEEE Trans. Ind. Informatics, February, 2024

TwinFormer: Fine-to-Coarse Temporal Modeling for Long-Term Action Recognition.

IEEE Trans. Multim., 2024

Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models.

Ge Zhang

Mohammad Ali Alomrani

CoRR, 2024

GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping.

CoRR, 2024

ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5.

CoRR, 2024

Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data.

CoRR, 2024

Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs.

CoRR, 2024

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition.

CoRR, 2024

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition.

CoRR, 2024

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

PB-LRDWWS System For the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Uncertainty-Aware Mean Opinion Score Prediction.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

CKGConv: General Graph Convolution with Continuous Kernels.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels.

Proceedings of the IEEE International Conference on Acoustics, 2024

CIF-T: A Novel CIF-Based Transducer Architecture for Automatic Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2024

Uniform-Distribution (UD) Based Time Intervals of GRC (Global Reserve Currency) Transition Year Predicted Narrowly as [2024, 2040] and Generally [2k00, 2k50].

Yunong Zhang

Jiaming Zhou

Proceedings of the 27th International Conference on Computer Supported Cooperative Work in Design, 2024

Contrastive Imitation Learning for Language-guided Multi-Task Robotic Manipulation.

Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024

2023

Toward TR-PCB Bubble Detection via an Efficient Attention Segmentation Network and Dynamic Threshold.

IEEE Trans. Instrum. Meas., 2023

Exploring ChatGPT's Potential for Consultation, Recommendations and Report Diagnosis: Gastric Cancer and Gastroscopy Reports' Case.

Rubén González Crespo

Int. J. Interact. Multim. Artif. Intell., 2023

GeoDeformer: Geometric Deformable Transformer for Action Recognition.

CoRR, 2023

AdaFocus: Towards End-to-end Weakly Supervised Learning for Long-Video Action Understanding.

CoRR, 2023

PostRainBench: A comprehensive benchmark and a new model for precipitation forecasting.

CoRR, 2023

Improved YOLOv7 Based on Transformer for Object Detection in UAV-Captured Images.

Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2023

Diversifying Spatial-Temporal Perception for Video Domain Generalization.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MADI: Inter-Domain Matching and Intra-Domain Discrimination for Cross-Domain Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Unsupervised Deep Homography Estimation based on Transformer.

Proceedings of the International Conference on Advanced Robotics and Mechatronics, 2023

2022

Execution and perception of upper limb exoskeleton for stroke patients: a systematic review.

Intell. Serv. Robotics, 2022

Adversarial Partial Domain Adaptation by Cycle Inconsistency.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Analysis and Design of Drivetrain Control for the AEV With Network-Induced Compounding-Construction Loop Delays.

IEEE Trans. Veh. Technol., 2021

Video Mosaic of Arbitrary Viewing Direction Oriented on DAS.

Proceedings of the IEEE International Conference on Signal Processing, 2021

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Novelty Detection-Based Automated Anomaly Identification via Optimized Deep Generative Model.

Proceedings of the Big Data - 9th CCF Conference, 2021

2020

Toward Flotation Process Operation-State Identification via Statistical Modeling of Biologically Inspired Gabor Filtering Responses.

IEEE Trans. Cybern., 2020

Research on the Damping Effect Mechanism and Optimization of Super-High-Speed Electric Air Compressors for Fuel Cell Vehicles Under the Stiffness Softening Effect.

IEEE Access, 2020

2019

FLONet: Fewer Labeling Cost Active Learning for Deep Neural Network.

Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

LADet: A Light-weight and Adaptive Network for Multi-scale Object Detection.

Proceedings of The 11th Asian Conference on Machine Learning, 2019
