Junbo Zhang

This is a disambiguation page; it contains multiple papers by people with the same or a similar name.

Bibliography

2025
MiDashengLM: Efficient Audio Understanding with General Audio Captions.
CoRR, August, 2025

MECAT: A Multi-Experts Constructed Benchmark for Fine-Grained Audio Understanding Tasks.
CoRR, July, 2025

Unified Vision-Language-Action Model.
CoRR, June, 2025

Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders.
CoRR, June, 2025

GLAP: General contrastive audio-text pretraining across domains and languages.
CoRR, June, 2025

X-ARES: A Comprehensive Framework for Assessing Audio Encoder Performance.
CoRR, May, 2025

Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering.
CoRR, March, 2025

The ICME 2025 Audio Encoder Capability Challenge.
CoRR, January, 2025

G2MBCF: Enhanced Named Entity Recognition for sensitive entities identification.
Data Knowl. Eng., 2025

RoLD: Robot Latent Diffusion for Multi-task Policy Modeling.
Proceedings of the MultiMedia Modeling, 2025

AirRadar: Inferring Nationwide Air Quality in China with Deep Neural Networks.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Integrating a Pipette Into a Robot Manipulator With Uncalibrated Vision and TCP for Liquid Handling.
IEEE Trans. Autom. Sci. Eng., October, 2024

BAFormer: A Novel Boundary-Aware Compensation UNet-like Transformer for High-Resolution Cropland Extraction.
Remote. Sens., July, 2024

Video Text Detection With Robust Feature Representation.
IEEE Trans. Circuits Syst. Video Technol., June, 2024

Underwater image dehazing using a novel color channel based dual transmission map estimation.
Multim. Tools Appl., 2024

The impact of emotional expression by artificial intelligence recommendation chatbots on perceived humanness and social interactivity.
Decis. Support Syst., 2024

Emotional expressions of care and concern by customer service chatbots: Improved customer attitudes despite perceived inauthenticity.
Decis. Support Syst., 2024

Analyzing the Scalability of Bi-static Backscatter Networks for Large Scale Applications.
CoRR, 2024

Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.
CoRR, 2024

Scaling up masked audio encoder learning for general audio classification.
CoRR, 2024

Multi-task Manipulation Policy Modeling with Visuomotor Latent Diffusion.
CoRR, 2024

Sequential Transformer for End-to-End Video Text Detection.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

MG-VLN: Benchmarking Multi-Goal and Long-Horizon Vision-Language Navigation with Language Enhanced Memory Map.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

Bridging Language Gaps in Audio-Text Retrieval.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Streaming Audio Transformers for Online Audio Tagging.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Scaling up masked audio encoder learning for general audio classification.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

VQGG: Generating Adaptive Graphs for Traffic Forecasting via a Vector-Quantized Graph Generator.
Proceedings of the International Joint Conference on Neural Networks, 2024

Statistical Studies of Fading in Underwater Wireless Optical Channels in the Presence of Bubbles.
Proceedings of the IEEE International Conference on Signal Processing, 2024

CED: Consistent Ensemble Distillation for Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2024

Long Video Scoring Method Fusing High-Precision Pose and Spatio-Temporal Attention Modules.
Proceedings of the Web and Big Data - 8th International Joint Conference, 2024

2023
Reinvigorating sustainability in Internet of Things marketing: Framework for multi-round real-time bidding with game machine learning.
Internet Things, December, 2023

Teaching quality monitoring and evaluation of physical education teaching in ordinary college based on edge computing optimization model.
J. Supercomput., October, 2023

Spatio-Temporal Contrastive Self-Supervised Learning for POI-level Crowd Flow Inference.
CoRR, 2023

Understanding temporally weakly supervised training: A case study for keyword spotting.
CoRR, 2023

Streaming Audio Transformers for Online Audio Tagging.
CoRR, 2023

Motor Imagery EEG Recognition Based on an Improved Convolutional Neural Network with Parallel Gate Recurrent Unit.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improved RAKE receiver for underwater acoustic spread spectrum communication based on m-SAMP channel estimation.
Proceedings of the IEEE International Conference on Signal Processing, 2023

A Flipped Classroom Teaching Design Based on Constructivist Theory.
Proceedings of the IEEE International Conference on Signal Processing, 2023

Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Proceedings of the Eleventh International Conference on Learning Representations, 2023

CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Av-Sepformer: Cross-Attention Sepformer for Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2023

Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Language-Assisted 3D Feature Learning for Semantic Scene Understanding.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Research on control method of upper limb exoskeleton based on mixed perception model.
Robotica, 2022

Real-Time Ray-Traced Soft Shadows of Environmental Lighting by Conical Ray Culling.
Proc. ACM Comput. Graph. Interact. Tech., 2022

A novel compression framework of the dense point-cloud model for cultural heritage artifacts.
Multim. Tools Appl., 2022

Integrating a Manual Pipette into a Collaborative Robot Manipulator for Flexible Liquid Dispensing.
CoRR, 2022

An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

UniKW-AT: Unified Keyword Spotting and Audio Tagging.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging.
Proceedings of the IEEE International Conference on Acoustics, 2022

Towards Relational Multi-Agent Reinforcement Learning via Inductive Logic Programming.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2022, 2022

Contrastive Deep Supervision.
Proceedings of the Computer Vision - ECCV 2022, 2022

Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio.
CoRR, 2021

speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Data Augmentation For Children's Speech Recognition - The "Ethiopian" System For The SLT 2021 Children Speech Recognition Challenge.
CoRR, 2020

Estimation of Power System Inertia Under Normal Operating Conditions.
Proceedings of the IEEE Power & Energy Society Innovative Smart Grid Technologies Conference, 2020

Power System Sensitivity Matrix Estimation by Multivariable Least Squares Considering Mitigating Data Saturation.
Proceedings of the 46th Annual Conference of the IEEE Industrial Electronics Society, 2020

2019
Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning.
CoRR, 2019

A Predictor-Corrector Method for Power System Variable Step Numerical Simulation.
CoRR, 2019

2018
End-to-end Models with auditory attention in Multi-channel Keyword Spotting.
CoRR, 2018

Sequence-to-sequence Models for Small-Footprint Keyword Spotting.
CoRR, 2018

Empirical Evaluation of Speaker Adaptation on DNN Based Acoustic Model.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigating Generative Adversarial Networks Based Speech Dereverberation for Robust Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Attention-based End-to-End Models for Small-Footprint Keyword Spotting.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Time Series Trend Detection and Forecasting Using Complex Network Topology Analysis.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Attention-Based End-to-End Speech Recognition on Voice Search.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Online Identification of Power System Equivalent Inertia Constant.
IEEE Trans. Ind. Electron., 2017

Attention-Based End-to-End Speech Recognition in Mandarin.
CoRR, 2017

2016
Microperturbation Method for Power System Online Model Identification.
IEEE Trans. Ind. Informatics, 2016

2014
Supervised deep learning with auxiliary networks.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

2013
A Novel Discriminative Method for Pronunciation Quality Assessment.
IEICE Trans. Inf. Syst., 2013

A novel discriminative method for pronunciation quality assessment.
Proceedings of the IEEE International Conference on Acoustics, 2013

A Computer-Assist Algorithm to Detect Repetitive Stuttering Automatically.
Proceedings of the 2013 International Conference on Asian Language Processing, 2013

2012
A Forced Alignment Based Approach for English Passage Reading Assessment.
IEICE Trans. Inf. Syst., 2012

Automatic Scoring on English Passage Reading Quality.
Proceedings of the Advances in Swarm Intelligence - Third International Conference, 2012

