Shihao Chen

According to our database, Shihao Chen authored at least 31 papers between 2009 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
HAFM: Hierarchical Autoregressive Foundation Model for Music Accompaniment Generation.
CoRR, April, 2026

SaSaSaSa2VA: 2nd Place of the 5th PVUW MeViS-Text Track.
CoRR, March, 2026

3D-DCASphereNet: 3D dynamic convolutional attention network with spherical representation for high heterogeneity in lung nodule detection.
Expert Syst. Appl., 2026

2025
LSVOS 2025 Challenge Report: Recent Advances in Complex Video Object Segmentation.
CoRR, October, 2025

The 1st Solution for 7th LSVOS RVOS Track: SaSaSa2VA.
CoRR, September, 2025

VectorLLM: Human-like Extraction of Structured Building Contours via Multimodal LLMs.
CoRR, July, 2025

DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World.
CoRR, June, 2025

Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding.
CoRR, April, 2025

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Sinba: Singing-To-Accompaniment Generation With Pitch Guidance Via Mamba-Based Language Model.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation.
CoRR, 2024

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
CoRR, 2024

Intelligent Energy-Efficient and Fair Resource Scheduling for UAV-Assisted Space-Air-Ground Integrated Networks Under Jamming Attacks.
Proceedings of the 99th IEEE Vehicular Technology Conference, 2024

LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

A Study of Multichannel Spatiotemporal Features and Knowledge Distillation on Robust Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Adversarial Speech for Voice Privacy Protection from Personalized Speech Generation.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

STPose: 6D object pose estimation network based on sparse attention and cross-layer connection.
Proceedings of the 35th British Machine Vision Conference, 2024

2023
The USTC's Dialect Speech Translation System for IWSLT 2023.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

2022
Design for Operation in Two Frequency Bands by Division of the Coupled Region in a Waveguide 2-Plane Coupler.
IEICE Trans. Electron., December, 2022

Nanoporous Graphene Oxide-Based Quartz Crystal Microbalance Gas Sensor with Dual-Signal Responses for Trimethylamine Detection.
Sensors, 2022

2020
A Novel Non-Stationary Cylinder Model for Massive MIMO Air-To-Ground (A2G) Channels.
Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (WCSP), 2020

NB-IoT Estrus Detection System of Dairy Cows Based on LSTM Networks.
Proceedings of the 31st IEEE Annual International Symposium on Personal, 2020

2018
Reproducible Interference-Aware Mobile Testing.
Proceedings of the 2018 IEEE International Conference on Software Maintenance and Evolution, 2018

2009
Interactive Inner Structures Visualization in 3D Datasets.
Proceedings of the Fifth International Conference on Image and Graphics, 2009

Rapid Texture-based Volume Rendering.
Proceedings of the 2009 International Conference on Environmental Science and Information Application Technology, 2009

Interactive GPU-Based Volume Rendering for Medical Image.
Proceedings of the 2nd International Conference on BioMedical Engineering and Informatics, 2009

