Hongbin Zhou

Orcid: 0009-0003-7299-1296

According to our database, Hongbin Zhou authored at least 50 papers between 2003 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
SPOT: Scalable 3D Pre-Training via Occupancy Prediction for Learning Transferable 3D Representations.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2025

S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation.
CoRR, June, 2025

GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling.
CoRR, May, 2025

TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving.
CoRR, April, 2025

Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech.
CoRR, February, 2025

ChartX and ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.
IEEE Trans. Image Process., 2025

ClapFM-EVC: High-Fidelity and Flexible Emotional Voice Conversion with Dual Control from Natural Language and Speech.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LaTeXNet: A Specialized Model for Converting Visual Tables and Equations to LaTeX Code.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Chimera: Improving Generalist Model with Domain-Specific Experts.
CoRR, 2024

StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching.
CoRR, 2024

CTEFM-VC: Zero-Shot Voice Conversion Based on Content-Aware Timbre Ensemble Modeling and Flow Matching.
CoRR, 2024

Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization.
CoRR, 2024

Takin-VC: Zero-shot Voice Conversion via Jointly Hybrid Content and Memory-Augmented Context-Aware Timbre Modeling.
CoRR, 2024

Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models.
CoRR, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.
CoRR, 2024

Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy.
CoRR, 2024

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.
CoRR, 2024

OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving.
CoRR, 2024

ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

VeloVox: A Low-Cost and Accurate 4D Object Detector with Single-Frame Point Cloud of Livox LiDAR.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts.
Proceedings of the IEEE International Conference on Acoustics, 2024

ASR Model Adaptation with Domain Prompt Tuning.
Proceedings of the International Conference on Asian Language Processing, 2024

2023
Denoising method for terahertz signal using RBF neural network with adaptive projection learning algorithm.
Wirel. Networks, February, 2023

Classification of Liquid Ingress in GFRP Honeycomb Based on One-Dimension Sequential Model Using THz-TDS.
Sensors, February, 2023

Vec-Tok Speech: speech vectorization and tokenization for neural speech generation.
CoRR, 2023

DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Symbolization, Prompt, and Classification: A Framework for Implicit Speaker Identification in Novels.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Salt: Distinguishable Speaker Anonymization Through Latent Space Transformation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
MASNet: Improve Performance of Siamese Networks with Mutual-attention for Remote Sensing Change Detection Tasks.
CoRR, 2022

Improving Cross-Lingual Speech Synthesis with Triplet Training Scheme.
Proceedings of the IEEE International Conference on Acoustics, 2022

2020
Time Segmented Image Fusion Based Multi-Depth Defects Imaging Method in Composites With Pulsed Terahertz.
IEEE Access, 2020

Defect Depth Determination in Laser Infrared Thermography Based on LSTM-RNN.
IEEE Access, 2020

A Light-Weight Stereo Matching Network with Color Guidance Refinement.
Proceedings of the Cognitive Systems and Signal Processing - 5th International Conference, 2020

2016
Research on Spatial Information Network System Construction and Validation Technology.
Proceedings of the Space Information Networks - First International Conference, 2016

2015
Pruning redundant synthesis units based on static and delta unit appearance frequency.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Optimal dispatch of electric taxis and price making of charging stations using Stackelberg game.
Proceedings of the IECON 2015, 2015

2013
Optimization of ETSI DSR frontend software on a high-efficient audio DSP.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

2012
Fast automatic security protocol generation.
J. Comput. Secur., 2012

2006
A Framework for Establishing Decentralized Secure Coalitions.
Proceedings of the 19th IEEE Computer Security Foundations Workshop, 2006

2005
Authorisation Subterfuge by Delegation in Decentralised Networks.
Proceedings of the Security Protocols, 2005

A Logic for Analysing Subterfuge in Delegation Chains.
Proceedings of the Formal Aspects in Security and Trust, Third International Workshop, 2005

2004
A collaborative approach to autonomic security protocols.
Proceedings of the New Security Paradigms Workshop 2004, 2004

2003
Towards a Framework for Autonomic Security Protocols.
Proceedings of the Security Protocols, 2003

Fast automatic synthesis of security protocols using backward search.
Proceedings of the 2003 ACM workshop on Formal methods in security engineering, 2003

