Ruofan Hu

Orcid: 0009-0005-1723-6778

Affiliations:
  • Zhejiang University, Hangzhou, China


According to our database, Ruofan Hu authored at least 13 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
DocRetriever: A Plug-and-Play Framework for Multimodal Document Retrieval with Comprehensive Benchmark.
CoRR, May, 2026

SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation.
CoRR, March, 2026

HIVE: A hypergraph-based game-theoretic interactive value decomposition engine for multi-lateral agents collaboration.
Neural Networks, 2026

2025
Generative Reasoning Recommendation via LLMs.
CoRR, October, 2025

OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios.
CoRR, January, 2025

Multimodal Conditional Retrieval with High Controllability.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.2, 2025

MelRe: Vision-Based Mel-Spectrogram Restoration.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Vela: Scalable Embeddings with Voice Large Language Models for Multimodal Retrieval.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

GTA: Towards Generative Text-To-Audio Retrieval via Multi-Scale Tokenizer.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

VoiceTuner: Self-Supervised Pre-training and Efficient Fine-tuning For Voice Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

InstructSpeech: Following Speech Editing Instructions via Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

