Yao Hu
ORCID: 0009-0006-1274-7111
Affiliations:
- Xiaohongshu Inc., Beijing, China
- Zhejiang University of Technology, Hangzhou, China (2021 - 2024)
- Zhejiang University, Hangzhou, China (PhD 2015)
According to our database, Yao Hu authored at least 211 papers between 2012 and 2026.
Bibliography
2026
AgentCVR: Active Multi-Agent Cross-Video Reasoning via Script-Simulated Reinforcement Learning.
CoRR, May, 2026
Share More, Search Less: Collaborative Parallel Thinking for Efficient Test-Time Scaling.
CoRR, May, 2026
Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation.
CoRR, May, 2026
CoRR, May, 2026
VideoSeeker: Incentivizing Instance-level Video Understanding via Native Agentic Tool Invocation.
CoRR, May, 2026
CCD-Level and Load-Aware Thread Orchestration for In-Memory Vector ANNS on Multi-Core CPUs.
CoRR, May, 2026
HTPO: Towards Exploration-Exploitation Balanced Policy Optimization via Hierarchical Token-level Objective Control.
CoRR, May, 2026
CoRR, May, 2026
MUSE: Resolving Manifold Misalignment in Visual Tokenization via Topological Orthogonality.
CoRR, May, 2026
CoRR, May, 2026
From a Social Cognitive Perspective: Context-Aware Visual Social Relationship Recognition.
IEEE Trans. Neural Networks Learn. Syst., April, 2026
Edit Where You Mean: Region-Aware Adapter Injection for Mask-Free Local Image Editing.
CoRR, April, 2026
EditCaption: Human-Aligned Instruction Synthesis for Image Editing via Supervised Fine-Tuning and Direct Preference Optimization.
CoRR, April, 2026
SPARD: Self-Paced Curriculum for RL Alignment via Integrating Reward Dynamics and Data Utility.
CoRR, April, 2026
CoRR, March, 2026
FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System.
CoRR, March, 2026
CoRR, March, 2026
LASER: An Efficient Target-Aware Segmented Attention Framework for End-to-End Long Sequence Modeling.
CoRR, February, 2026
QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search.
CoRR, February, 2026
CoRR, February, 2026
CoRR, February, 2026
IVC-Prune: Revealing the Implicit Visual Coordinates in LVLMs for Vision Token Pruning.
CoRR, February, 2026
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models.
CoRR, February, 2026
Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training.
CoRR, February, 2026
Learning More from Less: Unlocking Internal Representations for Benchmark Compression.
CoRR, February, 2026
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models.
CoRR, January, 2026
CoRR, January, 2026
CoRR, January, 2026
Do Not Waste Your Rollouts: Recycling Search Experience for Efficient Test-Time Scaling.
CoRR, January, 2026
CoRR, January, 2026
EComStage: Stage-wise and Orientation-specific Benchmarking for Large Language Models in E-commerce.
CoRR, January, 2026
HyMiRec: A Hybrid Multi-interest Learning Framework for LLM-based Sequential Recommendation.
Proceedings of the ACM Web Conference 2026, 2026
Proceedings of the ACM Web Conference 2026, 2026
Proceedings of the ACM Web Conference 2026, 2026
Proceedings of the Abstracts of the 2026 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2026
Optimizing Generative Ranking Relevance via Reinforcement Learning in Xiaohongshu Search.
Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, 2026
SPARD: Self-Paced Curriculum for RL Alignment via Integrating Reward Dynamics and Data Utility.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
Anti-Length Shift: Dynamic Outlier Truncation for Training Efficient Reasoning Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services.
CoRR, November, 2025
CoRR, November, 2025
CoRR, October, 2025
CoRR, October, 2025
CoRR, October, 2025
RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios.
CoRR, September, 2025
CoRR, September, 2025
FireRedChat: A Pluggable, Full-Duplex Voice Interaction System with Cascaded and Semi-Cascaded Implementations.
CoRR, September, 2025
CoRR, September, 2025
Decomposed Reasoning with Reinforcement Learning for Relevance Assessment in UGC Platforms.
CoRR, August, 2025
CoRR, July, 2025
Flux-Sculptor: Text-Driven Rich-Attribute Portrait Editing through Decomposed Spatial Flow Control.
CoRR, July, 2025
CoRR, June, 2025
Plan Your Travel and Travel with Your Plan: Wide-Horizon Planning and Evaluation via LLM.
CoRR, June, 2025
MT³: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning.
CoRR, May, 2025
MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning.
CoRR, April, 2025
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users.
CoRR, April, 2025
Redefining Machine Translation on Social Network Services with Large Language Models.
CoRR, April, 2025
IEEE Trans. Neural Networks Learn. Syst., March, 2025
CoRR, March, 2025
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection.
CoRR, March, 2025
VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models.
CoRR, March, 2025
CoRR, March, 2025
CoRR, February, 2025
FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration.
CoRR, January, 2025
DynamicFace: High-Quality and Consistent Video Face Swapping using Composable 3D Facial Priors.
CoRR, January, 2025
Scenario-Aware Multimodal Chain-of-Thought Prompting for Rationales of Video Social Relations.
IEEE Trans. Circuits Syst. Video Technol., 2025
Scalable Overload-Aware Graph-Based Index Construction for 10-Billion-Scale Vector Similarity Search.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2025, 2025
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
Multi-Granularity Distribution Modeling for Video Watch Time Prediction via Exponential-Gaussian Mixture Network.
Proceedings of the Nineteenth ACM Conference on Recommender Systems, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
RAG-IGBench: Innovative Evaluation for RAG-based Interleaved Generation in Open-domain Question Answering.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Proceedings of the 33rd ACM International Conference on Multimedia, 2025
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025
Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025
SNS-Bench: Defining, Building, and Assessing Capabilities of Large Language Models in Social Networking Services.
Proceedings of the Forty-second International Conference on Machine Learning, 2025
UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
DynamicFace: High-Quality and Consistent Face Swapping for Image and Video Using Composable 3D Facial Priors.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
RealBench: A Chinese Multi-image Understanding Benchmark Close to Real-world Scenarios.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Think-Search-Patch: A Retrieval-Augmented Reasoning Framework for Repository-Level Code Repair.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the 31st International Conference on Computational Linguistics, 2025
ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MarkerGen.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
iPET: An Interactive Emotional Companion Dialogue System with LLM-Powered Virtual Pet World Simulation.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2025
Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
2024
Int. J. Comput. Vis., November, 2024
Int. J. Comput. Vis., November, 2024
TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model.
ACM Trans. Knowl. Discov. Data, August, 2024
PiClick: Picking the desired mask from multiple candidates in click-based interactive segmentation.
Neurocomputing, 2024
ScalingNote: Scaling up Retrievers with Large Language Models for Real-World Dense Retrieval.
CoRR, 2024
Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents.
CoRR, 2024
Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance.
CoRR, 2024
Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning.
CoRR, 2024
Mining Open Semantics from CLIP: A Relation Transition Perspective for Few-Shot Learning.
CoRR, 2024
From a Social Cognitive Perspective: Context-aware Visual Social Relationship Recognition.
CoRR, 2024
Agent Group Chat: An Interactive Group Chat Simulacra For Better Eliciting Collective Emergent Behavior.
CoRR, 2024
Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Instruction Embedding: Latent Representations of Instructions Towards Task Identification.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Knowledge-Enhanced Multi-perspective Incongruity Perception Network for Multimodal Sarcasm Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Image Processing, 2024
Proceedings of the IEEE International Conference on Data Mining, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Optimizing traffic efficiency via a reinforcement learning approach based on time allocation.
Int. J. Mach. Learn. Cybern., October, 2023
CoRR, 2023
CoRR, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
IEEE Trans. Image Process., 2022
Proceedings of the Web and Big Data - 6th International Joint Conference, 2022
2021
ACM Trans. Multim. Comput. Commun. Appl., 2021
Proceedings of the Pattern Recognition and Computer Vision - 4th Chinese Conference, 2021
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021
Linking the Characters: Video-oriented Social Graph Generation via Hierarchical-cumulative GCN.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge.
CoRR, 2020
Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Deep Time-Stream Framework for Click-through Rate Prediction by Tracking Interest Evolution.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
CoRR, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Proceedings of the 36th International Conference on Machine Learning, 2019
2016
ACM Trans. Multim. Comput. Commun. Appl., 2016
Online robust principal component analysis via truncated nuclear norm regularization.
Neurocomputing, 2016
Neurocomputing, 2016
2015
Neurocomputing, 2015
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015
2014
IEEE Trans. Cybern., 2014
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014
2013
IEEE Trans. Pattern Anal. Mach. Intell., 2013
Proceedings of the Intelligence Science and Big Data Engineering, 2013
A Unified Approximate Nearest Neighbor Search Scheme by Combining Data Structure and Hashing.
Proceedings of the IJCAI 2013, 2013
Proceedings of the IEEE International Conference on Computer Vision, 2013
2012
Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012