We stand with Ukraine

Zihan Qiu

ORCID: 0009-0008-3991-0817

According to our database, Zihan Qiu authored at least 30 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

A survey on multilingual large language models: corpora, alignment, and bias.

Frontiers Comput. Sci., November, 2025

Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling.

Edoardo M. Ponti

CoRR, July, 2025

A Controllable Examination for Long-Context Language Models.

CoRR, June, 2025

Qwen3 Technical Report.

CoRR, May, 2025

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free.

CoRR, May, 2025

Neo-TKGC: Enhancing Temporal Knowledge Graph Completion with Integrated Node Weights and Future Information.

Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining, 2025

A Closer Look into Mixture-of-Experts in Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Layerwise Recurrent Router for Mixture-of-Experts.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Post-hoc Reward Calibration: A Case Study on Length Bias.

Edoardo M. Ponti

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Qwen2.5 Technical Report.

CoRR, 2024

Reconstructing Global Daily CO2 Emissions via Machine Learning.

Matthew W. Jones

Robbie M. Andrew

Robert B. Jackson

CoRR, 2024

GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory.

CoRR, 2024

A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias.

CoRR, 2024

HyperMoE: Paying Attention to Unselected Experts in Mixture of Experts via Dynamic Transfer.

CoRR, 2024

DSIFNet: Implicit feature network for nasal cavity and vestibule segmentation from 3D head CT.

Comput. Medical Imaging Graph., 2024

ValueCSV: Evaluating Core Socialist Values Understanding in Large Language Models.

Proceedings of the Natural Language Processing and Chinese Computing, 2024

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Unlocking Emergent Modularity in Large Language Models.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Empirical Study on Updating Key-Value Memories in Transformer Feed-forward Layers.

Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

Unlocking Continual Learning Abilities in Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Heterogenous Memory Augmented Neural Networks.

Shanghang Zhang

CoRR, 2023

Emergent Mixture-of-Experts: Can Dense Pre-trained Transformers Benefit from Emergent Modular Structures?

CoRR, 2023

2022

Supported Policy Optimization for Offline Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021

Detection of Advertising Users Based on K-SMOTE and Ensemble Learning.

Proceedings of the Human Centered Computing - 7th International Conference, 2021

Academic Article Classification Algorithm Based on Pre-trained Model and Keyword Extraction.

Proceedings of the Computer Supported Cooperative Work and Social Computing, 2021

ResConvE: Deeper Convolution-Based Knowledge Graph Embeddings.

Proceedings of the Computer Supported Cooperative Work and Social Computing, 2021

A University Portrait System Incorporating Academic Social Network.

Proceedings of the Computer Supported Cooperative Work and Social Computing, 2021
