We stand with Ukraine

Yuxian Gu

Orcid: 0000-0002-4607-7025

According to our database¹, Yuxian Gu authored at least 30 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

LLM-in-Sandbox Elicits General Agentic Intelligence.

CoRR, January, 2026

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter.

Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025

Trust-Region Adaptive Policy Optimization.

CoRR, December, 2025

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter.

CoRR, November, 2025

Data-Efficient RLVR via Off-Policy Influence Guidance.

CoRR, October, 2025

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search.

CoRR, August, 2025

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models.

Trans. Mach. Learn. Res., 2025

MiniPLM: Knowledge Distillation for Pre-training Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Data Selection via Optimal Control for Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

NVILA: Efficient Frontier Visual Language Models.

Pavlo Molchanov

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

NVILA: Efficient Frontier Visual Language Models.

Pavlo Molchanov

CoRR, 2024

Direct Preference Knowledge Distillation for Large Language Models.

CoRR, 2024

Towards Optimal Learning of Language Models.

CoRR, 2024

MiniLLM: Knowledge Distillation of Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Instruction Pre-Training: Language Models are Supervised Multitask Learners.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023

EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training.

Mach. Intell. Res., April, 2023

Knowledge Distillation of Large Language Models.

CoRR, 2023

Pre-Training to Learn in Context.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Structured Prompting: Scaling In-Context Learning to 1,000 Examples.

CoRR, 2022

Many-Class Text Classification with Matching.

CoRR, 2022

Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

PPT: Pre-trained Prompt Tuning for Few-shot Learning.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark.

Xuancheng Huang

CoRR, 2021

EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training.

CoRR, 2021

Pre-Trained Models: Past, Present and Future.

CoRR, 2021

CPM: A large-scale generative Chinese Pre-trained language model.

AI Open, 2021

CPM-2: Large-scale cost-effective pre-trained language models.

AI Open, 2021

Pre-trained models: Past, present and future.

AI Open, 2021

2020

Train No Evil: Selective Masking for Task-Guided Pre-Training.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019

Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
