Yuxian Gu

ORCID: 0000-0002-4607-7025

According to our database, Yuxian Gu authored at least 30 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
LLM-in-Sandbox Elicits General Agentic Intelligence.
CoRR, January, 2026

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
Trust-Region Adaptive Policy Optimization.
CoRR, December, 2025

Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter.
CoRR, November, 2025

Data-Efficient RLVR via Off-Policy Influence Guidance.
CoRR, October, 2025

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search.
CoRR, August, 2025

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models.
Trans. Mach. Learn. Res., 2025

MiniPLM: Knowledge Distillation for Pre-training Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Data Selection via Optimal Control for Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

NVILA: Efficient Frontier Visual Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
NVILA: Efficient Frontier Visual Language Models.
CoRR, 2024

Direct Preference Knowledge Distillation for Large Language Models.
CoRR, 2024

Towards Optimal Learning of Language Models.
CoRR, 2024

MiniLLM: Knowledge Distillation of Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Instruction Pre-Training: Language Models are Supervised Multitask Learners.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
EVA2.0: Investigating Open-domain Chinese Dialogue Systems with Large-scale Pre-training.
Mach. Intell. Res., April, 2023

Knowledge Distillation of Large Language Models.
CoRR, 2023

Pre-Training to Learn in Context.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Structured Prompting: Scaling In-Context Learning to 1,000 Examples.
CoRR, 2022

Many-Class Text Classification with Matching.
CoRR, 2022

Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

PPT: Pre-trained Prompt Tuning for Few-shot Learning.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark.
CoRR, 2021

EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training.
CoRR, 2021

Pre-Trained Models: Past, Present and Future.
CoRR, 2021

CPM: A large-scale generative Chinese Pre-trained language model.
AI Open, 2021

CPM-2: Large-scale cost-effective pre-trained language models.
AI Open, 2021

Pre-trained models: Past, present and future.
AI Open, 2021

2020
Train No Evil: Selective Masking for Task-Guided Pre-Training.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019
Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

