Xiaozhe Ren

Orcid: 0000-0002-0432-5510

According to our database, Xiaozhe Ren authored at least 30 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

An Efficient Edge-Cloud Collaboration System With Foundational Models for Open-Set IoT Applications.

IEEE Trans. Mob. Comput., July, 2026

BitDP: Ultra-low-bit Communication for Data Parallelism in LLM Training.

Xiaozhe Ren

Qiong Luo

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook.

ACM Comput. Surv., November, 2025

Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning.

CoRR, May, 2025

Self-Adjust Softmax.

CoRR, February, 2025

TaskSense: A Translation-like Approach for Tasking Heterogeneous Sensor Systems with LLMs.

Proceedings of the 23rd ACM Conference on Embedded Networked Sensor Systems, 2025

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Self-Adjust Softmax.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Scaling Law for Language Models Training Considering Batch Size.

CoRR, 2024

CAPE: Context-Adaptive Positional Encoding for Length Extrapolation.

CoRR, 2024

Poster Abstract: Tasking Heterogeneous Sensor Systems with LLMs.

Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 2024

DAPE: Data-Adaptive Positional Encoding for Length Extrapolation.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

ScheMoE: An Extensible Mixture-of-Experts Distributed Training System with Tasks Scheduling.

Proceedings of the Nineteenth European Conference on Computer Systems, 2024

PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

A Survey of Reasoning with Foundation Models.

CoRR, 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing.

CoRR, 2023

EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge.

Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems, 2023

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Study on Transformer Configuration and Training Objective.

Proceedings of the International Conference on Machine Learning, 2023

One Student Knows All Experts Know: From Sparse to Dense.

Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

CAME: Confidence-guided Adaptive Memory Efficient Optimization.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Deeper vs Wider: A Revisit of Transformer Configuration.

CoRR, 2022

AutoBERT-Zero: Evolving BERT Backbone from Scratch.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Large-Scale Deep Learning Optimizations: A Comprehensive Survey.

CoRR, 2021

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models.

CoRR, 2021

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation.

CoRR, 2021

SparseBERT: Rethinking the Importance Analysis in Self-attention.

Proceedings of the 38th International Conference on Machine Learning, 2021

EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2019

NEZHA: Neural Contextualized Representation for Chinese Language Understanding.

CoRR, 2019
