Yufan Xu

ORCID: 0000-0002-7787-6460

Affiliations:

University of Utah, School of Computing, Salt Lake City, UT, USA
Ohio State University, Columbus, OH, USA (2017 - 2019)

According to our database, Yufan Xu authored at least 18 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Exploiting Efficient Mapping and Pipelined Execution for Accelerating SpMV on Tensor Cores.

Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

ElasGNN: An Elastic Training Framework for Distributed GNN Training.

Proceedings of the 31st ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026

2025

RLHFSpec: Breaking the Efficiency Bottleneck in RLHF Training via Adaptive Drafting.

CoRR, December, 2025

LOw-cOst yet High-Performant Sparse Matrix-Matrix Multiplication on Arm SME Architectures.

Enrique S. Quintana-Orti

Zhongzhi Luan

Yi Liu

Depei Qian

CoRR, November, 2025

Towards Efficient LLM Inference via Collective and Adaptive Speculative Decoding.

Proceedings of the International Conference for High Performance Computing, 2025

Zero-Value Code Specialization via Profile-Guided Control Data Flow Analysis.

Proceedings of the International Conference for High Performance Computing, 2025

Accelerating Complex Stencil Computations with Adaptive Fusion Strategy.

Proceedings of the 39th ACM International Conference on Supercomputing, 2025

ESC: Effective Submanifold Convolution using Tensor Cores.

Proceedings of the 54th International Conference on Parallel Processing, 2025

OVERT: Orchestrating Vector-Scalar Execution for Efficient SpMV on Modern CPUs.

Proceedings of the 54th International Conference on Parallel Processing, 2025

2024

CoNST: Code Generator for Sparse Tensor Networks.

ACM Trans. Archit. Code Optim., December, 2024

CoNST: Code Generator for Sparse Tensor Networks.

CoRR, 2024

Accelerated Auto-Tuning of GPU Kernels for Tensor Computations.

Chendi Li

Yufan Xu

Sina Mahdipour Saravani

Ponnuswamy Sadayappan

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023

PEAK: Generating High-Performance Schedules in MLIR.

Amir Mohammad Tavakkoli

Proceedings of the Languages and Compilers for Parallel Computing, 2023

2022

Training of deep learning pipelines on memory-constrained GPUs via segmented fused-tiled execution.

Aravind Sukumaran-Rajam

P. Sadayappan

Proceedings of the CC '22: 31st ACM SIGPLAN International Conference on Compiler Construction, Seoul, South Korea, April 2, 2022

Effective Performance Modeling and Domain-Specific Compiler Optimization of CNNs for GPUs.

Aravind Sukumaran-Rajam

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021

Efficient Distributed Algorithms for Convolutional Neural Networks.

Rui Li

Yufan Xu

Aravind Sukumaran-Rajam

Atanas Rountev

P. Sadayappan

Proceedings of the SPAA '21: 33rd ACM Symposium on Parallelism in Algorithms and Architectures, 2021

Analytical characterization and design space exploration for optimization of CNNs.

Rui Li

Yufan Xu

Aravind Sukumaran-Rajam

Atanas Rountev

P. Sadayappan

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2019

Dependence-aware, unbounded sound predictive race detection.

Proc. ACM Program. Lang., 2019
