Cody Hao Yu

Orcid: 0000-0002-9298-6254

Affiliations:

University of California, Los Angeles, USA (PhD 2019)

According to our database, Cody Hao Yu authored at least 38 papers between 2014 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Tilus: A Tile-Level GPGPU Programming Language for Low-Precision Computation.

Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2024

SGLang: Efficient Execution of Structured Language Model Programs.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Automated Deep Learning Optimization via DSL-Based Source Code Transformation.

Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2024

Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

Efficiently Programming Large Language Models using SGLang.

CoRR, 2023

RAF: Holistic Compilation for Deep Learning Model Training.

CoRR, 2023

Decoupled Model Schedule for Deep Learning Training.

CoRR, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention.

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

Grape: Practical and Efficient Graphed Execution for Dynamic Deep Neural Networks on GPUs.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

TensorIR: An Abstraction for Automatic Tensorized Program Optimization.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

AutoDSE: Enabling Software Programmers to Design Efficient FPGA Accelerators.

ACM Trans. Design Autom. Electr. Syst., 2022

Tensor Program Optimization with Probabilistic Programs.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DietCode: Automatic Optimization for Dynamic Tensor Programs.

Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

2021

Bring Your Own Codegen to Deep Learning Compiler.

CoRR, 2021

AutoDSE: Enabling Software Programmers Design Efficient FPGA Accelerators.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

MOCHA: Multinode Cost Optimization in Heterogeneous Clouds with Accelerators.

Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

Lorien: Efficient Deep Learning Workloads Delivery.

Proceedings of the SoCC '21: ACM Symposium on Cloud Computing, 2021

2020

Ansor: Generating High-Performance Tensor Programs for Deep Learning.

Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Analysis and Optimization of the Implicit Broadcasts in FPGA HLS to Improve Maximum Frequency.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

2019

Raising an Abstraction Level of Compilation and Optimization for Customized Computing.

Hao Yu

PhD thesis, 2019

Customizable Computing - From Single Chip to Datacenters.

Proc. IEEE, 2019

Overcoming Data Transfer Bottlenecks in DNN Accelerators via Layer-Conscious Memory Management.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

2018

AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture.

CoRR, 2018

Best-Effort FPGA Programming: A Few Steps Can Go a Long Way.

CoRR, 2018

TGPA: tile-grained pipeline architecture for low latency CNN inference.

Proceedings of the International Conference on Computer-Aided Design, 2018

From JVM to FPGA: Bridging Abstraction Hierarchy via Optimized Deep Pipelining.

Jason Cong

Peng Wei

Cody Hao Yu

Proceedings of the 10th USENIX Workshop on Hot Topics in Cloud Computing, 2018

Latte: Locality Aware Transformation for High-Level Synthesis.

Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

S2FA: an accelerator automation framework for heterogeneous computing in datacenters.

Proceedings of the 55th Annual Design Automation Conference, 2018

Automated accelerator generation and optimization with composable, parallel and pipeline architecture.

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs.

Proceedings of the 54th Annual Design Automation Conference, 2017

Bandwidth Optimization Through On-Chip Memory Restructuring for HLS.

Proceedings of the 54th Annual Design Automation Conference, 2017

2016

The SMEM Seeding Acceleration for DNA Sequence Alignment.

Mau-Chung Frank Chang

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Invited - Heterogeneous datacenters: options and opportunities.

Proceedings of the 53rd Annual Design Automation Conference, 2016

Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.

Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016

2015

Impact of Loop Transformations on Software Reliability.

Jason Cong

Cody Hao Yu

Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

2014

Thermal-Aware On-Line Scheduler for 3-D Many-Core Processor Throughput Optimization.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014
