Tianqi Chen

Affiliations:
  • Carnegie Mellon University, USA
  • University of Washington, USA (PhD 2019)


According to our database, Tianqi Chen authored at least 39 papers between 2014 and 2024.

Bibliography

2024
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism.
CoRR, 2024

Emerging Platforms Meet Emerging LLMs: A Year-Long Journey of Top-Down Development.
CoRR, 2024

Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

ACROBAT: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning.
CoRR, 2023

ED-Batch: Efficient Automatic Batching of Dynamic Neural Networks via Learned Finite State Machines.
Proceedings of the International Conference on Machine Learning, 2023

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

TensorIR: An Abstraction for Automatic Tensorized Program Optimization.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Tensor Program Optimization with Probabilistic Programs.
Advances in Neural Information Processing Systems 35 (NeurIPS), 2022

DietCode: Automatic Optimization for Dynamic Tensor Programs.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Collage: Seamless Integration of Deep Learning Backends with Automatic Placement.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
Collage: Automated Integration of Deep Learning Backends.
CoRR, 2021

Automated Backend-Aware Post-Training Quantization.
CoRR, 2021

TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Cortex: A Compiler for Recursive Deep Learning Models.
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

Dynamic Tensor Rematerialization.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Automatic generation of high-performance quantized machine learning kernels.
Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization (CGO), 2020

2019
Scalable and Intelligent Learning Systems.
PhD thesis, 2019

Irreversible samplers from jump and continuous Markov processes.
Stat. Comput., 2019

A Hardware-Software Blueprint for Flexible Deep Learning Specialization.
IEEE Micro, 2019

Relay: A High-Level IR for Deep Learning.
CoRR, 2019

2018
ADARES: Adaptive Resource Management for Virtual Machines.
CoRR, 2018

Automating Generation of Low Precision Deep Learning Operators.
CoRR, 2018

VTA: An Open Hardware-Software Stack for Deep Learning.
CoRR, 2018

TVM: End-to-End Optimization Stack for Deep Learning.
CoRR, 2018

Relay: a new IR for machine learning frameworks.
Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, 2018

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Learning to Optimize Tensor Programs.
Advances in Neural Information Processing Systems 31 (NeurIPS), 2018

Optimizing Deep Learning Workloads on ARM GPU with TVM.
Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018

Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference.
Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018

PANEL: Open panel and discussion on tackling complexity, reproducibility and tech transfer challenges in a rapidly evolving AI/ML/systems research.
Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018

2017
An end to end IR stack for deep learning systems.
Proceedings of the Workshop on Trends in Machine-Learning (and impact on computer architecture), 2017

2016
Training Deep Nets with Sublinear Memory Cost.
CoRR, 2016

XGBoost: A Scalable Tree Boosting System.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

2015
A Complete Recipe for Stochastic Gradient MCMC.
Advances in Neural Information Processing Systems 28 (NeurIPS), 2015

Efficient Second-Order Gradient Boosting for Conditional Random Fields.
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

2014
Stochastic Gradient Hamiltonian Monte Carlo.
Proceedings of the 31st International Conference on Machine Learning, 2014
