Baolin Li

ORCID: 0000-0001-9778-1023

Affiliations:
  • Northeastern University, Boston, MA, USA


According to our database, Baolin Li authored at least 28 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference.
CoRR, 2024

EcoLife: Carbon-Aware Serverless Function Scheduling for Sustainable Computing.
Proceedings of the International Conference for High Performance Computing, 2024

Interpretable Analysis of Production GPU Clusters Monitoring Data via Association Rule Mining.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

LLM Inference Serving: Survey of Recent Advances and Opportunities.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2024

Sprout: Green Generative AI with Carbon-Efficient LLM Inference.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Sustainable HPC: Modeling, Characterization, and Implications of Carbon Footprint in Modern HPC Systems.
CoRR, 2023

Green Carbon Footprint for Model Inference Serving via Exploiting Mixed-Quality Models and GPU Partitioning.
CoRR, 2023

Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service.
Proceedings of the International Conference for High Performance Computing, 2023

Toward Sustainable HPC: Carbon Footprint Estimation and Environmental Implications of HPC Systems.
Proceedings of the International Conference for High Performance Computing, 2023

From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

Kairos: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023

Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale.
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023

2022
Building Heterogeneous Cloud System for Machine Learning Inference.
CoRR, 2022

Using Multi-Instance GPU for Efficient Operation of Multi-Tenant GPU Clusters.
CoRR, 2022

The MIT Supercloud Workload Classification Challenge.
CoRR, 2022

Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Characterizing Multi-Instance GPU for Machine Learning Workloads.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

DASH: Scheduling Deep Learning Workloads on Multi-Generational GPU-Accelerated Clusters.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

Benchmarking Resource Usage for Efficient Distributed Deep Learning.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

Do Temperature and Humidity Exposures Hurt or Benefit Your SSDs?
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters.
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022

2021
RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference Using a Diverse Pool of Cloud Computing Instances.
Proceedings of the International Conference for High Performance Computing, 2021

Serving Machine Learning Inference Using Heterogeneous Hardware.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

2020
UREQA: Leveraging Operation-Aware Error Rates for Effective Quantum Circuit Mapping on NISQ-Era Quantum Computers.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Experimental evaluation of NISQ quantum computers: error measurement, characterization, and implications.
Proceedings of the International Conference for High Performance Computing, 2020
