Coleman Hooper

Orcid: 0000-0002-5890-610X

According to our database1, Coleman Hooper authored at least 15 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2024
AI and Memory Wall.
CoRR, 2024

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization.
CoRR, 2024

Learned Best-Effort LLM Serving.
CoRR, 2024

2023
A 16-nm SoC for Noise-Robust Speech and NLP Edge AI Inference With Bayesian Sound Source Separation and Attention-Based DNNs.
IEEE J. Solid State Circuits, February, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.
CoRR, 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023

SPEED: Speculative Pipelined Execution for Efficient Decoding.
CoRR, 2023

SqueezeLLM: Dense-and-Sparse Quantization.
CoRR, 2023

Full Stack Optimization of Transformer Inference: a Survey.
CoRR, 2023

A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management.
Proceedings of the IEEE International Solid- State Circuits Conference, 2023

2021
Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models.
CoRR, 2021

EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware Multi-Task NLP Inference.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

9.8 A 25mm<sup>2</sup> SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications : The 33<sup>rd</sup> Hot Chips Symposium - August 22-24, 2021.
Proceedings of the IEEE Hot Chips 33 Symposium, 2021

2020
EdgeBERT: Optimizing On-Chip Inference for Multi-Task NLP.
CoRR, 2020


  Loading...