Liu Liu

Orcid: 0000-0003-0792-8146

Affiliations:
  • University of California, Santa Barbara, CA, USA


According to our database, Liu Liu authored at least 46 papers between 2016 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
TRACE: Unlocking Effective CXL Bandwidth via Lossless Compression and Precision Scaling.
IEEE Trans. Computers, April, 2026

Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees.
CoRR, April, 2026

Robust Heterogeneous Analog-Digital Computing for Mixture-of-Experts Models with Theoretical Generalization Guarantees.
CoRR, March, 2026

STARC: Selective Token Access with Remapping and Clustering for Efficient LLM Decoding on PIM Systems.
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026

2025
Making Strong Error-Correcting Codes Work Effectively for HBM in AI Inference.
CoRR, December, 2025

Bandwidth-Efficient Adaptive Mixture-of-Experts via Low-Rank Compensation.
CoRR, December, 2025

Context-Aware Mixture-of-Experts Inference on CXL-Enabled GPU-NDP Systems.
CoRR, December, 2025

SparseST: Exploiting Data Sparsity in Spatiotemporal Modeling and Prediction.
CoRR, November, 2025

Amplifying Effective CXL Memory Bandwidth for LLM Inference via Transparent Near-Data Processing.
CoRR, September, 2025

Leveraging 3D Technologies for Hardware Security: Opportunities and Challenges.
CoRR, August, 2025

Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design.
CoRR, March, 2025

Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure.
IEEE Comput. Archit. Lett., 2025

Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System.
IEEE Comput. Archit. Lett., 2025

ZipCXL: CXL-based Main Memory Compression at Low Performance Penalty.
Proceedings of the International Symposium on Memory Systems, 2025

BitWeaver: Read-Time Truncation in Memory.
Proceedings of the 39th ACM International Conference on Supercomputing, 2025

SAGE: Saliency-Aware Grouping for Efficient Mapping of LLMs on Analog Compute-in-Memory.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2025

NORA: Noise-Optimized Rescaling of LLMs on Analog Compute-in-Memory Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

2024
SmartQuant: CXL-Based AI Model Store in Support of Runtime Configurable Weight Quantization.
IEEE Comput. Archit. Lett., 2024

2023
TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator With Pipeline/Parallel Reconfigurable Modes.
IEEE J. Solid State Circuits, 2023

Dynamic N:M Fine-Grained Structured Sparse Attention Mechanism.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
Elastic Processing and Hardware Architectures for Machine Learning.
PhD thesis, 2022

Dynamic Sparse Attention for Scalable Transformer Acceleration.
IEEE Trans. Computers, 2022

Enabling Data Movement and Computation Pipelining in Deep Learning Compiler.
CoRR, 2022

A 28nm 15.59µJ/Token Full-Digital Bitline-Transpose CIM-Based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

INSPIRE: in-storage private information retrieval via protocol and architecture co-design.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

A one-for-all and o(v log(v))-cost solution for parallel merge style operations on sorted key-value arrays.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

DOTA: detect and omit weak attentions for scalable transformer acceleration.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

2021
Transformer Acceleration with Dynamic Sparse Attention.
CoRR, 2021

Π-RT: A Runtime Framework to Enable Energy-Efficient Real-Time Robotic Vision Applications on Heterogeneous Architectures.
Computer, 2021

Efficient tensor core-based GPU kernels for structured sparsity under reduced precision.
Proceedings of the International Conference for High Performance Computing, 2021

ENMC: Extreme Near-Memory Classification via Approximate Screening.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

2020
SemiMap: A Semi-Folded Convolution Mapping for Speed-Overhead Balance on Crossbars.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Computation on Sparse Neural Networks: an Inspiration for Future Hardware.
CoRR, 2020

DUET: Boosting Deep Neural Network Efficiency on Dual-Module Architecture.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Boosting Deep Neural Network Efficiency with Dual-Module Inference.
Proceedings of the 37th International Conference on Machine Learning, 2020

INVITED: Computation on Sparse Neural Networks and its Implications for Future Hardware.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019
L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., 2019

Dynamic Sparse Graph for Efficient Deep Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks.
CoRR, 2018

PIRT: A Runtime Framework to Enable Energy-Efficient Real-Time Robotic Applications on Heterogeneous Architectures.
CoRR, 2018

2017
Building energy-efficient multi-level cell STT-RAM caches with data compression.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
CNNLab: a Novel Parallel Framework for Neural Networks using GPU and FPGA-a Practical Study with Trade-off Analysis.
CoRR, 2016

NVSim-CAM: a circuit-level simulator for emerging nonvolatile memory based content-addressable memory.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

Leveraging 3D Technologies for Hardware Security: Opportunities and Challenges.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

