Luis Ceze

Orcid: 0000-0002-1377-6217

Affiliations:
  • University of Washington, Seattle, WA, USA


According to our database, Luis Ceze authored at least 155 papers between 2002 and 2023.

Bibliography

2023
Virtualizing Existing Fluidic Programs.
ACM J. Emerg. Technol. Comput. Syst., July, 2023

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving.
CoRR, 2023

Punica: Multi-Tenant LoRA Serving.
CoRR, 2023

Fridge Compiler: Optimal Circuits from Molecular Inventories.
Proceedings of the Computational Methods in Systems Biology, 2023

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
DietCode: Automatic Optimization for Dynamic Tensor Programs.
Proceedings of Machine Learning and Systems 2022, 2022

SRIFTY: Swift and Thrifty Distributed Neural Network Training on the Cloud.
Proceedings of Machine Learning and Systems 2022, 2022

2021
DNA Sequencing Flow Cells and the Security of the Molecular-Digital Interface.
Proc. Priv. Enhancing Technol., 2021

Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering.
CoRR, 2021

Accelerating SpMM Kernel with Cache-First Edge Sampling for Graph Neural Networks.
CoRR, 2021

VSS: A Storage System for Video Analytics [Technical Report].
CoRR, 2021

Automated Backend-Aware Post-Training Quantization.
CoRR, 2021

VSS: A Storage System for Video Analytics.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Reticle: a virtual machine for programming modern FPGAs.
Proceedings of the PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021

Pure tensor program rewriting via access patterns (representation pearl).
Proceedings of the MAPS@PLDI 2021: Proceedings of the 5th ACM SIGPLAN International Symposium on Machine Programming, 2021

Characterizing and Taming Resolution in Convolutional Neural Networks.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021

Robust Digital Molecular Design of Binarized Neural Networks.
Proceedings of the 27th International Conference on DNA Computing and Molecular Programming, 2021

2020
LastLayer: Toward Hardware and Software Continuous Integration.
IEEE Micro, 2020

PurpleDrop: A Digital Microfluidics-Based Platform for Hybrid Molecular-Electronics Applications.
IEEE Micro, 2020

ASPLOS Report.
IEEE Des. Test, 2020

Srift: Swift and Thrift Cloud-Based Distributed Training.
CoRR, 2020

Enumerating Hardware-Software Splits with Program Rewriting.
CoRR, 2020

Genotype Extraction and False Relative Attacks: Security Risks to Third-Party Genetic Genealogy Services Beyond Identity Inference.
Proceedings of the 27th Annual Network and Distributed System Security Symposium, 2020

PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the public Cloud.
Proceedings of Machine Learning and Systems 2020, 2020

Riptide: Fast End-to-End Binarized Neural Networks.
Proceedings of Machine Learning and Systems 2020, 2020

VisualWorldDB: A DBMS for the Visual World.
Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

Automatic generation of high-performance quantized machine learning kernels.
Proceedings of the CGO '20: 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020

2019
Iterative Search for Reconfigurable Accelerator Blocks With a Compiler in the Loop.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

DNA Data Storage and Hybrid Molecular-Electronic Computing.
Proc. IEEE, 2019

A Hardware-Software Blueprint for Flexible Deep Learning Specialization.
IEEE Micro, 2019

Synthesizing Number Generators for Stochastic Computing using Mixed Integer Programming.
CoRR, 2019

Vignette: Perceptual Compression for Video Storage and Processing Systems.
CoRR, 2019

Visual Road: A Video Data Management Benchmark.
Proceedings of the 2019 International Conference on Management of Data, 2019

Scaling Microfluidics to Complex, Dynamic Protocols: Invited Paper.
Proceedings of the International Conference on Computer-Aided Design, 2019

Perceptual Compression for Video Storage and Processing Systems.
Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, 2019

DNA Data Storage and Near-Molecule Processing for the Yottabyte Era.
Proceedings of the 9th Biennial Conference on Innovative Data Systems Research, 2019

Puddle: A Dynamic, Error-Correcting, Full-Stack Microfluidics Platform.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
Energy-Efficient Neural Network Acceleration in the Presence of Bit-Level Memory Errors.
IEEE Trans. Circuits Syst. I Regul. Pap., 2018

Architecture Considerations for Stochastic Computing Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

LightDB: A DBMS for Virtual Reality Video.
Proc. VLDB Endow., 2018

A Taxonomy of General Purpose Approximate Computing Techniques.
IEEE Embed. Syst. Lett., 2018

Automating Generation of Low Precision Deep Learning Operators.
CoRR, 2018

Stochastic Synthesis for Stochastic Computing.
CoRR, 2018

Computer Security Risks of Distant Relative Matching in Consumer Genetic Databases.
CoRR, 2018

VTA: An Open Hardware-Software Stack for Deep Learning.
CoRR, 2018

TVM: End-to-End Optimization Stack for Deep Learning.
CoRR, 2018

Parameter Hub: High Performance Parameter Servers for Efficient Distributed Deep Neural Network Training.
CoRR, 2018

Troubleshooting Transiently-Recurring Errors in Production Systems with Blame-Proportional Logging.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning.
Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Learning to Optimize Tensor Programs.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Application Codesign of Near-Data Processing for Similarity Search.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

A Content-Addressable DNA Database with Learned Sequence Encodings.
Proceedings of the DNA Computing and Molecular Programming - 24th International Conference, 2018

Correlation manipulating circuits for stochastic computing.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

MATIC: Learning around errors for efficient low-voltage neural network accelerators.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training.
Proceedings of the ACM Symposium on Cloud Computing, 2018

Leveraging the VTA-TVM Hardware-Software Stack for FPGA Acceleration of 8-bit ResNet-18 Inference.
Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018

2017
Toward a DNA-Based Archival Storage System.
IEEE Micro, 2017

Making data center computations fast, but not so furious.
CoRR, 2017

Democratizing Design for Future Computing Platforms.
CoRR, 2017

Computer Security, Privacy, and DNA Sequencing: Compromising Computers with Synthesized DNA, Privacy Leaks, and More.
Proceedings of the 26th USENIX Security Symposium, 2017

VisualCloud Demonstration: A DBMS for Virtual Reality.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Clustering Billions of Reads for DNA Data Storage.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Similarity Search on Automata Processors.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Exploring computation-communication tradeoffs in camera systems.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Customizing Progressive JPEG for Efficient Image Storage.
Proceedings of the 9th USENIX Workshop on Hot Topics in Storage and File Systems, 2017

A hardware-friendly bilateral solver for real-time virtual reality video.
Proceedings of High Performance Graphics, 2017

Energy-efficient hybrid stochastic-binary neural networks for near-sensor computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Exploiting quality-energy tradeoffs with arbitrary quantization: special session paper.
Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion, 2017

A Visual Cloud for Virtual Reality Applications.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

Augmenting Interpersonal Communication through Connected Lighting.
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017

IncBricks: Toward In-Network Computation with an In-Network Cache.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Approximate Storage of Compressed and Encrypted Videos.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

POSTER: Application-Driven Near-Data Processing for Similarity Search.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Approximate Computing: Unlocking Efficiency with Hardware-Software Co-Design.
GetMobile Mob. Comput. Commun., 2016

NCAM: Near-Data Processing for Nearest Neighbor Search.
CoRR, 2016

Near Memory Similarity Search on Automata Processors.
CoRR, 2016

21st Century Computer Architecture.
CoRR, 2016

Arch2030: A Vision of Computer Architecture Research over the Next 15 Years.
CoRR, 2016

Optimizing synthesis with metasketches.
Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2016

Disciplined Inconsistency with Consistency Types.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016

High-Density Image Storage Using Approximate Memory Cells.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

A DNA-Based Archival Storage System.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015
Trading Latency for Performance in Data-Intensive Applications.
login Usenix Mag., 2015

Approximate Computing: Making Mobile Systems More Efficient.
IEEE Pervasive Comput., 2015

Alternative Computing Designs and Technologies.
IEEE Micro, 2015

The 2014 Top Picks in Computer Architecture.
IEEE Micro, 2015

SAP: an Architecture for Selectively Approximate Wireless Communication.
CoRR, 2015

Latency-Tolerant Software Distributed Shared Memory.
Proceedings of the 2015 USENIX Annual Technical Conference, 2015

Hardware-Software Co-Design: Not Just a Cliché.
Proceedings of the 1st Summit on Advances in Programming Languages, 2015

Probability type inference for flexible approximate programming.
Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, 2015

NCAM: Near-Data Processing for Nearest Neighbor Search.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

SNNAP: Approximate computing on programmable SoCs via neural acceleration.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Claret: using data types for highly concurrent distributed transactions.
Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data, 2015

Data provenance tracking for concurrent programs.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

Monitoring and Debugging the Quality of Results in Approximate Programs.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
Approximate Storage in Solid-State Memories.
ACM Trans. Comput. Syst., 2014

Data Race Detection with Minimal Hardware Support.
Comput. J., 2014

Expressing and verifying probabilistic assertions.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014

Alembic: automatic locality extraction via migration.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

Symbolic execution of multithreaded programs from arbitrary program contexts.
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2014

General-purpose code acceleration with limited-precision analog computation.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

Low-level detection of language-level data races with LARD.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

Integrated 3D-stacked server designs for increasing physical density of key-value stores.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2014

2013
Neural Acceleration for General-Purpose Approximate Programs.
IEEE Micro, 2013

Exploring storage class memory with key value stores.
Proceedings of the 1st Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads, 2013

Input-covering schedules for multithreaded programs.
Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications, 2013

DNA-based molecular architecture with spatially localized components.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Cooperative empirical failure avoidance for multithreaded programs.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

DDOS: taming nondeterminism in distributed systems.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012
IFRit: interference-free regions for dynamic data-race detection.
Proceedings of the 27th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2012

RADISH: Always-on sound and complete race detection in software and hardware.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

Automatic discovery of performance and energy pitfalls in HTML and CSS.
Proceedings of the 2012 IEEE International Symposium on Workload Characterization, 2012

Architecture support for disciplined approximate programming.
Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012

2011
Shared-Memory Multiprocessors.
Proceedings of the Encyclopedia of Parallel Computing, 2011

The impact of memory models on software reliability in multiprocessors.
Proceedings of the 30th Annual ACM Symposium on Principles of Distributed Computing, 2011

Data-race exceptions have benefits beyond the memory model.
Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, 2011

EnerJ: approximate data types for safe and general low-power computation.
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011

Isolating and understanding concurrency errors using reconstructed execution fragments.
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011

Checked Load: Architectural support for JavaScript type-checking on mobile processors.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Crunching Large Graphs with Commodity Processors.
Proceedings of the 3rd USENIX Workshop on Hot Topics in Parallelism, 2011

Operating System Implications of Fast, Cheap, Non-Volatile Memory.
Proceedings of the 13th Workshop on Hot Topics in Operating Systems, 2011

Accelerating Data Race Detection with Minimal Hardware Support.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

RCDC: a relaxed consistency deterministic computer.
Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, 2011

Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures.
Proceedings of the 15th Workshop on Interaction between Compilers and Computer Architectures, 2011

2010
DMP: Deterministic Shared-Memory Multiprocessing.
IEEE Micro, 2010

Deterministic Process Groups in dOS.
Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation, 2010

Composable specifications for structured shared-memory communication.
Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010

Conflict exceptions: simplifying concurrent language semantics with precise hardware exceptions for data-races.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

ColorSafe: architectural support for debugging and dynamically avoiding multi-variable atomicity violations.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

A limit study of JavaScript parallelism.
Proceedings of the 2010 IEEE International Symposium on Workload Characterization, 2010

CoreDet: a compiler and runtime system for deterministic multithreaded execution.
Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010

2009
SoftSig: Software-Exposed Hardware Signatures for Code Analysis and Optimization.
IEEE Micro, 2009

Atom-Aid: Detecting and Surviving Atomicity Violations.
IEEE Micro, 2009

The Bulk Multicore architecture for improved programmability.
Commun. ACM, 2009

Two hardware-based approaches for deterministic multiprocessor replay.
Commun. ACM, 2009

Finding concurrency bugs with context-aware communication graphs.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

2008
DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Concurrency control with data coloring.
Proceedings of the 2008 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08), 2008

2007
Bulk Operation and Data Coloring for Multiprocessor Programmability
PhD thesis, 2007

Implicit parallelism with ordered transactions.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

BulkSC: bulk enforcement of sequential consistency.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

Colorama: Architectural Support for Data-Centric Synchronization.
Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007

2006
CAVA: Using checkpoint-assisted value prediction to hide L2 misses.
ACM Trans. Archit. Code Optim., 2006

Energy-Efficient Thread-Level Speculation.
IEEE Micro, 2006

POSH: a TLS compiler that exploits program structure.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

Scalable Cache Miss Handling for High Memory-Level Parallelism.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

Bulk Disambiguation of Speculative Threads in Multiprocessors.
Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006

2005
Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Thread-Level Speculation on a CMP can be energy efficient.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

2004
CAVA: Hiding L2 Misses with Checkpoint-Assisted Value Prediction.
IEEE Comput. Archit. Lett., 2004

2003
An Overview Of The Bluegene/L System Software Organization.
Parallel Process. Lett., 2003

An Overview of the Blue Gene/L System Software Organization.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003

2002
An overview of the BlueGene/L Supercomputer.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

Evaluation of a Multithreaded Architecture for Cellular Computing.
Proceedings of the Eighth International Symposium on High-Performance Computer Architecture (HPCA'02), 2002


