Daniel Sánchez

Computer, January, 2025

Quartz: A Reconfigurable, Distributed-Memory Accelerator for Sparse Applications.

Proceedings of the 58th IEEE/ACM International Symposium on Microarchitecture, 2025

Hopps: Leveraging Sparsity to Accelerate Automata Processing.

Xingran Du

Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024

A Tensor Compiler with Automatic Data Packing for Simple and Efficient Fully Homomorphic Encryption.

Proc. ACM Program. Lang., 2024

Accelerating Zero-Knowledge Proofs Through Hardware-Algorithm Co-Design.

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Terminus: A Programmable Accelerator for Read and Update Operations on Sparse Data Structures.

Hyun Ryong Lee

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Azul: An Accelerator for Sparse Iterative Solvers Leveraging Distributed On-Chip Memory.

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Trapezoid: A Versatile Accelerator for Dense and Sparse Matrix Multiplications.

Yifan Yang

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

BitPacker: Enabling High Arithmetic Efficiency in Fully Homomorphic Encryption Accelerators.

Nikola Samardzic

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

Spatula: A Hardware Accelerator for Sparse Matrix Factorization.

Axel Feldmann

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Accelerating RTL Simulation with Hardware-Software Co-Design.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

ISOSceles: Accelerating Sparse CNNs through Inter-Layer Pipelining.

Yifan Yang

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Phloem: Automatic Acceleration of Irregular Applications with Fine-Grain Pipeline Parallelism.

Quan M. Nguyen

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

2022

An Architecture to Accelerate Computation on Encrypted Data.

IEEE Micro, 2022

Datamime: Generating Representative Benchmarks by Automatically Synthesizing Datasets.

Hyun Ryong Lee

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

CraterLake: a hardware accelerator for efficient unbounded computation on encrypted data.

Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Designing Hardware for Cryptography and Cryptography for Hardware.

Srinivas Devadas

Simon Langowski

Nikola Samardzic

Sacha Servan-Schreiber

Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 2022

2021

Leaking Secrets Through Compressed Caches.

Andrés Sánchez

Christopher W. Fletcher

IEEE Micro, 2021

F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption (Extended Version).

CoRR, 2021

F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Fifer: Practical Acceleration of Irregular Applications on Reconfigurable Architectures.

Quan M. Nguyen

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

SpZip: Architectural Support for Effective Data Compression In Irregular Applications.

Yifan Yang

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Taming the Zoo: The Unified GraphIt Compiler Framework for Novel Architectures.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Gamma: leveraging Gustavson's algorithm to accelerate sparse matrix multiplication.

Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020

Pipette: Improving Core Utilization on Irregular Applications through Intra-Core Pipeline Parallelism.

Quan M. Nguyen

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware.

Victor A. Ying

Mark C. Jeffrey

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Safecracker: Leaking Secrets through Compressed Caches.

Andrés Sánchez

Christopher W. Fletcher

Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

Livia: Data-Centric Computing Throughout the Memory Hierarchy.

Elliot Lockerman

Axel Feldmann

Mohammad Bakhshalipour

Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

Chronos: Efficient Speculative Parallelism for Accelerators.

Maleen Abeydeera

Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019

Leveraging Caches to Accelerate Hash Tables and Memoization.

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

PHI: Architectural Support for Synchronization- and Bandwidth-Efficient Commutative Scatter Updates.

Anurag Mukkara

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Compress Objects, Not Cache Lines: An Object-Based Compressed Memory Hierarchy.

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018

Benzene: An Energy-Efficient Distributed Hybrid Cache Architecture for Manycore Systems.

ACM Trans. Archit. Code Optim., 2018

Sundial: Harmonizing Concurrency Control and Caching in a Distributed OLTP Database Management System.

Proc. VLDB Endow., 2018

Leveraging Hardware Caches for Memoization.

IEEE Comput. Archit. Lett., 2018

Rethinking the Memory Hierarchy for Modern Languages.

Yee Ling Gan

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Adaptive Scheduling for Systems with Asymmetric Memory Hierarchies.

Changping Chen

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Exploiting Locality in Graph Analytics through Hardware-Accelerated Traversal Scheduling.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Harmonizing Speculative and Non-Speculative Execution in Architectures for Ordered Parallelism.

Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

KPart: A Hybrid Cache Partitioning-Sharing Technique for Commodity Multicores.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017

Cache Calculus: Modeling Caches through Differential Equations.

IEEE Comput. Archit. Lett., 2017

Understanding object-level memory access patterns across the spectrum.

Sudharshan S. Vazhkudai

Wei Xue

Proceedings of the International Conference for High Performance Computing, 2017

Jenga: Software-Defined Cache Hierarchies.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Fractal: An Execution Model for Fine-Grain Nested Speculative Parallelism.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Using Application-Level Thread Progress Information to Manage Power and Performance.

Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

Maximizing Cache Performance Under Uncertainty.

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Nexus: A New Approach to Replication in Distributed Shared Caches.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

POSTER: Improving Datacenter Efficiency Through Partitioning-Aware Scheduling.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

SAM: Optimizing Multithreaded Cores for Speculative Parallelism.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

Unlocking Ordered Parallelism with the Swarm Architecture.

IEEE Micro, 2016

Validating Simplified Processor Models in Architectural Studies.

CoRR, 2016

TicToc: Time Traveling Optimistic Concurrency Control.

Proceedings of the 2016 International Conference on Management of Data, 2016

Exploiting semantic commutativity in hardware speculation.

Virginia Chiu

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Data-centric execution of speculative parallel programs.

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Tailbench: a benchmark suite and evaluation methodology for latency-critical applications.

Harshad Kasture

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Modeling cache performance beyond LRU.

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Whirlpool: Improving Dynamic Cache Management with Static Data Classification.

Anurag Mukkara

Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015

A scalable architecture for ordered irregular parallelism.

Proceedings of the 5th Workshop on Irregular Applications - Architectures and Algorithms, 2015

Exploiting commutativity to reduce the cost of updates to shared data in cache-coherent systems.

Webb Horn

Proceedings of the 48th International Symposium on Microarchitecture, 2015

Rubik: fast analytical power management for latency-critical systems.

Proceedings of the 48th International Symposium on Microarchitecture, 2015

A scalable architecture for ordered parallelism.

Proceedings of the 48th International Symposium on Microarchitecture, 2015

Scaling distributed cache hierarchies through computation and data co-scheduling.

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Talus: A simple way to remove cliffs in cache performance.

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Tarcil: reconciling scheduling speed and quality in large shared clusters.

Christina Delimitrou

Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

2014

Ubik: efficient cache sharing with strict QoS for latency-critical workloads.

Harshad Kasture

Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems, 2014

2013

ZSim: fast and accurate microarchitectural simulation of thousand-core systems.

Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Jigsaw: Scalable software-defined caches.

Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012

Scalable and Efficient Fine-Grained Cache Partitioning with Vantage.

IEEE Micro, 2012

SCD: A scalable coherence directory with flexible sharer set encoding.

Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011

Vantage: scalable and efficient fine-grain cache partitioning.

Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

A few ways can take you a long way: Efficient and highly associative caches with scalable partitioning for many-core CMPs.

Proceedings of the 2011 IEEE Hot Chips 23 Symposium (HCS), 2011

Dynamic Fine-Grain Scheduling of Pipeline Parallelism.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

An analysis of on-chip interconnection networks for large-scale chip multiprocessors.

George Michelogiannakis

ACM Trans. Archit. Code Optim., 2010

Evaluating Bufferless Flow Control for On-chip Networks.

George Michelogiannakis

William J. Dally

Proceedings of the NOCS 2010, 2010

The ZCache: Decoupling Ways and Associativity.

Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Flexible architectural support for fine-grain scheduling.

Richard M. Yoo

Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010

2007

Implementing Signatures for Transactional Memory.
