Kunle Olukotun

2021

IEEE Comput. Archit. Lett., 2021

Bayesian Optimization with a Prior for the Optimum.

Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Capstan: A Vector RDA for Sparsity.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Aurochs: An Architecture for Dataflow Threads.

Matthew Vilim

Alexander Rucker

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

SARA: Scaling a Reconfigurable Dataflow Accelerator.

Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

High performance lattice regression on FPGAs via a high level hardware description language.

Nathan Zhang

Matthew Feldman

Proceedings of the International Conference on Field-Programmable Technology, 2021

"Let the Data Flow!".

Proceedings of the 11th Conference on Innovative Data Systems Research, 2021

2020

Prior-guided Bayesian Optimization.

CoRR, 2020

Taurus: An Intelligent Data Plane.

CoRR, 2020

Gorgon: Accelerating Machine Learning from Relational Data.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark.

ACM SIGOPS Oper. Syst. Rev., 2019

DeepFreak: Learning Crystallography Diffraction Patterns with Automated Machine Learning.

CoRR, 2019

Efficient Multiway Hash Join on Reconfigurable Hardware.

Proceedings of the Performance Evaluation and Benchmarking for the Era of Cloud(s), 2019

Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator.

Tian Zhao

Yaqi Zhang

Proceedings of Machine Learning and Systems 2019, 2019

HyperMapper: a Practical Design Space Exploration Framework.

Proceedings of the 27th IEEE International Symposium on Modeling, 2019

Practical Design Space Exploration.

Luigi Nardi

David Koeplinger

Proceedings of the 27th IEEE International Symposium on Modeling, 2019

Scalable interconnects for reconfigurable spatial architectures.

Proceedings of the 46th International Symposium on Computer Architecture, 2019

Polystore++: Accelerated Polystore System for Heterogeneous Workloads.

Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

TensorFlow to Cloud FPGAs: Tradeoffs for Accelerating Deep Neural Networks.

Stefan Hadjis

Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Elastic RSS: Co-Scheduling Packets and Cores Using Programmable NICs.

Proceedings of the 3rd Asia-Pacific Workshop on Networking, 2019

2018

Plasticine: A Reconfigurable Accelerator for Parallel Patterns.

IEEE Micro, 2018

High-Accuracy Low-Precision Training.

CoRR, 2018

Exploring the Utility of Developer Exhaust.

Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

Spatial: a language and compiler for application accelerators.

Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2018

Flare: Optimizing Apache Spark with Native Compilation for Scale-Up Architectures and Medium-Size Data.

Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

LevelHeaded: A Unified Engine for Business Intelligence and Linear Algebra Querying.

Andrew Lamb

Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

2017

EmptyHeaded: A Relational Engine for Graph Processing.

ACM Trans. Database Syst., 2017

Mind the Gap: Bridging Multi-Domain Query Workloads with EmptyHeaded.

Andrew Lamb

Proc. VLDB Endow., 2017

LevelHeaded: Making Worst-Case Optimal Joins Work in the Common Case.

Andrew Lamb

CoRR, 2017

Flare: Native Compilation for Heterogeneous Workloads in Apache Spark.

CoRR, 2017

Infrastructure for Usable Machine Learning: The Stanford DAWN Project.

CoRR, 2017

Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Plasticine: A Reconfigurable Architecture For Parallel Patterns.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2016

EmptyHeaded: A Relational Engine for Graph Processing.

Susan Tu

Proceedings of the 2016 International Conference on Management of Data, 2016

Automatic Generation of Efficient Accelerators for Reconfigurable Hardware.

Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling.

Christopher De Sa

Proceedings of the 33rd International Conference on Machine Learning, 2016

Old techniques for new join algorithms: A case study in RDF processing.

Susan Tu

Proceedings of the 32nd IEEE International Conference on Data Engineering Workshops, 2016

GraphOps: A Dataflow Library for Graph Analytics Acceleration.

Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Have abstraction and eat performance, too: optimized heterogeneous computing with parallel patterns.

Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

Generating Configurable Hardware from Parallel Patterns.

Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

Scaling Data Analytics with Moore's Law.

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015

EmptyHeaded: Boolean Algebra Based Graph Processing.

Andres Nötzli

CoRR, 2015

Energy-Efficient Abundant-Data Computing: The N3XT 1,000x.

Computer, 2015

Go Meta! A Case for Generative Programming and DSLs in Performance Critical Systems.

Proceedings of the 1st Summit on Advances in Programming Languages, 2015

Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems.

Christopher De Sa

Proceedings of the 32nd International Conference on Machine Learning, 2015

Automatic support for multi-module parallelism from computational patterns.

Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

EMEURO: a framework for generating multi-purpose accelerators via deep learning.

Lawrence C. McAfee

Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014

Delite: A Compiler Architecture for Performance-Oriented Embedded Domain-Specific Languages.

ACM Trans. Embed. Comput. Syst., 2014

Guest Editorial.

Int. J. Parallel Program., 2014

Global Convergence of Stochastic Gradient Descent for Some Nonconvex Matrix Problems.

Christopher De Sa

CoRR, 2014

Beyond parallel programming with domain specific languages.

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Surgical precision JIT compilers.

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014

Locality-Aware Mapping of Nested Parallel Patterns on GPUs.

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Author's retrospective for: improving the performance of speculatively parallel applications on the Hydra CMP.

Mark Willey

Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

Hardware system synthesis from Domain-Specific Languages.

Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

Hardware acceleration of database operations.

Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Simplifying Scalable Graph Processing with a Domain-Specific Language.

Proceedings of the 12th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2014

2013

On fast parallel detection of strongly connected components (SCC) in small-world graphs.

Nicole C. Rodia

Proceedings of the International Conference for High Performance Computing, 2013

Optimizing data structures in high-level programs: new directions for extensible compilers based on staging.

Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2013

Forge: generating a high performance DSL implementation from a declarative specification.

Proceedings of the Generative Programming: Concepts and Experiences, 2013

Composition and Reuse with Compiled Domain-Specific Languages.

Proceedings of the ECOOP 2013 - Object-Oriented Programming, 2013

2012

Utilizing Static Analysis and Code Generation to Accelerate Neural Networks.

Lawrence C. McAfee

Proceedings of the 29th International Conference on Machine Learning, 2012

High performance embedded domain specific languages.

Proceedings of the ACM SIGPLAN International Conference on Functional Programming, 2012

A case of system-level hardware/software co-design and co-verification of a commodity multi-processor system with custom hardware.

Proceedings of the 10th International Conference on Hardware/Software Codesign and System Synthesis, 2012

Green-Marl: a DSL for easy and efficient graph analysis.

Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems, 2012

2011

Implementing Domain-Specific Languages for Heterogeneous Parallel Computing.

IEEE Micro, 2011

Building-Blocks for Performance Oriented DSLs.

Proceedings of the IFIP Working Conference on Domain-Specific Languages, 2011

Accelerating CUDA graph algorithms at maximum warp.

Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

A domain-specific approach to heterogeneous parallelism.

Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2011

Panel Statement.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning.

Proceedings of the 28th International Conference on Machine Learning, 2011

Runtime automatic speculative parallelization.

Ben Hertzberg

Proceedings of the CGO 2011, 2011

Hardware acceleration of transactional memory on commodity systems.

Proceedings of the 16th International Conference on Architectural Support for Programming Languages and Operating Systems, 2011

Efficient Parallel Graph Exploration on Multi-Core CPU and GPU.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

A Heterogeneous Parallel Framework for Domain-Specific Languages.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

Ubiquitous Parallel Computing from Berkeley, Illinois, and Stanford.

IEEE Micro, 2010

Implementing and evaluating nested parallel transactions in software transactional memory.

SPAA 2010: Proceedings of the 22nd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2010

A practical concurrent binary search tree.

Hassan Chafi

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Transactional predication: high-performance concurrent sets and maps for STM.

Hassan Chafi

Proceedings of the 29th Annual ACM Symposium on Principles of Distributed Computing, 2010

Language virtualization for heterogeneous parallel computing.

Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010

Chip multiprocessor architecture: A programmability-driven approach.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Eigenbench: A simple exploration tool for orthogonal TM characteristics.

Proceedings of the 2010 IEEE International Symposium on Workload Characterization, 2010

Making nested parallel transactions practical using lightweight hardware support.

Proceedings of the 24th International Conference on Supercomputing, 2010

Implementing and Evaluating a Model Checker for Transactional Memory Systems.

Proceedings of the 15th IEEE International Conference on Engineering of Complex Computer Systems, 2010

Extreme scale computing: Challenges and opportunities.

Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures.

Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

A Large-Scale Architecture for Restricted Boltzmann Machines.

Sang Kyun Kim

Peter Leonard McMahon

Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

Hardware/software co-design for high performance computing: challenges and opportunities.

Proceedings of the 8th International Conference on Hardware/Software Codesign and System Synthesis, 2010

2009

Feedback-directed barrier optimization in a strongly isolated STM.

Proceedings of the 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2009

The Stanford Pervasive Parallelism Lab.

Proceedings of the 2009 IEEE Hot Chips 21 Symposium (HCS), 2009

A highly scalable Restricted Boltzmann Machine FPGA implementation.

Sang Kyun Kim

Lawrence C. McAfee

Peter Leonard McMahon

Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

2008

Improving software concurrency with hardware-assisted memory snapshot.

SPAA 2008: Proceedings of the 20th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2008

Ased: availability, security, and debugging support using transactional memory.

JaeWoong Chung

Jiwon Seo

SPAA 2008: Proceedings of the 20th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2008

STAMP: Stanford Transactional Applications for Multi-Processing.

Proceedings of the 4th International Symposium on Workload Characterization (IISWC 2008), 2008

2007

Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency.

James Laudon

Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01720-9, 2007

Transactional Memory: The Hardware-Software Interface.

IEEE Micro, 2007

Towards soft optimization techniques for parallel cognitive applications.

SPAA 2007: Proceedings of the 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2007

Transactional collection classes.

Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

An effective hybrid transactional memory system with strong isolation guarantees.

Christoforos E. Kozyrakis

Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

A Scalable, Non-blocking Approach to Transactional Memory.

Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007

A practical FPGA-based framework for novel CMP research.

Proceedings of the ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, 2007

ATLAS: a chip-multiprocessor with transactional memory support.

Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

The OpenTM Transactional Application Programming Interface.

Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2006

Executing Java programs with transactional memory.

Christoforos E. Kozyrakis

Sci. Comput. Program., 2006

The Identity Management Kalman Filter (IMKF).

Proceedings of the Robotics: Science and Systems II, 2006

The Atomos transactional programming language.

Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, 2006

Map-Reduce for Machine Learning on Multicore.

Proceedings of the Advances in Neural Information Processing Systems 19, 2006

Architectural Semantics for Practical Transactional Memory.

Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006

The common case transactional behavior of multithreaded programs.

Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

Tradeoffs in transactional memory virtualization.

Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

Testing implementations of transactional memory.

Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT 2006), 2006

2005

A chip prototyping substrate: the flexible architecture for simulation and testing (FAST).

John D. Davis

Stephen E. Richardson

Charis Charitsis

SIGARCH Comput. Archit. News, 2005

The future of microprocessors.

ACM Queue, 2005

Niagara: A 32-Way Multithreaded Sparc Processor.

Poonacha Kongetira

Kathirgamar Aingaran

IEEE Micro, 2005

Exposing speculative thread parallelism in SPEC2000.

Manohar K. Prabhu

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2005

The Information-Form Data Association Filter.

Proceedings of the Advances in Neural Information Processing Systems 18, 2005

An Application Analysis Framework For Polymorphic Chip Multiprocessors.

Ayodele Thomas

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

TAPE: a transactional application profiling environment.

Proceedings of the 19th Annual International Conference on Supercomputing, 2005

A New Approach to Programming and Prototyping Parallel Systems.

Proceedings of the High Performance Computing, 2005

Characterization of TCC on Chip-Multiprocessors.

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

Maximizing CMP Throughput with Mediocre Cores.

John D. Davis

James Laudon

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

2004

Transactional Coherence and Consistency: Simplifying Parallel Hardware and Software.

IEEE Micro, 2004

Transactional Memory Coherence and Consistency.

Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004

Programming with transactional coherence and consistency (TCC).

Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003

The Jrpm System for Dynamically Parallelizing Sequential Java Programs.

IEEE Micro, 2003

Using thread-level speculation to simplify manual parallelization.

Manohar K. Prabhu

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

The Jrpm System for Dynamically Parallelizing Java Programs.

Proceedings of the 30th International Symposium on Computer Architecture (ISCA 2003), 2003

TEST: A Tracer for Extracting Speculative Threads.

Proceedings of the 1st IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2003), 2003

2002

Targeting Dynamic Compilation for Embedded Environments.

Proceedings of the 2nd Java Virtual Machine Research and Technology Symposium, 2002

Efficient state representation for symbolic simulation.

Valeria Bertacco

Proceedings of the 39th Design Automation Conference, 2002

2001

High Bandwidth On-Chip Cache Design.

Kenneth M. Wilson

IEEE Trans. Computers, 2001

2000

The Stanford Hydra CMP.

IEEE Micro, 2000

1999

Improving the performance of speculatively parallel applications on the Hydra CMP.

Mark Willey

Proceedings of the 13th International Conference on Supercomputing, 1999

JMTP: an architecture for exploiting concurrency in embedded Java applications with real-time considerations.

Rachid Helaihel

Proceedings of the 1999 IEEE/ACM International Conference on Computer-Aided Design, 1999

1998

DCP: an algorithm for datapath/control partitioning of synthesizable RTL models.

Victor J. Lam

Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

REMARC: Reconfigurable Multimedia Array Coprocessor (Abstract).

Takashi Miyamori

Proceedings of the 1998 ACM/SIGDA Sixth International Symposium on Field Programmable Gate Arrays, 1998

A Quantitative Analysis of Reconfigurable Coprocessors for Multimedia Applications.

Takashi Miyamori

Proceedings of the 6th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '98), 1998

Digital System Simulation: Methodologies and Examples.

Mark A. Heinrich

David Ofelt

Proceedings of the 35th Conference on Design Automation, 1998

Data Speculation Support for a Chip Multiprocessor.

Mark Willey

ASPLOS-VIII: Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems, 1998

Exploiting Method-Level Parallelism in Single-Threaded Java Programs.

Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

1997

Multilevel Optimization of Pipelined Caches.

Richard B. Brown

IEEE Trans. Computers, 1997

A Single-Chip Multiprocessor.

Computer, 1997

Designing High Bandwidth On-Chip Caches.

Kenneth M. Wilson

Proceedings of the 24th International Symposium on Computer Architecture, 1997

Verifying correct pipeline implementation for microprocessors.

Jeremy R. Levitt

Proceedings of the 1997 IEEE/ACM International Conference on Computer-Aided Design, 1997

Java as a specification language for hardware-software systems.

Rachid Helaihel

Proceedings of the 1997 IEEE/ACM International Conference on Computer-Aided Design, 1997

The Hierarchical Multi-Bank DRAM: A High-Performance Architecture for Memory Integrated with Processors.

Tadaaki Yamauchi

Proceedings of the 17th Conference on Advanced Research in VLSI (ARVLSI '97), 1997

1996

Increasing Cache Port Efficiency for Dynamic Superscalar Microprocessors.

Kenneth M. Wilson

Mendel Rosenblum

Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

Evaluation of Design Alternatives for a Multiprocessor Microprocessor.

Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

The Impact of Shared-Cache Clustering in Small-Scale Shared-Memory Multiprocessors.

Jaswinder Pal Singh

Proceedings of the Second International Symposium on High-Performance Computer Architecture, 1996

A Scalable Formal Verification Methodology for Pipelined Microprocessors.

Jeremy R. Levitt

Proceedings of the 33rd Conference on Design Automation, 1996

The Case for a Single-Chip Multiprocessor.

Proceedings of ASPLOS-VII, 1996

1995

The Benefits of Clustering in Shared Address Space Multiprocessors: An Applications-Driven Investigation.

Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995

A General Method for Compiling Event-Driven Simulations.

Proceedings of the 32nd Conference on Design Automation, 1995

1994

A software-hardware cosynthesis approach to digital system simulation.

IEEE Micro, 1994

Exploring the Design Space for a Shared-Cache Multiprocessor.

Proceedings of the 21st Annual International Symposium on Computer Architecture. Chicago, 1994

1992

Analysis and design of latch-controlled synchronous digital circuits.

Karem A. Sakallah

Oyekunle A. Olukotun

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1992

Performance Optimization of Pipelined Primary Caches.

Richard B. Brown

Proceedings of the 19th Annual International Symposium on Computer Architecture. Gold Coast, 1992

1991

Technology-organization tradeoffs in the architecture of a high-performance processor.

PhD thesis, 1991

The Design of a Microsupercomputer.

Computer, 1991

Implementing a Cache for a High-Performance GaAs Microprocessor.

Richard B. Brown

Proceedings of the 18th Annual International Symposium on Computer Architecture. Toronto, 1991

1990

Hierarchical Gate-Array Routing on a Hypercube Multiprocessor.

Oyekunle A. Olukotun

J. Parallel Distributed Comput., 1990

check Tc and min Tc: Timing Verification and Optimal Clocking of Synchronous Digital Circuits.

Karem A. Sakallah

Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 1990

1987

A Preliminary Investigation into Parallel Routing on a Hypercube Computer.
