Mark Horowitz

Affiliations:
  • Stanford University, Department of Electrical Engineering, CA, USA


According to our database, Mark Horowitz authored at least 250 papers between 1983 and 2024.

Awards

ACM Fellow

ACM Fellow 2003, "For contributions to multiprocessor architecture".

IEEE Fellow

IEEE Fellow 2000, "For contributions to the design of high-speed digital integrated circuits and systems".

Bibliography

2024
Amber: A 16-nm System-on-Chip With a Coarse-Grained Reconfigurable Array for Flexible Acceleration of Dense Linear Algebra.
IEEE J. Solid State Circuits, March, 2024

2023
Improving Energy Efficiency of CGRAs with Low-Overhead Fine-Grained Power Domains.
ACM Trans. Reconfigurable Technol. Syst., June, 2023

Unified Buffer: Compiling Image Processing and Machine Learning Applications to Push-Memory Accelerators.
ACM Trans. Archit. Code Optim., June, 2023

AHA: An Agile Approach to the Design of Coarse-Grained Reconfigurable Accelerators and Compilers.
ACM Trans. Embed. Comput. Syst., March, 2023

Higher education's influence on social networks and entrepreneurship in Brazil.
Soc. Netw. Anal. Min., 2023

Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network.
CoRR, 2023

Hardware Abstractions and Hardware Mechanisms to Support Multi-Task Execution on Coarse-Grained Reconfigurable Arrays.
CoRR, 2023

Canal: A Flexible Interconnect Generator for Coarse-Grained Reconfigurable Arrays.
IEEE Comput. Archit. Lett., 2023

APEX: A Framework for Automated Processing Element Design Space Exploration using Frequent Subgraph Analysis.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

The Sparse Abstract Machine.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion.
IACR Trans. Cryptogr. Hardw. Embed. Syst., 2022

An Open-Source Framework for FPGA Emulation of Analog/Mixed-Signal Integrated Circuit Designs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Enabling and Accelerating Dynamic Vision Transformer Inference for Real-Time Applications.
CoRR, 2022

Cascade: An Application Pipelining Toolkit for Coarse-Grained Reconfigurable Arrays.
CoRR, 2022

Amber: A 367 GOPS, 538 GOPS/W 16nm SoC with a Coarse-Grained Reconfigurable Array for Flexible Acceleration of Dense Linear Algebra.
Proceedings of the IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits 2022), 2022


Bringing source-level debugging frameworks to hardware generators.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

mflowgen: a modular flow generator and ecosystem for community-driven physical design: invited.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Fast Extended GCD Calculation for Large Integers for Verifiable Delay Functions.
IACR Cryptol. ePrint Arch., 2021

Enabling Reusable Physical Design Flows with Modular Flow Generators.
CoRR, 2021

Compiling Halide Programs to Push-Memory Accelerators.
CoRR, 2021

Automated Design Space Exploration of CGRA Processing Element Architectures using Frequent Subgraph Analysis.
CoRR, 2021

Automating System Configuration.
Proceedings of the Formal Methods in Computer Aided Design, 2021

2020
Open-Source Synthesizable Analog Blocks for High-Speed Link Designs: 20-GS/s 5b ENOB Analog-to-Digital Converter and 5-GHz Phase Interpolator.
Proceedings of the IEEE Symposium on VLSI Circuits, 2020

SegAlign: a scalable GPU-based whole genome aligner.
Proceedings of the International Conference for High Performance Computing, 2020

A Framework for Adding Low-Overhead, Fine-Grained Power Domains to CGRAs.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020


fault: A Python Embedded Domain-Specific Language for Metaprogramming Portable Hardware Verification Components.
Proceedings of the Computer Aided Verification - 32nd International Conference, 2020

Interstellar: Using Halide's Scheduling Language to Analyze DNN Accelerators.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
An Analog Model Template Library: Simplifying Chip-Level, Mixed-Signal Design Verification.
IEEE Trans. Very Large Scale Integr. Syst., 2019

StartupBR: Higher Education's Influence on Social Networks and Entrepreneurship in Brazil.
CoRR, 2019

Falcon - A Flexible Architecture For Accelerating Cryptography.
Proceedings of the 16th IEEE International Conference on Mobile Ad Hoc and Sensor Systems, 2019

Dataset Culling: Towards Efficient Training of Distillation-Based Domain Specific Models.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

TANGRAM: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
Compiling Algorithms for Heterogeneous Systems.
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01758-2, 2018

Mapping Histological Slice Sequences to the Allen Mouse Brain Atlas Without 3D Reconstruction.
Frontiers Neuroinformatics, 2018

Training Domain Specific Models for Energy-Efficient Object Detection.
CoRR, 2018

DNN Dataflow Choice Is Overrated.
CoRR, 2018

Tethys: Collecting Sensor Data without Infrastructure or Trust.
Proceedings of the 2018 IEEE/ACM Third International Conference on Internet-of-Things Design and Implementation, 2018

Fast FPGA emulation of analog dynamics in digitally-driven systems.
Proceedings of the International Conference on Computer-Aided Design, 2018

2017
Volumetric Image Registration From Invariant Keypoints.
IEEE Trans. Image Process., 2017

Programming Heterogeneous Systems from an Image Processing DSL.
ACM Trans. Archit. Code Optim., 2017

Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era.
IEEE Des. Test, 2017

TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
Rigel: flexible multi-rate image processing hardware.
ACM Trans. Graph., 2016

Error Control and Limit Cycle Elimination in Event-Driven Piecewise Linear Analog Functional Models.
IEEE Trans. Circuits Syst. I Regul. Pap., 2016

Tomographic Reconstruction and Alignment Using Matrix Norm Minimization.
IEEE J. Sel. Top. Signal Process., 2016

A Systematic Approach to Blocking Convolutional Neural Networks.
CoRR, 2016

FPMax: a 106GFLOPS/W at 217GFLOPS/mm2 Single-Precision FPU, and a 43.7GFLOPS/W at 74.6GFLOPS/mm2 Double-Precision FPU, in 28nm UTBB FDSOI.
CoRR, 2016

A 220pJ/pixel/frame CMOS image sensor with partial settling readout architecture.
Proceedings of the 2016 IEEE Symposium on VLSI Circuits, 2016

Evaluating programmable architectures for imaging and vision applications.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Improving energy efficiency of DRAM by exploiting half page row access.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

EIE: Efficient Inference Engine on Compressed Deep Neural Network.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Deep compression and EIE: Efficient inference engine on compressed deep neural network.
Proceedings of the 2016 IEEE Hot Chips 28 Symposium (HCS), 2016

CESEL: Securing a Mote for 20 Years.
Proceedings of the International Conference on Embedded Wireless Systems and Networks, 2016

2015
Building Conflict-Free FFT Schedules.
IEEE Trans. Circuits Syst. I Regul. Pap., 2015

Digital Analog Design: Enabling Mixed-Signal System Validation.
IEEE Des. Test, 2015

Convolution engine: balancing efficiency and flexibility in specialized computing.
Commun. ACM, 2015

Demo: Tethys - An Energy Harvesting Networked Water Flow Sensor.
Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, 2015

Scale- and orientation-invariant keypoints in higher-dimensional data.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

2014
Darkroom: compiling high-level image processing code into hardware pipelines.
ACM Trans. Graph., 2014

A Verilog Piecewise-Linear Analog Behavior Model for Mixed-Signal Validation.
IEEE Trans. Circuits Syst. I Regul. Pap., 2014

1.1 Computing's energy problem (and what we can do about it).
Proceedings of the 2014 IEEE International Conference on Solid-State Circuits Conference, 2014

2013
An area-efficient minimum-time FFT schedule using single-ported memory.
Proceedings of the 21st IEEE/IFIP International Conference on VLSI and System-on-Chip, 2013

Forwarding metamorphosis: fast programmable match-action processing in hardware for SDN.
Proceedings of the ACM SIGCOMM 2013 Conference, 2013

Convolution engine: balancing efficiency & flexibility in specialized computing.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

FPU Generator for Design Space Exploration.
Proceedings of the 21st IEEE Symposium on Computer Arithmetic, 2013

Design principles for packet parsers.
Proceedings of the Symposium on Architecture for Networking and Communications Systems, 2013

2012
CMOS Image Sensors With Multi-Bucket Pixels for Computational Photography.
IEEE J. Solid State Circuits, 2012

Bringing up a chip on the cheap.
IEEE Des. Test, 2012

CPU DB: recording microprocessor history.
Commun. ACM, 2012

The Frankencamera: an experimental platform for computational photography.
Commun. ACM, 2012

A 3-stage Pseudo Single-phase Flip-flop family.
Proceedings of the Symposium on VLSI Circuits, 2012

Rethinking DRAM Power Modes for Energy Proportionality.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

Towards energy-proportional datacenter memory with mobile DRAM.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

Sparse matrix-vector multiply on the HICAMP architecture.
Proceedings of the International Conference on Supercomputing, 2012

Avoiding game over: bringing design to the next level.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

Removing overhead from high-level interfaces.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

Design Automation Framework for Application-Specific Logic-in-Memory Blocks.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

2011
Energy-Efficient Floating-Point Unit Design.
IEEE Trans. Computers, 2011

Understanding sources of inefficiency in general-purpose chips.
Commun. ACM, 2011

Beyond the horizon: The next 10x reduction in power - Challenges and solutions.
Proceedings of the IEEE International Solid-State Circuits Conference, 2011

Intermediate representations for controllers in chip generators.
Proceedings of the Design, Automation and Test in Europe, 2011

Global convergence analysis of mixed-signal systems.
Proceedings of the 48th Design Automation Conference, 2011

Joint DAC/IWBDA special session design and synthesis of biological circuits.
Proceedings of the 48th Design Automation Conference, 2011

Latency Sensitive FMA Design.
Proceedings of the 20th IEEE Symposium on Computer Arithmetic, 2011

2010
Fast, Non-Monte-Carlo Estimation of Transient Performance Variation Due to Device Mismatch.
IEEE Trans. Circuits Syst. I Regul. Pap., 2010

Rethinking Digital Design: Why Design Must Change.
IEEE Micro, 2010

Understanding sources of inefficiency in general-purpose chips.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Energy-performance tradeoffs in processor architecture and circuit design: a marginal cost analysis.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Intent-leveraged optimization of analog circuits via homotopy.
Proceedings of the Design, Automation and Test in Europe, 2010

Why design must change: Rethinking digital design.
Proceedings of the Design, Automation and Test in Europe, 2010

An integrated framework for joint design space exploration of microarchitecture and circuits.
Proceedings of the Design, Automation and Test in Europe, 2010

An efficient test vector generation for checking analog/mixed-signal functional models.
Proceedings of the 47th Design Automation Conference, 2010

Fortifying analog models with equivalence checking and coverage analysis.
Proceedings of the 47th Design Automation Conference, 2010

2009
Area-efficiency in CMP core design: co-optimization of microarchitecture and physical design.
SIGARCH Comput. Archit. News, 2009

Energy-Performance Tunable Logic.
IEEE J. Solid State Circuits, 2009

Towards an explanatory and computational theory of scientific discovery.
J. Informetrics, 2009

Using a configurable processor generator for computer architecture prototyping.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

A memory system design framework: creating smart memories.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

In field, energy-performance tunable FPGA architectures.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Stochastic steady-state and AC analyses of mixed-signal systems.
Proceedings of the 46th Design Automation Conference, 2009

Energy-performance tunable logic.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2009

Leveraging designer's intent: A path toward simpler analog CAD tools.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2009

2008
Comparative evaluation of memory models for chip multiprocessors.
ACM Trans. Archit. Code Optim., 2008

A 90 nm CMOS 16 Gb/s Transceiver for Optical Interconnects.
IEEE J. Solid State Circuits, 2008

Digital Circuit Design Trends.
IEEE J. Solid State Circuits, 2008

A 24 Gb/s Software Programmable Analog Multi-Tone Transmitter.
IEEE J. Solid State Circuits, 2008

Integrated Regulation for Energy-Efficient Digital Circuits.
IEEE J. Solid State Circuits, 2008

Verification of chip multiprocessor memory systems using a relaxed scoreboard.
Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41 2008), 2008

Processor Performance Modeling using Symbolic Simulation.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2008

A high-speed, low-power 3D-SRAM architecture.
Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

The case for simple, visible cache coherency.
Proceedings of the 2008 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08), 2008

2007
Veiling glare in high dynamic range imaging.
ACM Trans. Graph., 2007

A 14-mW 6.25-Gb/s Transceiver in 90-nm CMOS.
IEEE J. Solid State Circuits, 2007

Scaling, Power and the Future of CMOS.
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007

A 14mW 6.25Gb/s Transceiver in 90nm CMOS for Serial Chip-to-Chip Communications.
Proceedings of the 2007 IEEE International Solid-State Circuits Conference, 2007

A 90nm CMOS 16Gb/s Transceiver for Optical Interconnects.
Proceedings of the 2007 IEEE International Solid-State Circuits Conference, 2007

Comparing memory systems for chip multiprocessors.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

Variable domain transformation for linear PAC analysis of mixed-signal systems.
Proceedings of the 2007 International Conference on Computer-Aided Design, 2007

Practical Limits of Multi-Tone Signaling Over High-Speed Backplane Electrical Links.
Proceedings of IEEE International Conference on Communications, 2007

A new technique for characterization of digital-to-analog converters in high-speed systems.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

Chip Multi-Processor Generator.
Proceedings of the 44th Design Automation Conference, 2007

Time-Variant Characterization and Compensation of Wideband Circuits.
Proceedings of the IEEE 2007 Custom Integrated Circuits Conference, 2007

Robust Energy-Efficient Adder Topologies.
Proceedings of the 18th IEEE Symposium on Computer Arithmetic (ARITH-18 2007), 2007

2006
Light field microscopy.
ACM Trans. Graph., 2006

The implementation of a 2-core, multi-threaded itanium family processor.
IEEE J. Solid State Circuits, 2006

Replica compensated linear regulators for supply-regulated phase-locked loops.
IEEE J. Solid State Circuits, 2006

Improving CDR Performance via Estimation.
Proceedings of the 2006 IEEE International Solid State Circuits Conference, 2006

Analog Multi-Tone Signaling for High-Speed Backplane Electrical Links.
Proceedings of the Global Telecommunications Conference, 2006. GLOBECOM '06, San Francisco, CA, USA, 27 November, 2006

2005
High performance imaging using large camera arrays.
ACM Trans. Graph., 2005

Dual photography.
ACM Trans. Graph., 2005

False coupling exploration in timing analysis.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2005

On task mapping optimization for parallel decoding of low-density parity-check codes on message-passing architectures.
Parallel Comput., 2005

Autonomous dual-mode (PAM2/4) serial link transceiver with adaptive equalization and data recovery.
IEEE J. Solid State Circuits, 2005

Architecture and circuit techniques for a 1.1-GHz 16-kb reconfigurable memory in 0.18-μm CMOS.
IEEE J. Solid State Circuits, 2005

A 20-Gb/s 0.13-μm CMOS serial link transmitter using an LC-PLL to directly drive the output multiplexer.
IEEE J. Solid State Circuits, 2005

Circuits and techniques for high-resolution measurement of on-chip power supply noise.
IEEE J. Solid State Circuits, 2005

Digital Circuit Optimization via Geometric Programming.
Oper. Res., 2005

A New Method for Design of Robust Digital Circuits.
Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005

Scalable circuits for supply noise measurement.
Proceedings of the 31st European Solid-State Circuits Conference, 2005

Synthetic Aperture Focusing using a Shear-Warp Factorization of the Viewing Transform.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2005

2004
Synthetic aperture confocal imaging.
ACM Trans. Graph., 2004

Methods for true energy-performance optimization.
IEEE J. Solid State Circuits, 2004

Optimal linear precoding with theoretical and practical data rates in high-speed serial-link backplane communication.
Proceedings of IEEE International Conference on Communications, 2004

Multi-tone signaling for high-speed backplane electrical links.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004

Equalization of modal dispersion in multimode fiber using spatial light modulators.
Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004

High-Speed Videography Using a Dense Camera Array.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

The Stream Virtual Machine.
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

2003
Design of CMOS adaptive-bandwidth PLL/DLLs: a general approach.
IEEE Trans. Circuits Syst. II Express Briefs, 2003

Equalization and clock recovery for a 2.5-10-Gb/s 2-PAM/4-PAM backplane transceiver cell.
IEEE J. Solid State Circuits, 2003

A 10-GHz global clock distribution using coupled standing-wave oscillators.
IEEE J. Solid State Circuits, 2003

A 0.4-4-Gb/s CMOS quad transceiver cell using on-chip regulated dual-loop PLLs.
IEEE J. Solid State Circuits, 2003

Specifying and Verifying Hardware for Tamper-Resistant Software.
Proceedings of the 2003 IEEE Symposium on Security and Privacy (S&P 2003), 2003

Implementing an untrusted operating system on trusted hardware.
Proceedings of the 19th ACM Symposium on Operating Systems Principles 2003, 2003

Scaling internet routers using optics.
Proceedings of the ACM SIGCOMM 2003 Conference on Applications, 2003

High-Speed Link Design, Then and Now.
Proceedings of the 21st International Conference on Computer Design (ICCD 2003), 2003

A Framework for Designing Reusable Analog Circuits.
Proceedings of the 2003 International Conference on Computer-Aided Design, 2003

Reshaping EDA for power.
Proceedings of the 40th Design Automation Conference, 2003

Design of a 10GHz clock distribution network using coupled standing-wave oscillators.
Proceedings of the 40th Design Automation Conference, 2003

Modeling and analysis of high-speed links.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2003

2002
High-frequency characterization of on-chip digital interconnects.
IEEE J. Solid State Circuits, 2002

Adaptive supply serial links with sub-1-V operation and per-pin clock recovery.
IEEE J. Solid State Circuits, 2002

An efficient digital sliding controller for adaptive power-supply regulation.
IEEE J. Solid State Circuits, 2002

Methods for true power minimization.
Proceedings of the 2002 IEEE/ACM International Conference on Computer-aided Design, 2002

Transmit pre-emphasis for high-speed time-division-multiplexed serial-link transceiver.
Proceedings of the IEEE International Conference on Communications, 2002

2001
The future of wires.
Proc. IEEE, 2001

A serial-link transceiver based on 8-GSamples/s A/D and D/A converters in 0.25-μm CMOS.
IEEE J. Solid State Circuits, 2001

Fast low-power decoders for RAMs.
IEEE J. Solid State Circuits, 2001

Optimizing the Mapping of Low-Density Parity Check Codes on Parallel Decoding Architectures.
Proceedings of the 2001 International Symposium on Information Technology (ITCC 2001), 2001

Sampling-rate optimization of an interleaved-sampling front-end.
Proceedings of the 2001 International Symposium on Circuits and Systems, 2001

Optimizing iterative decoding of low-density parity check codes on programmable pipelined parallel architectures.
Proceedings of the Global Telecommunications Conference, 2001

Using Texture Mapping with Mipmapping to Render a VLSI Layout.
Proceedings of the 38th Design Automation Conference, 2001

2000
A 2.4 Gb/s/pin simultaneous bidirectional parallel link with per-pin skew compensation.
IEEE J. Solid State Circuits, 2000

A variable-frequency parallel I/O interface with adaptive power-supply regulation.
IEEE J. Solid State Circuits, 2000

A 0.3-μm CMOS 8-Gb/s 4-PAM serial link transceiver.
IEEE J. Solid State Circuits, 2000

Speed and power scaling of SRAM's.
IEEE J. Solid State Circuits, 2000

Smart Memories: a modular reconfigurable architecture.
Proceedings of the 27th International Symposium on Computer Architecture (ISCA 2000), 2000

A 64Mbit Mesochronous Hybrid Wave Pipelined Multibank DRAM Macro.
Proceedings of the Intelligent Memory Systems, Second International Workshop, 2000

Life at the end of CMOS scaling (and beyond) (panel session) (abstract only).
Proceedings of the 37th Conference on Design Automation, 2000

Architectural Support for Copy and Tamper Resistant Software.
Proceedings of ASPLOS-IX, the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, 2000

1999
Timing analysis including clock skew.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1999

A fully digital, energy-efficient, adaptive power-supply regulator.
IEEE J. Solid State Circuits, 1999

A portable digital DLL for high-speed CMOS interface circuits.
IEEE J. Solid State Circuits, 1999

A 0.4-μm CMOS 10-Gb/s 4-PAM pre-emphasis serial link transmitter.
IEEE J. Solid State Circuits, 1999

Interconnect scaling implications for CAD.
Proceedings of the 1999 IEEE/ACM International Conference on Computer-Aided Design, 1999

Improving coverage analysis and test generation for large designs.
Proceedings of the 1999 IEEE/ACM International Conference on Computer-Aided Design, 1999

Using Partitioning to Help Convergence in the Standard-Cell Design Automation Methodology.
Proceedings of the 36th Conference on Design Automation, 1999

Vex - A CAD Toolbox.
Proceedings of the 36th Conference on Design Automation, 1999

1998
Informing Memory Operations: Memory Performance Feedback Mechanisms and Their Applications.
ACM Trans. Comput. Syst., 1998

High-speed electrical signaling: overview and limitations.
IEEE Micro, 1998

A 0.5-μm CMOS 4.0-Gbit/s serial link transceiver with data recovery using oversampling.
IEEE J. Solid State Circuits, 1998

Low-power dividerless frequency synthesis using aperture phase detection.
IEEE J. Solid State Circuits, 1998

Low-power SRAM design using half-swing pulse-mode techniques.
IEEE J. Solid State Circuits, 1998

A replica technique for wordline and sense control in low-power SRAM's.
IEEE J. Solid State Circuits, 1998

The Stanford FLASH Multiprocessor.
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers), 1998

Approximate Reachability with BDDs Using Overlapping Projections.
Proceedings of the 35th Conference on Design Automation, 1998

1997
Hardware/software co-design of the Stanford FLASH multiprocessor.
Proc. IEEE, 1997

Tiny Tera: a packet switch core.
IEEE Micro, 1997

A semidigital dual delay-locked loop.
IEEE J. Solid State Circuits, 1997

A 700-Mb/s/pin CMOS signaling interface using current integrating receivers.
IEEE J. Solid State Circuits, 1997

Circuit techniques for 1.5-V power supply flash memory.
IEEE J. Solid State Circuits, 1997

Skew-tolerant domino circuits.
IEEE J. Solid State Circuits, 1997

Supply and threshold voltage scaling for low power CMOS.
IEEE J. Solid State Circuits, 1997

Hardware Fault Containment in Scalable Shared-Memory Multiprocessors.
Proceedings of the 24th International Symposium on Computer Architecture, 1997

SRT Division Architectures and Implementations.
Proceedings of the 13th Symposium on Computer Arithmetic (ARITH-13 '97), 1997

1996
A 0.8-μm CMOS 2.5 Gb/s oversampling receiver and transmitter for serial links.
IEEE J. Solid State Circuits, 1996

Energy dissipation in general purpose microprocessors.
IEEE J. Solid State Circuits, 1996

A low power switching power supply for self-clocked systems.
Proceedings of the 1996 International Symposium on Low Power Electronics and Design, 1996

Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors.
Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

Validation coverage analysis for complex digital designs.
Proceedings of the 1996 IEEE/ACM International Conference on Computer-Aided Design, 1996

1995
Regenerative feedback repeaters for programmable interconnections.
IEEE J. Solid State Circuits, November, 1995

Clustered voltage scaling technique for low-power design.
Proceedings of the 1995 International Symposium on Low Power Design 1995, 1995

Architecture Validation for Processors.
Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995

Array-of-arrays architecture for parallel floating point multiplication.
Proceedings of the 16th Conference on Advanced Research in VLSI (ARVLSI '95), 1995

1994
Self-timed logic using Current-Sensing Completion Detection (CSCD).
J. VLSI Signal Process., 1994

Timing analysis for piecewise linear Rsim.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1994

Eliminating redundant DC equations for asymptotic waveform evaluation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1994

Techniques for Characterizing DRAMs With a 500-MHz Interface.
Proceedings of the IEEE International Test Conference 1994, 1994

Interleaving: A Multithreading Technique Targeting Multiprocessors and Workstations.
Proceedings of the ASPLOS-VI Proceedings, 1994

The Performance Impact of Flexibility in the Stanford FLASH Multiprocessor.
Proceedings of the ASPLOS-VI Proceedings, 1994

Architectural and Implementation Tradeoffs in the Design of Multiple-Context Processors.
Proceedings of the Multithreaded Computer Architecture, 1994

1993
The design of a high-performance cache controller: a case study in asynchronous synthesis.
Integr., 1993

Piecewise linear models for Rsim.
Proceedings of the 1993 IEEE/ACM International Conference on Computer-Aided Design, 1993

1992
The Stanford Dash Multiprocessor.
Computer, 1992

Architectural and implementation tradeoffs in the design of multiple-context processors.
Proceedings of the 19th Annual International Symposium on Computer Architecture. Gold Coast, 1992

Efficient Superscalar Performance Through Boosting.
Proceedings of the ASPLOS-V Proceedings, 1992

1991
Modeling the Performance of Limited Pointers Directories for Cache Coherence.
Proceedings of the 18th Annual International Symposium on Computer Architecture. Toronto, 1991

A 160 ns 54 bit CMOS division implementation using self-timing and symmetrically overlapped SRT stages.
Proceedings of the 10th IEEE Symposium on Computer Arithmetic, 1991

1990
Techniques for calculating currents and voltages in VLSI power supply networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1990

Boosting Beyond Static Scheduling in a Superscalar Processor.
Proceedings of the 17th Annual International Symposium on Computer Architecture, 1990

Design of scalable shared-memory multiprocessors: the DASH approach.
Proceedings of the Intellectual Leverage: Thirty-Fifth IEEE Computer Society International Conference, 1990

1989
An Analytical Cache Model.
ACM Trans. Comput. Syst., 1989

Characteristics of Performance-Optimal Multi-Level Cache Hierarchies.
Proceedings of the 16th Annual International Symposium on Computer Architecture. Jerusalem, 1989

IRSIM: An Incremental MOS Switch-Level Simulator.
Proceedings of the 26th ACM/IEEE Design Automation Conference, 1989

Limits on Multiple Instruction Issue.
Proceedings of the ASPLOS-III Proceedings, 1989

Rounding algorithms for IEEE multipliers.
Proceedings of the 9th Symposium on Computer Arithmetic, 1989

1988
Cache Performance of Operating System and Multiprogramming Workloads.
ACM Trans. Comput. Syst., 1988

Generalization in digital functions.
Neural Networks, 1988

Performance Tradeoffs in Cache Design.
Proceedings of the 15th Annual International Symposium on Computer Architecture, 1988

An Evaluation of Directory Schemes for Cache Coherence.
Proceedings of the 15th Annual International Symposium on Computer Architecture, 1988

Bisim: a simulator for custom ECL circuits.
Proceedings of the 1988 IEEE International Conference on Computer-Aided Design, 1988

Analyzing CMOS Power Supply Networks Using Ariel.
Proceedings of the 25th ACM/IEEE Conference on Design Automation, 1988

1987
Charge-Sharing Models for Switch-Level Simulation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1987

Architectural Tradeoffs in the Design of MIPS-X.
Proceedings of the 14th Annual International Symposium on Computer Architecture. Pittsburgh, 1987

RED: Resistance Extraction for Digital Simulation.
Proceedings of the 24th ACM/IEEE Design Automation Conference. Miami Beach, FL, USA, June 28, 1987

Generating Incremental VLSI Compaction Spacing Constraints.
Proceedings of the 24th ACM/IEEE Design Automation Conference. Miami Beach, FL, USA, June 28, 1987

1986
ATUM: A New Technique for Capturing Address Traces Using Microcode.
Proceedings of the 13th Annual Symposium on Computer Architecture, Tokyo, Japan, June 1986, 1986

1983
Signal Delay in RC Tree Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1983

Resistance Extraction from Mask Layout Data.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1983

