Margaret Martonosi

Orcid: 0000-0001-9683-8032

Affiliations:
  • Princeton University, USA


According to our database, Margaret Martonosi authored at least 246 papers between 1989 and 2024.

Awards

ACM Fellow

ACM Fellow 2009, "For contributions in power-aware computing".

IEEE Fellow

IEEE Fellow 2010, "For contributions to power-efficient computer architecture and systems design".

Bibliography

2024
Distributed Quantum Computing via Integrating Quantum and Classical Computing.
Computer, April, 2024

2023
Graphfire: Synergizing Fetch, Insertion, and Replacement Policies for Graph Analytics.
IEEE Trans. Computers, 2023

Muchisim: A Simulation Framework for Design Exploration of Multi-Chip Manycore Systems.
CoRR, 2023

Tascade: Hardware Support for Atomic-free, Asynchronous and Efficient Reduction Trees.
CoRR, 2023

DCRA: A Distributed Chiplet-based Reconfigurable Architecture for Irregular Applications.
CoRR, 2023

Using LLMs to Facilitate Formal Verification of RTL.
CoRR, 2023

Microarchitectures for Heterogeneous Superconducting Quantum Computers.
CoRR, 2023

Massive Data-Centric Parallelism in the Chiplet Era.
CoRR, 2023

SoCurity: A Design Approach for Enhancing SoC Security.
IEEE Comput. Archit. Lett., 2023

Divide and Conquer for Combinatorial Optimization and Distributed Quantum Computation.
Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2023

HetArch: Heterogeneous Microarchitectures for Superconducting Quantum Systems.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

AutoCC: Automatic Discovery of Covert Channels in Time-Shared Hardware.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Architectural Support for Optimizing Huge Page Selection Within the OS.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Dalorex: A Data-Local Program Execution and Architecture for Memory-bound Applications.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

DECADES: A 67mm², 1.46TOPS, 55 Giga Cache-Coherent 64-bit RISC-V Instructions per second, Heterogeneous Manycore SoC with 109 Tiles including Accelerators, Intelligent Storage, and eFPGA in 12nm FinFET.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2023

2022
ScaleQC: A Scalable Framework for Hybrid Computation on Quantum and Classical Processors.
CoRR, 2022

Cutting Quantum Circuits to Run on Quantum and Classical Platforms.
CoRR, 2022

Transforming science through cyberinfrastructure.
Commun. ACM, 2022

Toward systematic architectural design of near-term trapped ion quantum computers.
Commun. ACM, 2022

Margaret Martonosi, National Science Foundation.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

Tiny but mighty: designing and realizing scalable latency tolerance for manycore SoCs.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

The Implications of Page Size Management on Graph Analytics.
Proceedings of the IEEE International Symposium on Workload Characterization, 2022

SupermarQ: A Scalable Quantum Benchmark Suite.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
GraphAttack: Optimizing Data Supply for Graph Applications on In-Order Multicore Architectures.
ACM Trans. Archit. Code Optim., 2021

Specifying and testing GPU workgroup progress models.
Proc. ACM Program. Lang., 2021

Quantum Codesign.
IEEE Micro, 2021

Navigating the Seismic Shift of Post-Moore Computer Systems Design.
IEEE Micro, 2021

SPAA'21 Panel Paper: Architecture-Friendly Algorithms versus Algorithm-Friendly Architectures.
Proceedings of the SPAA '21: 33rd ACM Symposium on Parallelism in Algorithms and Architectures, 2021

Designing Calibration and Expressivity-Efficient Instruction Sets for Quantum Computing.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Optimized Quantum Program Execution Ordering to Mitigate Errors in Simulations of Quantum Systems.
Proceedings of the 2021 International Conference on Rebooting Computing (ICRC), Los Alamitos, CA, USA, November 30, 2021

AutoSVA: Democratizing Formal Verification of RTL Module Interactions.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

CutQC: using small Quantum computers for large Quantum circuit evaluations.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

Logical abstractions for noisy variational Quantum algorithm simulation.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020
Resource-Efficient Quantum Computing by Breaking Abstractions.
Proc. IEEE, 2020

Foundations of empirical memory consistency testing.
Proc. ACM Program. Lang., 2020

Architecting Noisy Intermediate-Scale Quantum Computers: A Real-System Study.
IEEE Micro, 2020

Optimizing IoT and Web Traffic Using Selective Edge Compression.
CoRR, 2020

The MosaicSim Simulator (Full Technical Report).
CoRR, 2020

RealityCheck: Bringing Modularity, Hierarchy, and Abstraction to Automated Microarchitectural Memory Consistency Verification.
CoRR, 2020

Optimization of Simultaneous Measurement for Variational Quantum Eigensolver Applications.
Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2020

PerpLE: Improving the Speed and Effectiveness of Memory Consistency Testing.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

MosaicSim: A Lightweight, Modular Simulator for Heterogeneous Systems.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Architecting Noisy Intermediate-Scale Trapped Ion Quantum Computers.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

TransForm: Formally Specifying Transistency Models and Synthesizing Enhanced Litmus Tests.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

SQUARE: Strategic Quantum Ancilla Reuse for Modular Quantum Programs via Cost-Effective Uncomputation.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

EduPar-20 Invited Panel.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

A Simulator and Compiler Framework for Agile Hardware-Software Co-design Evaluation and Exploration.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

Software Mitigation of Crosstalk on Noisy Intermediate-Scale Quantum Computers.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
Efficient Data Supply for Parallel Heterogeneous Architectures.
ACM Trans. Archit. Code Optim., 2019

Security Verification via Automatic Hardware-Aware Exploit Synthesis: The CheckMate Approach.
IEEE Micro, 2019

Formal constraint-based compilation for noisy intermediate-scale quantum systems.
Microprocess. Microsystems, 2019

Resource optimized quantum architectures for surface code implementations of magic-state distillation.
Microprocess. Microsystems, 2019

Next Steps in Quantum Computing: Computer Science's Role.
CoRR, 2019

Full-stack, real-system quantum computer studies: architectural comparisons and design insights.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

Statistical assertions for validating patterns and finding bugs in quantum programs.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

OKAPI: In Support of Application Correctness in Smart Home Environments.
Proceedings of the Fourth International Conference on Fog and Mobile Edge Computing, 2019

Noise-Adaptive Compiler Mappings for Noisy Intermediate-Scale Quantum Computers.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

2018
2018 ACM SIGMOBILE Rockstar Award: Kyle Jamieson, Princeton University.
GetMobile Mob. Comput. Commun., 2018

Broadening participation: CRA-W.
ACM SIGCSE Bull., 2018

Full-Stack Memory Model Verification with TriCheck.
IEEE Micro, 2018

MeltdownPrime and SpectrePrime: Automatically-Synthesized Attacks Exploiting Invalidation-Based Coherence Protocols.
CoRR, 2018

Science, policy, and service.
Commun. ACM, 2018

Realizing the potential of data science.
Commun. ACM, 2018

New Metrics and Models for a Post-ISA Era: Managing Complexity and Scaling Performance in Heterogeneous Parallelism and Internet-of-Things.
Proceedings of the Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems, 2018

QDB: From Quantum Algorithms Towards Correct Quantum Programs.
Proceedings of the 9th Workshop on Evaluation and Usability of Programming Languages and Tools, 2018

CheckMate: Automated Synthesis of Hardware Exploits and Security Litmus Tests.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

PipeProof: Automated Memory Consistency Proofs for Microarchitectural Specifications.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Magic-State Functional Units: Mapping and Scheduling Multi-Level Distillation Circuits for Fault-Tolerant Quantum Architectures.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

ILA-MCM: Integrating Memory Consistency Models with Instruction-Level Abstractions for Heterogeneous System-on-Chip Verification.
Proceedings of the 2018 Formal Methods in Computer Aided Design, 2018

2017
Decoupling Data Supply from Computation for Latency-Tolerant Communication in Heterogeneous Architectures.
ACM Trans. Archit. Code Optim., 2017

Programming languages and compiler design for realistic quantum hardware.
Nat., 2017

2016 Maurice Wilkes Award Given to Timothy Sherwood.
IEEE Micro, 2017

Transistency Models: Memory Ordering at the Hardware-OS Interface.
IEEE Micro, 2017

PPU: A Control Error-Tolerant Processor for Streaming Applications with Formal Guarantees.
ACM J. Emerg. Technol. Comput. Syst., 2017

Locomotive: Optimizing mobile web traffic using selective compression.
Proceedings of the 18th IEEE International Symposium on A World of Wireless, 2017

RTLcheck: verifying the memory consistency of RTL designs.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

Optimized surface code communication in superconducting quantum computers.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
MOBILE SENSING: Retrospectives and Trends.
GetMobile Mob. Comput. Commun., 2016

Exploring the Trisection of Software, Hardware, and ISA in Memory Model Design.
CoRR, 2016

Counterexamples and Proof Loophole for the C/C++ to POWER and ARMv7 Trailing-Sync Compiler Mappings.
CoRR, 2016

21st Century Computer Architecture.
CoRR, 2016

Keynotes: Internet of Things: History and hype, technology and policy.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Graphicionado: A high-performance and energy-efficient accelerator for graph analytics.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

COATCheck: Verifying Memory Ordering at the Hardware-OS Interface.
Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems, 2016

2015
GPU Performance and Power Tuning Using Regression Trees.
ACM Trans. Archit. Code Optim., 2015

ScaffCC: Scalable compilation and analysis of quantum programs.
Parallel Comput., 2015

Verifying Correct Microarchitectural Enforcement of Memory Consistency Models.
IEEE Micro, 2015

CCICheck: using µhb graphs to verify the coherence-consistency interface.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

DeSC: decoupled supply-compute communication management for heterogeneous architectures.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Characterization and cross-platform analysis of high-throughput accelerators.
Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

ArMOR: defending against memory consistency model mismatches in heterogeneous architectures.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Dynamic adaptive techniques for learning application delay tolerance for mobile data offloading.
Proceedings of the 2015 IEEE Conference on Computer Communications, 2015

CommGuard: Mitigating Communication Errors in Error-Prone Parallel Execution.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

Compiler Management of Communication and Parallelism for Quantum Computation.
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015

2014
Power-Efficient Computer Architectures: Recent Advances
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01745-2, 2014

2013 International Symposium on Computer Architecture Influential Paper Award.
IEEE Micro, 2014

Adaptive delay-tolerant scheduling for efficient cellular and WiFi usage.
Proceedings of the IEEE International Symposium on a World of Wireless, 2014

Pipe Check: Specifying and Verifying Microarchitectural Enforcement of Memory Consistency Models.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Characterizing the performance effect of trials and rotations in applications that use Quantum Phase Estimation.
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

MRPB: Memory request prioritization for massively parallel processors.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

ScaffCC: a framework for compilation and analysis of quantum computing programs.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2013
TLB Improvements for Chip Multiprocessors: Inter-Core Cooperative Prefetchers and Shared Last-Level TLBs.
ACM Trans. Archit. Code Optim., 2013

Human mobility characterization from cellular network data.
Commun. ACM, 2013

Reducing GPU offload latency via fine-grained CPU-GPU synchronization.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

Extracting useful computation from error-prone processors for streaming applications.
Proceedings of the Design, Automation and Test in Europe, 2013

DP-WHERE: Differentially private modeling of human mobility.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

Starchart: Hardware and software optimization using recursive partitioning regression trees.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Leveraging Smartphone Cameras for Collaborative Road Advisories.
IEEE Trans. Mob. Comput., 2012

Human mobility modeling at metropolitan scales.
Proceedings of the 10th International Conference on Mobile Systems, 2012

Adaptive usage of cellular and WiFi bandwidth: an optimal scheduling formulation.
Proceedings of the seventh ACM international workshop on Challenged networks, 2012

Keynote: Parallelism, heterogeneity, communication: Emerging challenges for performance analysis.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

Stargazer: Automated regression-based GPU design space exploration.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012

Optimizing the use of request distribution and stored energy for cost reduction in multi-site internet services.
Proceedings of the Sustainable Internet and ICT for Sustainability, 2012

Characterizing and improving the use of demand-fetched caches in GPUs.
Proceedings of the International Conference on Supercomputing, 2012

EPROF: An energy/performance/reliability optimization framework for streaming applications.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

2011
Adaptive timekeeping replacement: Fine-grained capacity management for shared CMP caches.
ACM Trans. Archit. Code Optim., 2011

Parallelization libraries: Characterizing and reducing overheads.
ACM Trans. Archit. Code Optim., 2011

Low-infrastructure methods to improve internet access for mobile users in emerging regions.
Proceedings of the 20th International Conference on World Wide Web, 2011

Distributed rating prediction in user generated content streams.
Proceedings of the 2011 ACM Conference on Recommender Systems, 2011

Identifying Important Places in People's Lives from Cellular Network Data.
Proceedings of the Pervasive Computing - 9th International Conference, 2011

RegReS: Adaptively maintaining a target density of regional services in opportunistic vehicular networks.
Proceedings of the Ninth Annual IEEE International Conference on Pervasive Computing and Communications, 2011

Ranges of human mobility in Los Angeles and New York.
Proceedings of the Ninth Annual IEEE International Conference on Pervasive Computing and Communications, 2011

Demo: SignalGuru: leveraging mobile phones for collaborative traffic signal schedule advisory.
Proceedings of the 9th International Conference on Mobile Systems, 2011

SignalGuru: leveraging mobile phones for collaborative traffic signal schedule advisory.
Proceedings of the 9th International Conference on Mobile Systems, 2011

PACMan: prefetch-aware cache management for high performance caching.
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

SHiP: signature-based hit predictor for high performance caching.
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, 2011

Characterization and dynamic mitigation of intra-application cache interference.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2011

Shared last-level TLBs for chip multiprocessors.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Distributed collaborative filtering over social networks.
Proceedings of the 49th Annual Allerton Conference on Communication, 2011

2010
CA-TSL: Energy Adaptation for Targeted System Lifetime in Sparse Mobile Ad Hoc Networks.
IEEE Trans. Mob. Comput., 2010

Managing the cost, energy consumption, and carbon footprint of internet services.
Proceedings of the SIGMETRICS 2010, 2010

Capping the brown energy consumption of Internet services at low cost.
Proceedings of the International Green Computing Conference 2010, 2010

Inter-core cooperative TLB for chip multiprocessors.
Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, 2010

Adaptive spatiotemporal node selection in dynamic networks.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
Repeatable and Realistic Experimentation in Mobile Wireless Networks.
IEEE Trans. Mob. Comput., 2009

Thread criticality predictors for dynamic performance, power, and resource management in chip multiprocessors.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Characterizing the TLB Behavior of Emerging Parallel Workloads on Chip Multiprocessors.
Proceedings of the PACT 2009, 2009

2008
Computer Architecture Techniques for Power-Efficiency
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01721-6, 2008

Location-based trust for mobile user-generated content: applications, challenges and implementations.
Proceedings of the 9th Workshop on Mobile Computing Systems and Applications, 2008

Potential for collaborative caching and prefetching in largely-disconnected villages.
Proceedings of the 2008 ACM Workshop on Wireless Networks and Systems for Developing Regions, 2008

Full-system chip multiprocessor power evaluations using FPGA-based emulation.
Proceedings of the 2008 International Symposium on Low Power Electronics and Design, 2008

LOCALE: Collaborative Localization Estimation for Sparse Mobile Sensor Networks.
Proceedings of the 7th International Conference on Information Processing in Sensor Networks, 2008

Characterizing and improving the performance of Intel Threading Building Blocks.
Proceedings of the 4th International Symposium on Workload Characterization (IISWC 2008), 2008

ZebraNet and beyond: applications and systems support for mobile, dynamic networks.
Proceedings of the 2008 International Conference on Compilers, 2008

2007
The XTREM power and performance simulator for the Intel XScale core: Design and experiences.
ACM Trans. Embed. Comput. Syst., 2007

Predicting link quality using supervised learning in wireless sensor networks.
ACM SIGMOBILE Mob. Comput. Commun. Rev., 2007

Dali: a communication-centric data abstraction layer for energy-constrained devices in mobile sensor networks.
Proceedings of the 5th International Conference on Mobile Systems, 2007

Tailoring quantum architectures to implementation style: a quantum computer for mobile and persistent qubits.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

2006
Dynamic-Compiler-Driven Control for Microprocessor Energy and Performance.
IEEE Micro, 2006

An Efficient, Practical Parallelization Methodology for Multicore Architecture Simulation.
IEEE Comput. Archit. Lett., 2006

Transport layer approaches for improving idle energy in challenged sensor networks.
Proceedings of the 2006 SIGCOMM workshop on Challenged networks, 2006

Energy adaptation techniques to optimize data delivery in store-and-forward sensor networks.
Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, 2006

Data compression algorithms for energy-constrained devices in delay tolerant networks.
Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, 2006

Supervised Learning in Sensor Networks: New Approaches with Routing, Reliability Optimizations.
Proceedings of the Third Annual IEEE Communications Society on Sensor and Ad Hoc Communications and Networks, 2006

A supervised learning approach for routing optimizations in wireless sensor networks.
Proceedings of the 2nd International Workshop on Multi-Hop Ad Hoc Networks: From Theory to Reality, 2006

Middleware for long-term deployment of delay-tolerant sensor networks.
Proceedings of the First International Workshop on Middleware for Sensor Networks, 2006

Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

Situation-Aware Caching Strategies in Highly Varying Mobile Networks.
Proceedings of the 14th International Symposium on Modeling, 2006

Embedded systems in the wild: ZebraNet software, hardware, and deployment experiences.
Proceedings of the 2006 ACM SIGPLAN/SIGBED Conference on Languages, 2006

Power efficiency for variation-tolerant multicore processors.
Proceedings of the 2006 International Symposium on Low Power Electronics and Design, 2006

Techniques for Multicore Thermal Management: Classification and New Exploration.
Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006

Techniques for Real-System Characterization of Java Virtual Machine Energy and Power Behavior.
Proceedings of the 2006 IEEE International Symposium on Workload Characterization, 2006

Phase characterization for power: evaluating control-flow-based and event-counter-based techniques.
Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

2005
Hardware-modulated parallelism in chip multiprocessors.
SIGARCH Comput. Archit. News, 2005

Formal Control Techniques for Power-Performance Management.
IEEE Micro, 2005

Long-Term Workload Phases: Duration Predictions and Applications to DVFS.
IEEE Micro, 2005

Erasure-coding based routing for opportunistic networks.
Proceedings of the 2005 ACM SIGCOMM Workshop on Delay-Tolerant Networking, 2005

A new scheme on link quality prediction and its applications to metric-based routing.
Proceedings of the 3rd International Conference on Embedded Networked Sensor Systems, 2005

A Dynamic Compilation Framework for Controlling Microprocessor Energy and Performance.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005

Bounds on power savings using runtime dynamic voltage scaling: an exact algorithm and a linear-time heuristic approximation.
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005

Coordinated, distributed, formal energy management of chip multiprocessors.
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005

Power prediction for intel XScale processors using performance monitoring unit events.
Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005

Voltage and Frequency Control With Adaptive Reaction Time in Multiple-Clock-Domain Processors.
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

Efficient behavior-driven runtime dynamic voltage scaling policies.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

2004
Intraprogram dynamic voltage scaling: Bounding opportunities with analytic modeling.
ACM Trans. Archit. Code Optim., 2004

Implementing branch-predictor decay using quasi-static memory cells.
ACM Trans. Archit. Code Optim., 2004

MARio: mobility-adaptive routing using route lifetime abstractions in mobile ad hoc networks.
ACM SIGMOBILE Mob. Comput. Commun. Rev., 2004

Power-performance simulation: design and validation strategies.
SIGMETRICS Perform. Evaluation Rev., 2004

Hardware design experiences in ZebraNet.
Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, 2004

Implementing Software on Resource-Constrained Mobile Sensors: Experiences with Impala and ZebraNet.
Proceedings of the Second International Conference on Mobile Systems, 2004

XTREM: a power simulator for the Intel XScale® core.
Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, 2004

Spectral analysis for characterizing program power and performance.
Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software, 2004

Wavelet Analysis for Microprocessor Design: Experiences with Wavelet-Based dI/dt Characterization.
Proceedings of the 10th International Conference on High-Performance Computer Architecture (HPCA-10 2004), 2004

Formal online methods for voltage/frequency control in multiple clock domain microprocessors.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2003
Challenges in Computer Architecture Evaluation.
Computer, 2003

Impala: a middleware system for managing autonomic, parallel sensor systems.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

Compile-time dynamic voltage scaling settings: opportunities and limits.
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, 2003

Runtime Power Monitoring in High-End Processors: Methodology and Empirical Data.
Proceedings of the 36th Annual International Symposium on Microarchitecture, 2003

Control Techniques to Eliminate Voltage Emergencies in High Performance Processors.
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003

TCP: Tag Correlating Prefetchers.
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003

2002
Let caches decay: reducing leakage energy via exploitation of cache generational behavior.
ACM Trans. Comput. Syst., 2002

Implementing Decay Techniques using 4T Quasi-Static Memory Cells.
IEEE Comput. Archit. Lett., 2002

Managing leakage for transient data: decay and quasi-static 4T memory cells.
Proceedings of the 2002 International Symposium on Low Power Electronics and Design, 2002

Timekeeping in the Memory System: Predicting and Optimizing Memory Behavior.
Proceedings of the 29th International Symposium on Computer Architecture (ISCA 2002), 2002

Applying Decay Strategies to Branch Predictors for Leakage Energy Savings.
Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002

Energy-efficient computing for wildlife tracking: design tradeoffs and early experiences with ZebraNet.
Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X), 2002

2001
A Mathematical Cache Miss Analysis for Pointer Data Structures.
Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

Run-time power estimation in high performance microprocessors.
Proceedings of the 2001 International Symposium on Low Power Electronics and Design, 2001

Cache decay: exploiting generational behavior to reduce cache leakage power.
Proceedings of the 28th Annual International Symposium on Computer Architecture, 2001

Dynamic Thermal Management for High-Performance Microprocessors.
Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA'01), 2001

2000
An Edge-endpoint-based Configurable Hardware Architecture for VLSI Layout Design Rule Checking.
VLSI Design, 2000

Value-based clock gating and operation packing: dynamic strategies for improving processor power and performance.
ACM Trans. Comput. Syst., 2000

Accelerating Pipelined Integer and Floating-Point Accumulations in Configurable Hardware with Delayed Addition Techniques.
IEEE Trans. Computers, 2000

Speculative Updates of Local and Global Branch History: A Quantitative Analysis.
J. Instr. Level Parallelism, 2000

Shared-memory multiprocessing: Current state and future directions.
Adv. Comput., 2000

Power-Performance Modeling and Tradeoff Analysis for a High End Microprocessor.
Proceedings of the Power-Aware Computer Systems, First International Workshop, 2000

Wattch: a framework for architectural-level power analysis and optimizations.
Proceedings of the 27th International Symposium on Computer Architecture (ISCA 2000), 2000

Augmenting Modern Superscalar Architectures with Configurable Extended Instructions.
Proceedings of the Parallel and Distributed Processing, 2000

Automated cache optimizations using CME driven diagnosis.
Proceedings of the 14th international conference on Supercomputing, 2000

On Availability of Bit-Narrow Operations in General-Purpose Applications.
Proceedings of the Field-Programmable Logic and Applications, 2000

Limits and Graph Structure of Available Instruction-Level Parallelism (Research Note).
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

A Taxonomy of Branch Mispredictions, and Alloyed Prediction as a Robust Solution to Wrong-History Mispredictions.
Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT'00), 2000

1999
Cache miss equations: a compiler framework for analyzing and tuning memory behavior.
ACM Trans. Program. Lang. Syst., 1999

Using configurable computing to accelerate Boolean satisfiability.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1999

Branch Prediction, Instruction-Window Size, and Cache Size: Performance Trade-Offs and Simulation Techniques.
IEEE Trans. Computers, 1999

Experience with an Adaptive Globally-Synchronizing Clock Algorithm.
Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures, 1999

An Adaptive Globally-Synchronizing Clock Algorithm and its Implementation on a Myrinet-based PC Cluster.
Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1999

Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance.
Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

An Edge-Endpoint-Based Configurable Hardware Architecture for VLSI CAD Layout Design Rule Checking.
Proceedings of the 7th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '99), 1999

Implementing Application-Specific Cache-Coherence Protocols in Configurable Hardware.
Proceedings of the Network-Based Parallel Computing: Communication, 1999

1998
Informing Memory Operations: Memory Performance Feedback Mechanisms and Their Applications.
ACM Trans. Comput. Syst., 1998

Adaptive parallelism in compiler-parallelized code.
Concurr. Pract. Exp., 1998

Performance monitoring in a Myrinet-connected SHRIMP cluster.
Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools, 1998

Improving Prediction for Procedure Returns with Return-address-stack Repair Mechanisms.
Proceedings of the 31st Annual IEEE/ACM International Symposium on Microarchitecture, 1998

Design Choices in the SHRIMP System: An Empirical Study.
Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998

Monitoring Shared Virtual Memory Performance on a Myrinet-based PC Cluster.
Proceedings of the 12th international conference on Supercomputing, 1998

Multipath Execution: Opportunities and Limits.
Proceedings of the 12th international conference on Supercomputing, 1998

Solving Boolean Satisfiability with Dynamic Hardware Configurations.
Proceedings of the Field-Programmable Logic and Applications, 1998

Accelerating Boolean Satisfiability with Configurable Hardware.
Proceedings of the 6th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '98), 1998

Using Reconfigurable Computing Techniques to Accelerate Problems in the CAD Domain: A Case Study with Boolean Satisfiability.
Proceedings of the 35th Conference on Design Automation, 1998

Precise Miss Analysis for Program Transformations with Caches of Arbitrary Associativity.
Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), 1998

1997
Trends in Shared Memory Multiprocessing.
Computer, 1997

Cache Miss Equations: An Analytical Representation of Cache Misses.
Proceedings of the 11th international conference on Supercomputing, 1997

Static Timing Analysis of Embedded Software.
Proceedings of the 34th Conference on Design Automation, 1997

1996
Characterizing the Memory Behavior of Compiler-Parallelized Applications.
IEEE Trans. Parallel Distributed Syst., 1996

Memory Referencing Behavior in Compiler-Parallelized Applications.
Int. J. Parallel Program., 1996

The SHRIMP performance monitor: design and applications.
Proceedings of the SIGMETRICS symposium on Parallel and distributed tools, 1996

Integrating Performance Monitoring and Communication in Parallel Computers.
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1996

Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors.
Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

1995
Tuning Memory Performance of Sequential and Parallel Programs.
Computer, 1995

Evaluating the impact of advanced memory systems on compiler-parallelized codes.
Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, 1995

1993
Effectiveness of Trace Sampling for Performance Debugging Tools.
Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems, 1993

1992
MemSpy: Analyzing Memory System Bottlenecks in Programs.
Proceedings of the 1992 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, 1992

1989
Tradeoffs in Message Passing and Shared Memory Implementations of a Standard Cell Router.
Proceedings of the International Conference on Parallel Processing, 1989

