Martin Schulz

According to our database, Martin Schulz
  • authored at least 218 papers between 1997 and 2017.
  • has a "Dijkstra number" of three.

Bibliography

2017
ScrubJay: deriving knowledge from the disarray of HPC performance data.
Proceedings of the International Conference for High Performance Computing, 2017

REFINE: realistic fault injection via compiler-based instrumentation for accuracy, portability and speed.
Proceedings of the International Conference for High Performance Computing, 2017

Simulating Power Scheduling at Scale.
Proceedings of the 5th International Workshop on Energy Efficient Supercomputing, 2017

Noise Injection Techniques to Expose Subtle and Unintended Message Races.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

OpenMP Tools Interface: Synchronization Information for Data Race Detection.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Production Hardware Overprovisioning: Real-World Performance Optimization Using an Extensible Power-Aware Resource Management Framework.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Power Aware High Performance Computing: Challenges and Opportunities for Application and System Developers - Survey & Tutorial.
Proceedings of the 2017 International Conference on High Performance Computing & Simulation, 2017

Accelerating Big Data Infrastructure and Applications (Ongoing Collaboration).
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems Workshops, 2017

Understanding the Spatial Characteristics of DRAM Errors in HPC Clusters.
Proceedings of the ACM Workshop on Fault-Tolerance for HPC at Extreme Scale, 2017

Flexible Data Aggregation for Performance Profiling.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016
Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2.
IEEE Trans. Parallel Distrib. Syst., 2016

Ordering Traces Logically to Identify Lateness in Message Passing Programs.
IEEE Trans. Parallel Distrib. Syst., 2016

Evaluating and extending user-level fault tolerance in MPI applications.
IJHPCA, 2016

Exploring the MPI tool information interface: features and capabilities.
IJHPCA, 2016

Development effort estimation in HPC.
Proceedings of the International Conference for High Performance Computing, 2016

Economic Viability of Hardware Overprovisioning in Power-Constrained High Performance Computing.
Proceedings of the 4th International Workshop on Energy Efficient Supercomputing, 2016

VIPACT: A Visualization Interface for Analyzing Calling Context Trees.
Proceedings of the Third Workshop on Visual Performance Analysis, 2016

Pinpointing scale-dependent integer overflow bugs in large-scale parallel applications.
Proceedings of the International Conference for High Performance Computing, 2016

A machine learning framework for performance coverage analysis of proxy applications.
Proceedings of the International Conference for High Performance Computing, 2016

A Performance Model for Allocating the Parallelism in a Multigrid-in-Time Solver.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

A Unified Platform for Exploring Power Management Strategies.
Proceedings of the 4th International Workshop on Energy Efficient Supercomputing, 2016

Caliper: performance introspection for HPC software stacks.
Proceedings of the International Conference for High Performance Computing, 2016

Allowing MPI tools builders to forget about Fortran.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

MPI Sessions: Leveraging Runtime Infrastructure to Increase Scalability of Applications at Exascale.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Testing Infrastructure for OpenMP Debugging Interface Implementations.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Structural Clustering: A New Approach to Support Performance Analysis at Scale.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

I/O Aware Power Shifting.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

MPMD Framework for Offloading Load Balance Computation.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Power Balancing in an Emulated Exascale Environment.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Systemwide Power Management with Argo.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

ARCHER: Effectively Spotting Data Races in Large OpenMP Applications.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Runtime-Guided Mitigation of Manufacturing Variability in Power-Constrained Multi-Socket NUMA Nodes.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Fast Multi-parameter Performance Modeling.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

IPAS: intelligent protection against silent output corruption in scientific applications.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

2015
Connecting Performance Analysis and Visualization (Dagstuhl Perspectives Workshop 14022).
Dagstuhl Manifestos, 2015

Debugging high-performance computing applications at massive scales.
Commun. ACM, 2015

A Run-Time System for Power-Constrained HPC Applications.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Clock delta compression for scalable order-replay of non-deterministic parallel applications.
Proceedings of the International Conference for High Performance Computing, 2015

Recovering logical structure from Charm++ event traces.
Proceedings of the International Conference for High Performance Computing, 2015

Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing.
Proceedings of the International Conference for High Performance Computing, 2015

Dynamic power sharing for higher job throughput.
Proceedings of the International Conference for High Performance Computing, 2015

Finding the limits of power-constrained application performance.
Proceedings of the International Conference for High Performance Computing, 2015

Decoupled load balancing.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Lessons Learned from Implementing OMPD: A Debugging Interface for OpenMP.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Predicting Optimal Power Allocation for CPU and DRAM Domains.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

A Scalable Prescriptive Parallel Debugging Model.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Identifying the Culprits Behind Network Congestion.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Practical Resource Management in Power-Constrained, High Performance Computing.
Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015

POW: System-wide Dynamic Reallocation of Limited Power in HPC.
Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015

Event-Action Mappings for Parallel Tools Infrastructures.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

Distributed Monitoring and Management of Exascale Systems in the Argo Project.
Proceedings of the Distributed Applications and Interoperable Systems, 2015

An Approach to Selecting Thread + Process Mixes for Hybrid MPI + OpenMP Applications.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014
Combing the Communication Hairball: Visualizing Parallel Execution Traces using Logical Time.
IEEE Trans. Vis. Comput. Graph., 2014

Enabling fair pricing on high performance computer systems with node sharing.
Scientific Programming, 2014

Connecting Performance Analysis and Visualization to Advance Extreme Scale Computing (Dagstuhl Perspectives Workshop 14022).
Dagstuhl Reports, 2014

Towards providing low-overhead data race detection for large OpenMP applications.
Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, 2014

Algebraic Multigrid on a Dragonfly Network: First Experiences on a Cray XC30.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

Evaluating User-Level Fault Tolerance for MPI Applications.
Proceedings of the 21st European MPI Users' Group Meeting, 2014

Exploring the Capabilities of the New MPI_T Interface.
Proceedings of the 21st European MPI Users' Group Meeting, 2014

Extracting logical structure and identifying stragglers in parallel execution traces.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

Accurate application progress analysis for large-scale parallel debugging.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014

Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

MPI Runtime Error Detection with MUST: A Scalable and Crash-Safe Approach.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Flux: A Next-Generation Resource Management Framework for Large HPC Centers.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Adaptive Configuration Selection for Power-Constrained Heterogeneous Systems.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Exploiting redundancy for cost-effective, time-constrained execution of HPC applications on Amazon EC2.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

Modeling the Impact of Reduced Memory Bandwidth on HPC Applications.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

Memory Usage Optimizations for Online Event Analysis.
Proceedings of the Solving Software Challenges for Exascale, 2014

2013
Strategies for Energy-Efficient Resource Management of Hybrid Programming Models.
IEEE Trans. Parallel Distrib. Syst., 2013

Characterizing and mitigating work time inflation in task parallel programs.
Scientific Programming, 2013

MPI runtime error detection with MUST: Advances in deadlock detection.
Scientific Programming, 2013

Parallelizing heavyweight debugging tools with mpiecho.
Parallel Computing, 2013

LIBI: A framework for bootstrapping extreme scale software systems.
Parallel Computing, 2013

A study of application-level recovery methods for transient network faults.
Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2013

Enabling fair pricing on HPC systems with node sharing.
Proceedings of the International Conference for High Performance Computing, 2013

Overcoming extreme-scale reproducibility challenges through a unified, targeted, and multilevel toolset.
Proceedings of the 1st International Workshop on Software Engineering for High Performance Computing in Computational Science and Engineering, 2013

Runtime MPI collective checking with tree-based overlay networks.
Proceedings of the 20th European MPI Users's Group Meeting, 2013

Performance Analysis Techniques for the Exascale Co-Design Process.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

OMPT: An OpenMP Tools Application Programming Interface for Performance Analysis.
Proceedings of the OpenMP in the Era of Low Power Devices and Accelerators, 2013

Exploring Traditional and Emerging Parallel Programming Models Using a Proxy Application.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Systematic Reduction of Data Movement in Algebraic Multigrid Solvers.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Efficient and Scalable Retrieval Techniques for Global File Properties.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Exploring hardware overprovisioning in power-constrained, high performance computing.
Proceedings of the International Conference on Supercomputing, 2013

Intralayer Communication for Tree-Based Overlay Networks.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

A comparative study of high-performance computing on the cloud.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Alignment-Based Metrics for Trace Comparison.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012
Visualizing Network Traffic to Understand the Performance of Massively Parallel Simulations.
IEEE Trans. Vis. Comput. Graph., 2012

What scientific applications can benefit from hardware transactional memory?
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Characterizing and mitigating work time inflation in task parallel programs.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

MPI runtime error detection with MUST: advances in deadlock detection.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Performance Modeling of Algebraic Multigrid on Blue Gene/Q: Lessons Learned.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Mapping applications with collectives over sub-communicators on torus networks.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Novel views of performance data to analyze large-scale adaptive applications.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

MPI Runtime Error Detection with MUST: Advanced Error Reports.
Proceedings of the Tools for High Performance Computing 2012, 2012

The myrmics memory allocator: hierarchical, message-passing allocation for global address spaces.
Proceedings of the International Symposium on Memory Management, 2012

Beyond DVFS: A First Look at Performance under a Hardware-Enforced Power Bound.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

GTI: A Generic Tools Infrastructure for Event-Based Tools in Parallel Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Scalable Critical-Path Based Performance Analysis.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Quantifying the effectiveness of load balance algorithms.
Proceedings of the International Conference on Supercomputing, 2012

Fault resilience of the algebraic multi-grid solver.
Proceedings of the International Conference on Supercomputing, 2012

Mechanisms and Evaluation of Cross-Layer Fault-Tolerance for Supercomputing.
Proceedings of the 41st International Conference on Parallel Processing, 2012

Modeling the Performance of an Algebraic Multigrid Cycle Using Hybrid MPI/OpenMP.
Proceedings of the 41st International Conference on Parallel Processing, 2012

2011
Checkpointing.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Formal analysis of MPI-based parallel programs.
Commun. ACM, 2011

Large scale debugging of parallel tasks with AutomaDeD.
Proceedings of the Conference on High Performance Computing Networking, 2011

Order Preserving Event Aggregation in TBONs.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Creating a Tool Set for Optimizing Topology-Aware Node Mappings.
Proceedings of the Tools for High Performance Computing 2011, 2011

Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Exploiting Data Similarity to Reduce Memory Footprints.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Modeling the performance of an algebraic multigrid cycle on HPC platforms.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Interpreting Performance Data across Intuitive Domains.
Proceedings of the International Conference on Parallel Processing, 2011

Practical performance prediction under Dynamic Voltage Frequency Scaling.
Proceedings of the 2011 International Green Computing Conference and Workshops, 2011

Scalable memory registration for high performance networks using helper threads.
Proceedings of the 8th Conference on Computing Frontiers, 2011

Large Scale Verification of MPI Programs Using Lamport Clocks with Lazy Update.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Transforming MPI source code based on communication patterns.
Future Generation Comp. Syst., 2010

On the Performance of an Algebraic Multigrid Solver on Multicore Clusters.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

A Scalable and Distributed Dynamic Formal Verifier for MPI Programs.
Proceedings of the Conference on High Performance Computing Networking, 2010

ScalaTrace: Tracing, Analysis and Modeling of HPC Codes at Scale.
Proceedings of the Applied Parallel and Scientific Computing, 2010

Hybrid MPI/OpenMP power-aware computing.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Power-aware MPI task aggregation prediction for high-end computing systems.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Using focused regression for accurate time-constrained scaling of scientific applications.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Clustering performance data efficiently at massive scales.
Proceedings of the 24th International Conference on Supercomputing, 2010

Exploitation of Dynamic Communication Patterns through Static Analysis.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Comparing Scalability Prediction Strategies on an SMP of CMPs.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

AutomaDeD: Automata-based debugging for dissimilar parallel tasks.
Proceedings of the 2010 IEEE/IFIP International Conference on Dependable Systems and Networks, 2010

10181 Executive Summary - Program Development for Extreme-Scale Computing.
Proceedings of the Program Development for Extreme-Scale Computing, 02.05. - 07.05.2010, 2010

10181 Abstracts Collection - Program Development for Extreme-Scale Computing.
Proceedings of the Program Development for Extreme-Scale Computing, 02.05. - 07.05.2010, 2010

2009
ScalaTrace: Scalable compression and replay of communication traces for high-performance computing.
J. Parallel Distrib. Comput., 2009

Scalable temporal order analysis for large scale debugging.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

8th International Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

MUST: A Scalable Approach to Runtime Error Detection in MPI Programs.
Proceedings of the Tools for High Performance Computing 2009, 2009

Machine learning based online performance prediction for runtime parallelization and task scheduling.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2009

Adagio: making DVS practical for complex HPC applications.
Proceedings of the 23rd international conference on Supercomputing, 2009

A graph based approach for MPI deadlock detection.
Proceedings of the 23rd international conference on Supercomputing, 2009

2008
Efficient architectural design space exploration via predictive modeling.
TACO, 2008

Open | SpeedShop: An open source infrastructure for parallel performance analysis.
Scientific Programming, 2008

BlueGene/L applications: Parallelism On a Massive Scale.
IJHPCA, 2008

Lessons learned at 208K: towards debugging millions of cores.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Scalable load-balance measurement for SPMD codes.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

7th International Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments: New Directions and Work-in-Progress (ParSim 2008).
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

On the Performance of Transparent MPI Piggyback Messages.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Preserving time in large-scale communication traces.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

A regression-based approach to scalability prediction.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

Detecting Patterns in MPI Communication Traces.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

Overcoming Scalability Challenges for Tool Daemon Launching.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

Using MPI Communication Patterns to Guide Source Code Transformations.
Proceedings of the Computational Science, 2008

Topic 2: Performance Prediction and Evaluation.
Proceedings of the Euro-Par 2008, 2008

Prediction models for multi-dimensional power-performance optimization on many cores.
Proceedings of the 17th International Conference on Parallel Architecture and Compilation Techniques, 2008

2007
Dynamic Binary Instrumentation and Data Aggregation on Large Scale Systems.
International Journal of Parallel Programming, 2007

Predicting parallel application performance via machine learning approaches.
Concurrency and Computation: Practice and Experience, 2007

PNMPI tools: a whole lot greater than the sum of their parts.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Bounding energy consumption in large-scale MPI programs.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

6th International Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments New Directions and Work-in-Progress ParSim 2007.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Methods of inference and learning for performance modeling of parallel applications.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Benchmarking the Stack Trace Analysis Tool for BlueGene/L.
Proceedings of the Parallel Computing: Architectures, 2007

Scalable Compression and Replay of Communication Traces in Massively Parallel Environments.
Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Stack Trace Analysis for Large Scale Debugging.
Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Practical Differential Profiling.
Proceedings of the Euro-Par 2007, 2007

Identifying energy-efficient concurrency levels using machine learning.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

2006
Poster reception - Scalable compression and replay of communication traces in massively parallel environments.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Gordon Bell finalists I - Large-scale electronic structure calculations of high-Z metals on the BlueGene/L platform.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Poster reception - Patterns in parallel programs: toward high-level understanding of large-scale traces.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

5th International Special Session on Current Trends in Numerical Simulation for Parallel Engineering Environments.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Improving distributed memory applications testing by message perturbation.
Proceedings of the 4th Workshop on Parallel and Distributed Systems: Testing, 2006

Dynamic program phase detection in distributed shared-memory multiprocessors.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

A Flexible and Dynamic Infrastructure for MPI Tool Interoperability.
Proceedings of the 2006 International Conference on Parallel Processing (ICPP 2006), 2006

Exploring Unexpected Behavior in MPI.
Proceedings of the High Performance Computing and Communications, 2006

Efficiently exploring architectural design spaces via predictive modeling.
Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006

2005
Scalable dynamic binary instrumentation for Blue Gene/L.
SIGARCH Computer Architecture News, 2005

Simulation as a tool for optimizing memory accesses on NUMA machines.
Perform. Eval., 2005

Monitoring cache behavior on parallel SMP architectures and related programming tools.
Future Generation Comp. Syst., 2005

4th International Special Session on: Current Trends in Numerical Simulation for Parallel Engineering Environments ParSim 2005.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Improving the computational intensity of unstructured mesh applications.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

DynTG: A Tool for Interactive, Dynamic Instrumentation.
Proceedings of the Computational Science, 2005

An Approach to Performance Prediction for Parallel Applications.
Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

Owl: next generation system monitoring.
Proceedings of the Second Conference on Computing Frontiers, 2005

2004
SIMT/OMP: A Toolset to Study and Exploit Memory Locality of OpenMP Applications on NUMA Architectures.
Proceedings of the Shared Memory Parallel Programming with OpenMP, 2004

Implementation and Evaluation of a Scalable Application-Level Checkpoint-Recovery Scheme for MPI Programs.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Current Trends in Numerical Simulation for Parallel Engineering Environments. ParSim 2004.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Application-level checkpointing for shared memory programs.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

SimSnap: Fast-Forwarding via Native Execution and Application-Level Checkpointing.
Proceedings of the 8th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-8 2004), 2004

2003
Pathways of Relevance: Exploring Inflows of Knowledge into Subunits of Multinational Corporations.
Organization Science, 2003

ARS: an adaptive runtime system for locality optimization.
Future Generation Comp. Syst., 2003

SMiLE: an integrated, multi-paradigm software infrastructure for SCI-based clusters.
Future Generation Comp. Syst., 2003

Interactive Locality Optimization on NUMA Architectures.
Proceedings of the ACM 2003 Symposium on Software Visualization, 2003

Identifying and Exploiting Spatial Regularity in Data Memory References.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Special Session of EuroPVM/MPI 2003: Current Trends in Numerical Simulation for Parallel Engineering Environments - ParSim 2003.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

A Framework for Portable Shared Memory Programming.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

CAD Grid: Corporate-Wide Resource Sharing for Parameter Studies.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003

A Simulation Tool for Evaluating Shared Memory Systems.
Proceedings of the 36th Annual Simulation Symposium (ANSS-36 2003), Orlando, Florida, March 30, 2003

2002
Memory access behavior analysis of NUMA-based shared memory programs.
Scientific Programming, 2002

A Comprehensive Electric Field Simulation Environment on Top of SCI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

Current Trends in Numerical Simulation for Parallel Engineering Environments.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

Notes on Nondeterminism in Message Passing Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

Performance Analysis for Teraflop Computers: A Distributed Automatic Approach.
Proceedings of the 10th Euromicro Workshop on Parallel, 2002

Boosting the Performance of Electromagnetic Simulations on a PC-Cluster.
Proceedings of the 2002 International Conference on Parallel Computing in Electrical Engineering (PARELEC 2002), 2002

A proposal for a new hardware cache monitoring architecture.
Proceedings of The Workshop on Memory Systems Performance (MSP 2002), 2002

Improving Data Locality Using Dynamic Page Migration Based on Memory Access Histograms.
Proceedings of the Computational Science - ICCS 2002, 2002

Using Semantic Information to Guide Efficient Parallel I/O on Clusters.
Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

SMiLE: An Integrated, Multi-Paradigm Software Infrastructure for SCI-Based Clusters.
Proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2002), 2002

Overcoming the Problems Associated with the Existence of Too Many DSM APIs.
Proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2002), 2002

2001
Shared memory programming on NUMA-based clusters using a general and open hybrid hardware, software Approach.
PhD thesis, 2001

Parallel Volume Rendering based on Isosurface Extraction using Commodity Clusters.
Proceedings of the IASTED International Conference on Visualization, 2001

SCI-Based LINUX PC-Clusters as a Platform for Electromagnetic Field Calculations.
Proceedings of the Parallel Computing Technologies, 2001

Visualizing the Memory Access Behavior of Shared Memory Applications on NUMA Architectures.
Proceedings of the Computational Science - ICCS 2001, 2001

Meeting the Computational Demands of Nuclear Medical Imaging Using Commodity Clusters.
Proceedings of the Computational Science - ICCS 2001, 2001

2000
Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2000).
Proceedings of the Parallel and Distributed Processing, 2000

Using the SMiLE Monitoring Infrastructure to Detect and Lower the Inefficiency of Parallel Applications.
Proceedings of the High-Performance Computing and Networking, 8th International Conference, 2000

NEPHEW: Applying a Toolset for the Efficient Deployment of a Medical Image Application on SCI-Based Clusters.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

Multilayer Online-Monitoring for Hybrid DSM Systems on Top of PC Clusters with a SMiLE.
Proceedings of the Computer Performance Evaluation: Modelling Techniques and Tools, 2000

Multithreaded Programming of PC Clusters.
Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT'00), 2000

1999
True Shared Memory Programming on SCI-Based Clusters.
Proceedings of the SCI: Scalable Coherent Interface, 1999

SCI-VM: A Flexible Base for Transparent Shared Memory Programming Models on Clusters of PCs.
Proceedings of the Parallel and Distributed Processing, 1999

Supporting Shared Memory and Message Passing on Clusters of PCs with a SMiLE.
Proceedings of the Network-Based Parallel Computing: Communication, 1999

Optimizing Data Locality for SCI-Based PC-Clusters with the SmiLE Monitoring Approach.
Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999

1998
SISCI-Pthreads, SMP-like programming on an SCI-cluster.
Proceedings of the High-Performance Computing and Networking, 1998

1997
Architectural Adaptation for Application-Specific Locality Optimization.
Proceedings of the 1997 International Conference on Computer Design: VLSI in Computers & Processors, 1997

