Felix Wolf

According to our database, Felix Wolf authored at least 148 papers between 1999 and 2018.

Bibliography

2018
A scalable algorithm for simulating the structural plasticity of the brain.
J. Parallel Distrib. Comput., 2018

Unveiling Thread Communication Bottlenecks Using Hardware-Independent Metrics.
Proceedings of the 47th International Conference on Parallel Processing, 2018

Estimating the Impact of External Interference on Application Performance.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018

Lightweight Requirements Engineering for Exascale Co-design.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

Efficient Fault Tolerance Through Dynamic Node Replacement.
Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018

2017
Editorial of special issue on Software Engineering for Parallel Systems.
Journal of Systems and Software, 2017

Brief Announcement: Meeting the Challenges of Parallelizing Sequential Programs.
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, 2017

Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Parallelizing Audio Analysis Applications - A Case Study.
Proceedings of the 39th IEEE/ACM International Conference on Software Engineering: Software Engineering Education and Training Track, 2017

Following the Blind Seer - Creating Better Performance Models Using Less Information.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

Off-Road Performance Modeling - How to Deal with Segmented Data.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

2016
Automatic Performance Modeling of HPC Applications.
Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Automated Performance Modeling of the UG4 Simulation Framework.
Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Identifying the Root Causes of Wait States in Large-Scale Parallel Applications.
TOPC, 2016

Unveiling parallelization opportunities in sequential programs.
Journal of Systems and Software, 2016

Automatic Generation of Unit Tests for Correlated Variables in Parallel Programs.
International Journal of Parallel Programming, 2016

A Scalable Algorithm for Simulating the Structural Plasticity of the Brain.
Proceedings of the 28th International Symposium on Computer Architecture and High Performance Computing, 2016

Automatic Parallel Pattern Detection in the Algorithm Structure Design Space.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Fast Multi-parameter Performance Modeling.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015
Separating the wheat from the chaff: identifying relevant and similar performance data with visual analytics.
Proceedings of the 2nd Workshop on Visual Performance Analysis, 2015

Preventing the explosion of exascale profile data with smart thread-level aggregation.
Proceedings of the 4th Workshop on Extreme Scale Programming Tools, 2015

A Batch System with Efficient Adaptive Scheduling for Malleable and Evolving Applications.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

An Efficient Data-Dependence Profiler for Sequential and Parallel Programs.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Exascaling Your Library: Will Your Implementation Meet Your Expectations?
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Characterizing Loop-Level Communication Patterns in Shared Memory.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Beyond Data Parallelism: Identifying Parallel Tasks in Sequential Programs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Fast Data-Dependence Profiling by Skipping Repeatedly Executed Memory Operations.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

10,000 Performance Models per Minute - Scalability of the UG4 Simulation Framework.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

How Many Threads will be too Many? On the Scalability of OpenMP Implementations.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

Dependence-Based Code Transformation for Coarse-Grained Parallelism.
Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

The Basic Building Blocks of Parallel Tasks.
Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

2014
Using Template Matching to Infer Parallel Design Patterns.
TACO, 2014

Special issue: Euro-Par 2013.
Concurrency and Computation: Practice and Experience, 2014

Generating Classified Parallel Unit Tests.
Proceedings of the Tests and Proofs - 8th International Conference, 2014

Down to earth: how to visualize traffic on high-dimensional torus networks.
Proceedings of the First Workshop on Visual Performance Analysis, 2014

Catching Idlers with Ease: A Lightweight Wait-State Profiler for MPI Programs.
Proceedings of the 21st European MPI Users' Group Meeting, 2014

SEPS 2014: first international workshop on software engineering for parallel systems.
Proceedings of the Conference on Systems, 2014

A Comparison between OPARI2 and the OpenMP Tools Interface in the Context of Score-P.
Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

A Batch System with Fair Scheduling for Evolving Applications.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Catwalk: A Quick Development Path for Performance Models.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

How file access patterns influence interference among cluster applications.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013
A scalable infrastructure for the performance analysis of passive target synchronization.
Parallel Computing, 2013

Parallel universal access layer: A scalable I/O library for integrated tokamak modeling.
Computer Physics Communications, 2013

Extending the scope of the controlled logical clock.
Cluster Computing, 2013

Using automated performance modeling to find scalability bugs in complex codes.
Proceedings of the International Conference for High Performance Computing, 2013

Understanding the formation of wait states in applications with one-sided communication.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Massively parallel loading.
Proceedings of the International Conference on Supercomputing, 2013

Efficient Offloading of Parallel Kernels Using MPI_Comm_Spawn.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

A Dynamic Resource Management System for Network-Attached Accelerator Clusters.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Discovery of Potential Parallelism in Sequential Programs.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Detecting Correlation Violations and Data Races by Inferring Non-deterministic Reads.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Predicting Parallelization of Sequential Programs Using Supervised Learning.
Proceedings of the 12th International Conference on Machine Learning and Applications, 2013

Capturing inter-application interference on clusters.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

2012
Scalable detection of MPI-2 remote memory access inefficiency patterns.
IJHPCA, 2012

The HOPSA Workflow and Tools.
Proceedings of the Tools for High Performance Computing 2012, 2012

Generic Support for Remote Memory Access Operations in Score-P and OTF2.
Proceedings of the Tools for High Performance Computing 2012, 2012

Performance Analysis Techniques for Task-Based OpenMP Applications.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

Dynamic Load Balancing for Unstructured Meshes on Space-Filling Curves.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Scalable Critical-Path Based Performance Analysis.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Characterizing Load and Communication Imbalance in Large-Scale Parallel Applications.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

A Dynamic Accelerator-Cluster Architecture.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Profiling of OpenMP Tasks with Score-P.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Pattern-Independent Detection of Manual Collectives in MPI Programs.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011
Scalasca.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Parallel Sorting with Minimal Data.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Scaling Performance Tool MPI Communicator Management.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Score-P: A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir.
Proceedings of the Tools for High Performance Computing 2011, 2011

Patterns of Inefficient Performance Behavior in GPU Applications.
Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Open Trace Format 2: The Next Generation of Scalable Trace Formats and Support Libraries.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Performance Analysis of Long-Running Applications.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Reducing the Overhead of Direct Application Instrumentation Using Prior Static Analysis.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2010
Large-Scale Performance Analysis of Sweep3D with the Scalasca Toolset.
Parallel Processing Letters, 2010

Performance measurement and analysis tools for extremely scalable systems.
Concurrency and Computation: Practice and Experience, 2010

The Scalasca performance toolset architecture.
Concurrency and Computation: Practice and Experience, 2010

Further Improving the Scalability of the Scalasca Toolset.
Proceedings of the Applied Parallel and Scientific Computing, 2010

How to Reconcile Event-Based Performance Analysis with Tasking in OpenMP.
Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

Performance analysis of Sweep3D on Blue Gene/P with the Scalasca toolset.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Proceedings of the 15th international workshop on high-level parallel programming models and supportive environments.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Improvements of common open Grid standards to increase High Throughput and High Performance Computing effectiveness on large-scale Grid and e-science infrastructures.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Identifying the Root Causes of Wait States in Large-Scale Parallel Applications.
Proceedings of the 39th International Conference on Parallel Processing, 2010

PROPER 2010: Third Workshop on Productivity and Performance - Tools for HPC Application Development.
Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010

Synchronizing the Timestamps of Concurrent Events in Traces of Hybrid MPI/OpenMP Applications.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

Exploring the Potential of Using Multiple E-science Infrastructures with Emerging Open Standards-Based E-health Research Tools.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

Experiences and Requirements for Interoperability Between HTC and HPC-driven e-Science Infrastructure.
Proceedings of the Future Application and Middleware Technology on e-Science, 2010

2009
Replay-based synchronization of timestamps in event traces of massively parallel applications.
Scalable Computing: Practice and Experience, 2009

A scalable tool architecture for diagnosing wait states in massively parallel applications.
Parallel Computing, 2009

Scalable timestamp synchronization for event traces of message-passing applications.
Parallel Computing, 2009

Interoperation of world-wide production e-Science infrastructures.
Concurrency and Computation: Practice and Experience, 2009

Research advances by using interoperable e-science infrastructures.
Cluster Computing, 2009

Space-efficient time-series call-path profiling of parallel applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Scalable massively parallel I/O to task-local files.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Scalable Detection of MPI-2 Remote Memory Access Inefficiency Patterns.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Recent Developments in the Scalasca Toolset.
Proceedings of the Tools for High Performance Computing 2009, 2009

Verifying Causality between Distant Performance Phenomena in Large-Scale MPI Applications.
Proceedings of the 17th Euromicro International Conference on Parallel, 2009

A Generic and Configurable Source-Code Instrumentation Component.
Proceedings of the Computational Science, 2009

Introduction.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

PROPER 2009: Workshop on Productivity and Performance - Tools for HPC Application Development.
Proceedings of the Euro-Par 2009, 2009

Performance Simulation of Non-blocking Communication in Message-Passing Applications.
Proceedings of the Euro-Par 2009, 2009

Enabling Grid Interoperability by Extending HPC-driven Job Management with an Open Standard Information Model.
Proceedings of the 8th IEEE/ACIS International Conference on Computer and Information Science, 2009

2008
Performance measurement and analysis of large-scale parallel applications on leadership computing systems.
Scientific Programming, 2008

SCALASCA Parallel Performance Analyses of SPEC MPI2007 Applications.
Proceedings of the Performance Evaluation: Metrics, 2008

Usage of the SCALASCA toolset for scalable performance analysis of large-scale parallel applications.
Proceedings of the Tools for High Performance Computing, 2008

Performance Evaluation and Optimization of Parallel Grid Computing Applications.
Proceedings of the 16th Euromicro International Conference on Parallel, 2008

Replay-Based Synchronization of Timestamps in Event Traces of Massively Parallel Applications.
Proceedings of the 37th International Conference on Parallel Processing, 2008

Extending the collaborative online visualization and steering framework for computational Grids with attribute-based authorization.
Proceedings of the 9th IEEE/ACM International Conference on Grid Computing (Grid 2008), Tsukuba, Japan, September 29, 2008

Scalasca Parallel Performance Analyses of PEPC.
Proceedings of the Euro-Par 2008 Workshops, 2008

Classification of Different Approaches for e-Science Applications in Next Generation Computing Infrastructures.
Proceedings of the Fourth International Conference on e-Science, 2008

Grid-Based Workflow Management.
Proceedings of the Grid and Services Evolution, 2008

Implications of non-constant clock drifts for the timestamps of concurrent events.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007
Compensation of Measurement Overhead in Parallel Performance Profiling.
IJHPCA, 2007

Automatic analysis of inefficiency patterns in parallel applications.
Concurrency and Computation: Practice and Experience, 2007

Timestamp Synchronization for Event Traces of Large-Scale Message-Passing Applications.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Scalability and Usability of HPC Programming Tools.
Proceedings of the Parallel Computing: Architectures, 2007

Scalable Collation and Presentation of Call-Path Profile Data with CUBE.
Proceedings of the Parallel Computing: Architectures, 2007

Automatic Trace-Based Performance Analysis of Metacomputing Applications.
Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Design and evaluation of a collaborative online visualization and steering framework implementation for computational grids.
Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (GRID 2007), 2007

Computational Steering and Online Visualization of Scientific Applications on Large-Scale HPC Systems within e-Science Infrastructures.
Proceedings of the Third International Conference on e-Science and Grid Computing, 2007

2006
Performance Tools for Parallel Programming.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Scalable Parallel Trace-Based Performance Analysis.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Integrated Runtime Measurement Summarisation and Selective Event Tracing for Scalable Parallel Execution Performance Diagnosis.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Tools for Parallel Performance Analysis: Minisymposium Abstract.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

A Parallel Trace-Data Interface for Scalable Performance Analysis.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2006

A systematic multi-step methodology for performance analysis of communication traces of distributed applications based on hierarchical clustering.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Specification of Inefficiency Patterns for MPI-2 One-Sided Communication.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Large Event Traces in Parallel Performance Analysis.
Proceedings of the ARCS 2006, 2006

2005
Performance Profiling Overhead Compensation for MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

A Scalable Approach to MPI Application Performance Analysis.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Holistic Hardware Counter Performance Analysis of Parallel Programs.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Performance Analysis of One-sided Communication Mechanisms.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Automatic Experimental Analysis of Communication Patterns in Virtual Topologies.
Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Trace-Based Parallel Performance Overhead Compensation.
Proceedings of the High Performance Computing and Communications, 2005

Event-Based Measurement and Analysis of One-Sided Communication.
Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

Holistic Hardware Counter Performance Analysis of Parallel Programs.
Proceedings of the Automatic Performance Analysis, 12.-16. December 2005, 2005

2004
An Algebra for Cross-Experiment Performance Analysis.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Efficient Pattern Search in Large Traces Through Successive Refinement.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

2003
Automatic performance analysis on parallel computers with SMP nodes.
PhD thesis, 2003

Automatic performance analysis of hybrid MPI/OpenMP applications.
Journal of Systems Architecture, 2003

Automatic Performance Analysis of Hybrid MPI/OpenMP Applications.
Proceedings of the 11th Euromicro Workshop on Parallel, 2003

Hardware-Counter Based Automatic Performance Analysis of Parallel Programs.
Proceedings of the Parallel Computing: Software Technology, 2003

KOJAK - A Tool Set for Automatic Performance Analysis of Parallel Programs.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003

2002
Design and Prototype of a Performance Tool Interface for OpenMP.
The Journal of Supercomputing, 2002

CATCH - A Call-Graph Based Automatic Tool for Capture of Hardware Performance Metrics for MPI and OpenMP Applications.
Proceedings of the Euro-Par 2002, 2002

2001
Specifying Performance Properties of Parallel Applications Using Compound Events.
Scalable Computing: Practice and Experience, 2001

2000
Automatic Performance Analysis of MPI Applications Based on Event Traces.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
Performance analysis on CRAY T3E.
Proceedings of the Seventh Euromicro Workshop on Parallel and Distributed Processing. PDP'99, 1999

EARL - A Programmable and Extensible Toolkit for Analyzing Event Traces of Message Passing Programs.
Proceedings of the High-Performance Computing and Networking, 7th International Conference, 1999

