Wolfgang E. Nagel

According to our database, Wolfgang E. Nagel
  • authored at least 163 papers between 1988 and 2017.
  • has a "Dijkstra number" of three.

Bibliography

2017
Challenges in Creating a Sustainable Generic Research Data Infrastructure.
Softwaretechnik-Trends, 2017

Towards a metadata-driven multi-community research data management service.
PeerJ PrePrints, 2017

Metadata Management in the MoSGrid Science Gateway - Evaluation and the Expansion of Quantum Chemistry Support.
J. Grid Comput., 2017

The READEX formalism for automatic tuning for energy efficiency.
Computing, 2017

Detecting Memory-Boundedness with Hardware Performance Counters.
Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, 2017

E-Team: Practical Energy Accounting for Multi-Core Systems.
Proceedings of the 2017 USENIX Annual Technical Conference, 2017

A Statistical Approach to Power Estimation for x86 Processors.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Design Evaluation of a Performance Analysis Trace Repository.
Proceedings of the International Conference on Computational Science, 2017

2016
FFMK: A Fast and Fault-Tolerant Microkernel-Based System for Exascale Computing.
Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Editorial for the special issue on energy-aware high performance computing.
Computer Science - R&D, 2016

Editorial for the special issue on Energy-aware high performance computing.
Computer Science - R&D, 2016

Alpaka - An Abstraction Library for Parallel Kernel Acceleration.
CoRR, 2016

Performance-Portable Many-Core Plasma Simulations: Porting PIConGPU to OpenPower and Beyond.
CoRR, 2016

Performance-Portable Many-Core Plasma Simulations: Porting PIConGPU to OpenPower and Beyond.
Proceedings of the High Performance Computing, 2016

Simulation models verification for resilient communication on a highly adaptive energy-efficient computer.
Proceedings of the 24th High Performance Computing Symposium, 2016

The Potential of Diffusive Load Balancing at Large Scale.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Runtime Correctness Analysis of MPI-3 Nonblocking Collectives.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Lessons Learned from Spatial and Temporal Correlation of Node Failures in High Performance Computers.
Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Towards a Metadata-driven Multi-Community Research Data Management Service.
Proceedings of the 8th International Workshop on Science Gateways, 2016

Alpaka - An Abstraction Library for Parallel Kernel Acceleration.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Detection and Visualization of Performance Variations to Guide Identification of Application Bottlenecks.
Proceedings of the 45th International Conference on Parallel Processing Workshops, 2016

OTFX: An In-memory Event Tracing Extension to the Open Trace Format 2.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

Synchronization Debugging of Hybrid Parallel Programs.
Proceedings of the Euro-Par 2016: Parallel Processing, 2016

Seamless HPC Integration of Data-Intensive KNIME Workflows via UNICORE.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

2015
Editorial for the fifth international conference on energy-aware high performance computing.
Computer Science - R&D, 2015

HAEC-SIM: a simulation framework for highly adaptive energy-efficient computing platforms.
Proceedings of the 8th International Conference on Simulation Tools and Techniques, 2015

Folding Methods for Event Timelines in Performance Analysis.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Performance Portable Applications for Hardware Accelerators: Lessons Learned from SPEC ACCEL.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Performance Analysis for Target Devices with the OpenMP Tools Interface.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Cache Coherence Protocol and Memory Performance of the Intel Haswell-EP Architecture.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery.
Proceedings of the 11th IEEE International Conference on e-Science, 2015

Run-Time Exploitation of Application Dynamism for Energy-Efficient Exascale Computing (READEX).
Proceedings of the 18th IEEE International Conference on Computational Science and Engineering, 2015

2014
Editorial for the Fourth International Conference on Energy-Aware High Performance Computing.
Computer Science - R&D, 2014

Co-Design of Systems and Applications for Exascale (Dagstuhl Perspectives Workshop 12212).
Dagstuhl Manifestos, 2014

Standards-based metadata management for molecular simulations.
Concurrency and Computation: Practice and Experience, 2014

Optimizing I/O forwarding techniques for extreme-scale event tracing.
Cluster Computing, 2014

Energy-Efficient Databases Using Sweet Spot Frequencies.
Proceedings of the 7th IEEE/ACM International Conference on Utility and Cloud Computing, 2014

SPEC ACCEL: A Standard Application Suite for Measuring Hardware Accelerator Performance.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

HDEEM: high definition energy efficiency monitoring.
Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, 2014

Energy-Efficient Data Processing at Sweet Spot Frequencies.
Proceedings of the On the Move to Meaningful Internet Systems: OTM 2014 Workshops, 2014

Meta-Metaworkflows for Combining Quantum Chemistry and Molecular Dynamics in the MoSGrid Science Gateway.
Proceedings of the 6th International Workshop on Science Gateways, 2014

MPI Runtime Error Detection with MUST: A Scalable and Crash-Safe Approach.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Scalable high-quality 1D partitioning.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

Selective runtime monitoring: Non-intrusive elimination of high-frequency functions.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

SCADOPT: An Open-Source HPC Framework for Solving PDE Constrained Optimization Problems Using AD.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Architectural Implications for Exascale based on Big Data Workflow Requirements.
Proceedings of the Big Data and High Performance Computing, 2014

ScaDS Dresden/Leipzig: Ein serviceorientiertes Kompetenzzentrum für Big Data.
Proceedings of the 44. Jahrestagung der Gesellschaft für Informatik, Informatik 2014, Big Data, 2014

Modeling communication delays for network coding and routing for error-prone transmission.
Proceedings of the Third International Conference on Future Generation Communication Technologies (FGCT 2014), 2014

Analysis of Parallel Applications on a High Performance-Low Energy Computer.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

Memory Usage Optimizations for Online Event Analysis.
Proceedings of the Solving Software Challenges for Exascale, 2014

Towards Detailed Exascale Application Analysis - Selective Monitoring and Visualisation.
Proceedings of the Solving Software Challenges for Exascale, 2014

Towards Generic Metadata Management in Distributed Science Gateway Infrastructures.
Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

2013
A fast spectral element solver combining static condensation and multigrid techniques.
J. Comput. Physics, 2013

HPC in Germany: Infrastructure, Operations and Politics.
it - Information Technology, 2013

Towards I/O analysis of HPC systems and a generic architecture to collect access patterns.
Computer Science - R&D, 2013

Performance and quality of service of data and video movement over a 100 Gbps testbed.
Future Generation Comp. Syst., 2013

Distributed wait state tracking for runtime MPI deadlock detection.
Proceedings of the International Conference for High Performance Computing, 2013

Radiative signatures of the relativistic Kelvin-Helmholtz instability.
Proceedings of the International Conference for High Performance Computing, 2013

Runtime message uniquification for accurate communication analysis on incomplete MPI event traces.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Runtime MPI collective checking with tree-based overlay networks.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Power measurement techniques on standard compute nodes: A quantitative comparison.
Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Hierarchical Memory Buffering Techniques for an In-Memory Event Tracing Extension to the Open Trace Format 2.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Integration of a Highly Scalable, Multi-FPGA-Based Hardware Accelerator in Common Cluster Infrastructures.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Intralayer Communication for Tree-Based Overlay Networks.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Alignment-Based Metrics for Trace Comparison.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Vorschlag einer Architektur für Software Defined Networks.
Proceedings of the 6. DFN-Forum Kommunikationstechnologien, 2013

2012
Enhanced Encoding Techniques for the Open Trace Format 2.
Proceedings of the International Conference on Computational Science, 2012

Collecting Distributed Performance Data with Dataheap: Generating and Exploiting a Holistic System View.
Proceedings of the International Conference on Computational Science, 2012

Flexible workload generation for HPC cluster efficiency benchmarking.
Computer Science - R&D, 2012

Co-Design of Systems and Applications for Exascale (Dagstuhl Perspectives Workshop 12212).
Dagstuhl Reports, 2012

MPI Runtime Error Detection with MUST: Advanced Error Reports.
Proceedings of the Tools for High Performance Computing 2012, 2012

GTI: A Generic Tools Infrastructure for Event-Based Tools in Parallel Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Enabling event tracing at leadership-class scale through I/O forwarding middleware.
Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, 2012


Strategies for Real-Time Event Reduction.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

HPC File Systems in Wide Area Networks: Understanding the Performance of Lustre over WAN.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Pathways to servers of the future.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

2011
Coarse Grained Parallelized Scientific Applications on a Cost Efficient Intel Atom Based Cluster.
Proceedings of the International Conference on Computational Science, 2011

Linux Cluster in Theory and Practice: A Novel Approach in Teaching Cluster Computing Based on the Intel Atom Platform.
Proceedings of the International Conference on Computational Science, 2011

Trace-based performance analysis for the petascale simulation code FLASH.
IJHPCA, 2011

The International Exascale Software Project roadmap.
IJHPCA, 2011

Score-P: A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir.
Proceedings of the Tools for High Performance Computing 2011, 2011

Open Trace Format 2: The Next Generation of Scalable Trace Formats and Support Libraries.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Enhancing the Functionality of a GridSim-Based Scheduler for Effective Use with Large-Scale Scientific Applications.
Proceedings of the 10th International Symposium on Parallel and Distributed Computing, 2011

Deadlock-Free Oblivious Routing for Arbitrary Topologies.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Scout: A Source-to-Source Transformator for SIMD-Optimizations.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Synthetische Lasttests auf dem 100-Gigabit-Testbed zwischen der TU Dresden und der TU Bergakademie Freiberg.
Proceedings of the 4. DFN-Forum Kommunikationstechnologien, 2011

2010
Preface.
Concurrency and Computation: Practice and Experience, 2010

Preface.
Concurrency and Computation: Practice and Experience, 2010

Highly Scalable Dynamic Load Balancing in the Atmospheric Modeling System COSMO-SPECS+FD4.
Proceedings of the Applied Parallel and Scientific Computing, 2010

Efficient Pattern Based I/O Analysis of Parallel Programs.
Proceedings of the 39th International Conference on Parallel Processing, 2010

2009
Performance at Exascale.
IJHPCA, 2009

Tools for scalable parallel program analysis: Vampir NG, MARMOT, and DeWiz.
IJCSE, 2009

A framework for detailed multiphase cloud modeling on HPC systems.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

An Interface for Integrated MPI Correctness Checking.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Pattern Matching and I/O Replay for POSIX I/O in Parallel Programs.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

2008
Improved Performance for Nodal Spectral Element Operators.
IJHPCA, 2008

Internal Timer Synchronization for Parallel Event Tracing.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

The Vampir Performance Analysis Tool-Set.
Proceedings of the Tools for High Performance Computing, 2008

Trace-Based Analysis and Optimization for the Semtex CFD Application - Hidden Remote Memory Accesses and I/O Performance.
Proceedings of the Euro-Par 2008 Workshops, 2008

Event Tracing and Visualization for Cell Broadband Engine Systems.
Proceedings of the Euro-Par 2008, 2008

2007
Analyzing Cache Bandwidth on the Intel Core 2 Architecture.
Proceedings of the Parallel Computing: Architectures, 2007

Developing Scalable Applications with Vampir, VampirServer and VampirTrace.
Proceedings of the Parallel Computing: Architectures, 2007

Analyzing Mutual Influences of High Performance Computing Programs on SGI Altix 3700 and 4700 Systems with PARbench.
Proceedings of the Parallel Computing: Architectures, 2007

Analysis of Linux Scheduling with VAMPIR.
Proceedings of the Computational Science - ICCS 2007, 7th International Conference, Beijing, China, May 27, 2007

Memory Allocation Tracing with VampirTrace.
Proceedings of the Computational Science - ICCS 2007, 7th International Conference, Beijing, China, May 27, 2007

Topic 2 Performance Prediction and Evaluation.
Proceedings of the Euro-Par 2007, 2007

Computational Steering and Online Visualization of Scientific Applications on Large-Scale HPC Systems within e-Science Infrastructures.
Proceedings of the Third International Conference on e-Science and Grid Computing, 2007

2006
Compressible memory data structures for event-based trace analysis.
Future Generation Comp. Syst., 2006

Open trace - The open trace format (OTF) and open tracing for HPC.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

M09 - Program analysis tools for massively parallel applications: how to achieve highest performance.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Visualization of Repetitive Patterns in Event Traces.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Introducing the Open Trace Format (OTF).
Proceedings of the Computational Science, 2006

Analyzing the Interaction of OpenMP Programs Within Multiprogramming Environments on a Sun Fire E25K System with PARbench.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Optimizing OpenMP Parallelized DGEMM Calls on SGI Altix 3700.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

2005
Monitoring cache behavior on parallel SMP architectures and related programming tools.
Future Generation Comp. Syst., 2005

High Performance Event Trace Visualization.
Proceedings of the 13th Euromicro Workshop on Parallel, 2005

Performance Comparison and Optimization: Case Studies using BenchIT.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Scheduling issues on IBM p690: Performance Analysis with the PARbench Environment.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Tracing the Cache Behaviour of Data Structures in Fortran Applications.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Construction and Compression of Complete Call Graphs for Post-Mortem Program Trace Analysis.
Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

New Algorithms for Performance Trace Analysis Based on Compressed Complete Call Graphs.
Proceedings of the Computational Science, 2005

Statistical Methods for Automatic Performance Bottleneck Detection in MPI Based Programs.
Proceedings of the Computational Science, 2005

Knowledge Based Automatic Scalability Analysis and Extrapolation for MPI Programs.
Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

05501 Abstracts Collection - Automatic Performance Analysis.
Proceedings of the Automatic Performance Analysis, 12.-16. December 2005, 2005

05501 Summary - Automatic Performance Analysis.
Proceedings of the Automatic Performance Analysis, 12.-16. December 2005, 2005

2004
Grid-Computing.
Informatik Spektrum, 2004

Performance Analysis with BenchIT: Portable, Flexible, Easy to Use.
Proceedings of the 1st International Conference on Quantitative Evaluation of Systems (QEST 2004), 2004

Detection of Collective MPI Operation Patterns.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Pattern Matching of Collective MPI Operations.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

A Parallel PSPG Finite Element Method for Direct Simulation of Incompressible Flow.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Topic 2: Performance Evaluation.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Optimizing Cache Access: A Tool for Source-to-Source Transformations and Real-Life Compiler Tests.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Tools for Scalable Parallel Program Analysis - Vampir VNG and DeWiz.
Proceedings of the Distributed and Parallel Systems: Cluster and Grid Computing (DAPSYS 2004), 2004

2003
BenchIT - Performance Measurements and Comparison for Scientific Applications.
Proceedings of the Parallel Computing: Software Technology, 2003

Scalable Performance Analysis of Parallel Systems: Concepts and Experiences.
Proceedings of the Parallel Computing: Software Technology, 2003

Performance Analysis of a Parallel Application in the GRID.
Proceedings of the Computational Science - ICCS 2003, 2003

A Distributed Performance Analysis Architecture for Clusters.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

2002
VGV: Supporting Performance Analysis of Object-Oriented Mixed MPI/OpenMP Parallel Applications.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
An Integrated Performance Visualizer for MPI/OpenMP Programs.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

Performance Optimization for Large Scale Computing: The Scalable VAMPIR Approach.
Proceedings of the Computational Science - ICCS 2001, 2001

An Hierarchical MPI Communication Model for the Parallelized Solution of Multiple Integrals.
Proceedings of the High-Performance Computing and Networking, 9th International Conference, 2001

Group-Based Performance Analysis for Multithreaded SMP Cluster Applications.
Proceedings of the Euro-Par 2001: Parallel Processing, 2001

2000
Performance Tuning on Parallel Systems: All Problems Solved?
Proceedings of the Applied Parallel Computing, 2000

An Efficient Parallel Linear Solver with a Cascadic Conjugate Gradient Method: Experience with Reality.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

Performance Evaluation and Prediction.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
A New Approach for Parallel Multigrid Adaption.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

MG - A toolbox for parallel grid adaption and implementing unstructured multigrid solvers.
Proceedings of the Parallel Computing: Fundamentals & Applications, 1999

Effective performance problem detection of MPI programs on MPP systems: From the global view to the details.
Proceedings of the Parallel Computing: Fundamentals & Applications, 1999

Three-dimensional direct numerical simulation of flow problems with electromagnetic control on parallel systems.
Proceedings of the Parallel Computing: Fundamentals & Applications, 1999

1997
Metacomputing in a Regional ATM-Testbed - Experience with Reality.
Proceedings of the Parallel Computing: Fundamentals, 1997

1995
Effektive Nutzung von Parallelrechnern in Rechenzentrumsumgebungen.
Proceedings of the Organisation und Betrieb von DV-Versorgungssystemen, 1995

1993
Ein verteiltes Scheduler-System für Mehrprozessorrechner mit gemeinsamem Speicher: Untersuchungen zur Ablaufplanung von parallelen Programmen.
PhD thesis, 1993

1991
Benchmarking parallel programs in a multiprogramming environment: the PAR-Bench system.
Parallel Computing, 1991

Parallel programs and background load: efficiency studies with the PAR-Bench system.
Proceedings of the 5th international conference on Supercomputing, 1991

1990
Exploiting autotasking on a CRAY Y-MP: an improved software interface to multitasking.
Parallel Computing, 1990

Parallelizing QCD with dynamical fermions on a Cray multiprocessor system.
Parallel Computing, 1990

Prinzipien der Parallelverarbeitung auf Rechnern mit gemeinsamem Speicher.
Proceedings of the GI, 1990

1989
Multitasking: experience with applications on a CRAY X-MP.
Parallel Computing, 1989

A comparison of parallel processing on Cray X-MP and IBM 3090 VF multiprocessors.
Proceedings of the 3rd international conference on Supercomputing, 1989

1988
Using multiple CPUs for problem solving: experiences in multitasking on the CRAY X-MP/48.
Parallel Computing, 1988

Three-dimensional numerical simulations of the Czochralski bulk flow on a CRAY X-MP multiprocessor architecture.
Proceedings of the 2nd international conference on Supercomputing, 1988

