Christoph W. Kessler

According to our database, Christoph W. Kessler authored at least 126 papers between 1991 and 2019.

Bibliography

2019
Parallelization of Hierarchical Matrix Algorithms for Electromagnetic Scattering Problems.
Proceedings of the High-Performance Modelling and Simulation for Big Data Applications, 2019

Extending smart containers for data locality-aware skeleton programming.
Concurrency and Computation: Practice and Experience, 2019

Global optimization of operand transfer fusion in heterogeneous computing.
Proceedings of the 22nd International Workshop on Software and Compilers for Embedded Systems, 2019

2018
MeterPU: a generic measurement abstraction API - Enabling energy-tuned skeleton backend selection.
The Journal of Supercomputing, 2018

SkePU 2: Flexible and Type-Safe Skeleton Programming for Heterogeneous Parallel Systems.
International Journal of Parallel Programming, 2018

EXA2PRO programming environment: architecture and applications.
Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

Lazy Allocation and Transfer Fusion Optimization for GPU-Based Heterogeneous Systems.
Proceedings of the 26th Euromicro International Conference on Parallel, 2018

Ensuring Memory Consistency in Heterogeneous Systems Based on Access Mode Declarations.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

2017
Benchmarking OpenCL, OpenACC, OpenMP, and CUDA: Programming Productivity, Performance, and Energy Consumption.
Proceedings of the 2017 Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, 2017

Asymmetric Crown Scheduling.
Proceedings of the 25th Euromicro International Conference on Parallel, 2017

VectorPU: A Generic and Efficient Data-container and Component Model for Transparent Data Transfer on GPU-based Heterogeneous Systems.
Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, 2017

2016
Smart Containers and Skeleton Programming for GPU-Based Systems.
International Journal of Parallel Programming, 2016

Energy-Optimized Static Scheduling for Many-Cores with Task Parallelization, DVFS and Core Consolidation.
Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems, 2016

An Extensible Platform Description Language Supporting Retargetable Toolchains and Adaptive Execution.
Proceedings of the 19th International Workshop on Software and Compilers for Embedded Systems, 2016

Efficient Execution of SkePU Skeleton Programs on the Low-Power Multicore Processor Myriad2.
Proceedings of the 24th Euromicro International Conference on Parallel, 2016

2015
Performance-aware composition framework for GPU-based systems.
The Journal of Supercomputing, 2015

MeterPU: A Generic Measurement Abstraction API Enabling Energy-Tuned Skeleton Backend Selection.
Proceedings of the 2015 IEEE TrustCom/BigDataSE/ISPA, 2015

Fast Crown Scheduling Heuristics for Energy-Efficient Mapping and Scaling of Moldable Streaming Tasks on Many-Core Systems.
Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems, 2015

Portable Parallelization of the EDGE CFD Application for GPU-based Systems using the SkePU Skeleton Programming Library.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Improving Energy-Efficiency of Static Schedules by Core Consolidation and Switching Off Unused Cores.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Optimized variant-selection code generation for loops on heterogeneous multicore systems.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Mimer and Schedeval: Tools for Comparing Static Schedulers for Streaming Applications on Manycore Architectures.
Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015

XPDL: Extensible Platform Description Language to Support Energy Modeling and Optimization.
Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015

2014
Fast Crown Scheduling Heuristics for Energy-Efficient Mapping and Scaling of Moldable Streaming Tasks on Manycore Systems.
TACO, 2014

NUMA Computing with Hardware and Software Co-Support on Configurable Emulated Shared Memory Architectures.
IJNC, 2014

The PEPPHER composition tool: performance-aware composition for GPU-based systems.
Computing, 2014

Pruning Strategies in Adaptive Off-Line Tuning for Optimized Composition of Components on Heterogeneous Systems.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Global Optimization of Execution Mode Selection for the Reconfigurable PRAM-NUMA Multicore Architecture REPLICA.
Proceedings of the Second International Symposium on Computing and Networking, 2014

Optimized Selection of Runtime Mode for the Reconfigurable PRAM-NUMA Architecture REPLICA Using Machine-Learning.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

A Quantitative Comparison of PRAM based Emulated Shared Memory Architectures to Current Multicore CPUs and GPUs.
Proceedings of the ARCS 2014, 2014

2013
Compiling for VLIW DSPs.
Proceedings of the Handbook of Signal Processing Systems, 2013

Extensible Recognition of Algorithmic Patterns in DSP Programs for Automatic Parallelization.
International Journal of Parallel Programming, 2013

Crown scheduling: Energy-efficient resource allocation, mapping and discrete frequency scaling for collections of malleable streaming tasks.
Proceedings of the 2013 23rd International Workshop on Power and Timing Modeling, 2013

Hardware and Software Support for NUMA Computing on Configurable Emulated Shared Memory Architectures.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

A Framework for Performance-Aware Composition of Applications for GPU-Based Systems.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Adaptive Implementation Selection in the SkePU Skeleton Programming Library.
Proceedings of the Advanced Parallel Processing Technologies, 2013

2012
Integrated Code Generation for Loops.
ACM Trans. Embedded Comput. Syst., 2012

Engineering Parallel Sorting for the Intel SCC.
Proceedings of the International Conference on Computational Science, 2012

Executing PRAM Programs on GPUs.
Proceedings of the International Conference on Computational Science, 2012

Optimized On-Chip-Pipelining for Memory-Intensive Computations on Multi-Core Processors with Explicit Memory Hierarchy.
J. UCS, 2012

Optimized composition of performance-aware parallel components.
Concurrency and Computation: Practice and Experience, 2012

Adaptive Off-Line Tuning for Optimized Composition of Components for Heterogeneous Many-Core Systems.
Proceedings of the High Performance Computing for Computational Science, 2012

Poster: Leveraging PEPPHER Technology for Performance Portable Supercomputing.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Leveraging PEPPHER Technology for Performance Portable Supercomputing.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

The PEPPHER Composition Tool: Performance-Aware Dynamic Composition of Applications for GPU-Based Systems.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Modelling Power Consumption of the Intel SCC.
Proceedings of the 6th Many-core Applications Research Community (MARC) Symposium, 2012

Design of the Language Replica for Hybrid PRAM-NUMA Many-core Architectures.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

Programmability and performance portability aspects of heterogeneous multi-/manycore systems.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

Flexible Scheduling and Thread Allocation for Synchronous Parallel Tasks.
Proceedings of the ARCS 2012 Workshops, 28. Februar - 2. März 2012, München, Germany, 2012

2011
PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems.
IEEE Micro, 2011

Programmiertechniken für den Cell-Prozessor (Programming Techniques for the Cell Processor).
it - Information Technology, 2011

Comparing Machine Learning Approaches for Context-Aware Composition.
Proceedings of the Software Composition - 10th International Conference, SC 2011, Zurich, 2011

Balancing CPU Load for Irregular MPI Applications.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Flexible Runtime Support for Efficient Skeleton Programming on Heterogeneous GPU-based Systems.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Investigation of main memory bandwidth on Intel Single-Chip Cloud Computer.
Proceedings of the 3rd Many-core Applications Research Community (MARC) Symposium, 2011

Case Study of Efficient Parallel Memory Access Programming for the Embedded Heterogeneous Multicore DSP Architecture ePUMA.
Proceedings of the International Conference on Complex, 2011

2010
Theory and Algorithms for Parallel Computation.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

Optimized On-Chip-Pipelined Mergesort on the Cell/B.E.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

Program Composition and Optimization: An Introduction.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

10191 Executive Summary - Program Composition and Optimization : Autotuning, Scheduling, Metaprogramming and Beyond.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

10191 Abstracts Collection - Program Composition and Optimization : Autotuning, Scheduling, Metaprogramming and Beyond.
Proceedings of the Program Composition and Optimization: Autotuning, Scheduling, Metaprogramming and Beyond, 09.05., 2010

Platform-independent modeling of explicitly parallel programs.
Proceedings of the ARCS '10, 2010

Compiling for VLIW DSPs.
Proceedings of the Handbook of Signal Processing Systems, 2010

2009
Message from the PDSEC-09 workshop chairs.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Integrated Modulo Scheduling for Clustered VLIW Architectures.
Proceedings of the High Performance Embedded Architectures and Compilers, 2009

2008
Automatic parallelization of simulation code for equation-based models with software pipelining and measurements on three platforms.
SIGARCH Computer Architecture News, 2008

Optimized on-chip pipelining of memory-intensive computations on the cell BE.
SIGARCH Computer Architecture News, 2008

Profile-Guided Composition.
Proceedings of the Software Composition, 7th International Symposium, 2008

Optimal vs. heuristic integrated code generation for clustered VLIW architectures.
Proceedings of the 11th International Workshop on Software and Compilers for Embedded Systems, 2008

Optimized Pipelined Parallel Merge Sort on the Cell BE.
Proceedings of the Euro-Par 2008 Workshops, 2008

Hybrid Parallel Sort on the Cell Processor.
Proceedings of the 9th Workshop on Parallel Systems and Algorithms (PASA) held at the 21st Conference on the Architecture of Computing Systems (ARCS), 2008

2007
Classification and generation of schedules for VLIW processors.
Concurrency and Computation: Practice and Experience, 2007

A Survey of Reasoning in Parallelization.
Proceedings of the 8th ACIS International Conference on Software Engineering, 2007

A Framework for Performance-Aware Composition of Explicitly Parallel Components.
Proceedings of the Parallel Computing: Architectures, 2007

A Formal Framework for Automated Round-Trip Software Engineering in Static Aspect Weaving and Transformations.
Proceedings of the 29th International Conference on Software Engineering (ICSE 2007), 2007

2006
Optimal integrated code generation for VLIW architectures.
Concurrency and Computation: Practice and Experience, 2006

NestStepModelica - Mathematical Modeling and Bulk-Synchronous Parallel Simulation.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Automated Round-trip Software Engineering in Aspect Weaving Systems.
Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering (ASE 2006), 2006

Crosscutting Concerns in Parallelization by Invasive Software Composition and Aspect Weaving.
Proceedings of the 39th Hawaii International Conference on Systems Science (HICSS-39 2006), 2006

Optimal Integrated VLIW Code Generation with Integer Linear Programming.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Load balancing of irregular parallel divide-and-conquer algorithms in group-SPMD programming environments.
Proceedings of the ARCS 2006, 2006

2005
05101 Abstracts Collection - Scheduling for Parallel Architectures: Theory, Applications, Challenges.
Proceedings of the Scheduling for Parallel Architectures: Theory, Applications, Challenges, 2005

05101 Executive Summary - Scheduling for Parallel Architectures: Theory, Applications, Challenges.
Proceedings of the Scheduling for Parallel Architectures: Theory, Applications, Challenges, 2005

Parallelisation of Sequential Programs by Invasive Composition and Aspect Weaving.
Proceedings of the Advanced Parallel Processing Technologies, 6th International Workshop, 2005

2004
Managing distributed shared arrays in a bulk-synchronous parallel programming environment.
Concurrency and Computation: Practice and Experience, 2004

A practical access to the theory of parallel algorithms.
Proceedings of the 35th SIGCSE Technical Symposium on Computer Science Education, 2004

Towards a Bulk-Synchronous Distributed Shared Memory Programming Environment for Grids.
Proceedings of the Applied Parallel Computing, 2004

Topic 10: Parallel Programming: Models, Methods and Programming Languages.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

Exploiting Symmetries for Optimal Integrated Code Generation.
Proceedings of the International Conference on Embedded Systems and Applications, 2004

2002
Optimal integrated code generation for clustered VLIW architectures.
Proceedings of the 2002 Joint Conference on Languages, 2002

Mid-term course evaluations with muddy cards.
Proceedings of the 7th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 2002

A dialog between authors and teachers.
Proceedings of the 7th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 2002

2001
A Dynamic Programming Approach to Optimal Integrated Code Generation.
Proceedings of The Workshop on Languages, 2001

Practical PRAM programming.
Wiley series on parallel and distributed computing, Wiley, 2001

2000
NestStep: Nested Parallelism and Virtual Shared Memory for the BSP Model.
The Journal of Supercomputing, 2000

Two program comprehension tools for automatic parallelization.
IEEE Concurrency, 2000

1999
The SPARAMAT Approach to Automatic Comprehension of Sparse Matrix Computations
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1999

NestStep: Nested Parallelism and Virtual Memory for the BSP Model.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

The SPARAMAT Approach to Automatic Comprehension of Sparse Matrix Computations.
Proceedings of the 7th International Workshop on Program Comprehension (IWPC '99), May 5-7, 1999, 1999

ForkLight: A Control-Synchronous Parallel Programming Language.
Proceedings of the High-Performance Computing and Networking, 7th International Conference, 1999

1998
ForkLight: A Control-Synchronous Parallel Programming Language
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1998

1997
Two Program Comprehension Tools for Automatic Parallelization: A Comparative Study
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1997

Practical PRAM Programming with Fork95 - A Tutorial
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1997

The Fork95 parallel programming language: Design, implementation, application.
International Journal of Parallel Programming, 1997

Language and library support for practical PRAM programming.
Proceedings of the Fifth Euromicro Workshop on Parallel and Distributed Processing (PDP '97), 1997

Applicability of Program Comprehension to Sparse Matrix Computations.
Proceedings of the Euro-Par '97 Parallel Processing, 1997

Language Support for Synchronous Parallel Critical Sections.
Proceedings of the 1997 Advances in Parallel and Distributed Computing Conference (APDC '97), 1997

1996
Scheduling Expression DAGs for Minimal Register Need
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1996

Pattern-Driven Automatic Parallelization.
Scientific Programming, 1996

A Library of Basic PRAM Algorithms and its Implementation in FORK.
Proceedings of the 8th Annual ACM Symposium on Parallel Algorithms and Architectures, 1996

Scheduling Expression DAGs for Minimal Register Need.
Proceedings of the Programming Languages: Implementations, 1996

Program comprehension engines for automatic parallelization: a comparative study.
Proceedings of the Software Engineering for Parallel and Distributed Systems, 1996

Parallel Fourier-Motzkin Elimination.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995
Language Support for Synchronous Parallel Critical Sections
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1995

Integrating Synchronous and Asynchronous Paradigms: The Fork95 Parallel Programming Language
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1995

Generating Optimal Contiguous Evaluations for Expression DAGs.
Comput. Lang., 1995

Pattern-driven automatic program transformation and parallelization.
Proceedings of the 3rd Euromicro Workshop on Parallel and Distributed Processing (PDP '95), 1995

Optimal Contiguous Expression DAG Evaluations.
Proceedings of the Fundamentals of Computation Theory, 10th International Symposium, 1995

1994
Integrating Scalable Parallel Libraries and Automatically Parallelizing Compilers
Universität Trier, Mathematik/Informatik, Forschungsbericht, 1994

Automatische Parallelisierung numerischer Programme durch Mustererkennung (Automatic Parallelization of Numerical Programs by Pattern Recognition).
PhD thesis, 1994

Knowledge-Based Automatic Parallelization by Pattern Recognition.
Proceedings of the Automatic Parallelization: New Approaches to Code Generation, 1994

1993
Efficient Register Allocation for Large Basic Blocks.
Proceedings of the Programming Language Implementation and Logic Programming, 1993

Automatic Parallelization by Pattern-Matching.
Proceedings of the Parallel Computation, 1993

1991
A Randomized Heuristic Approach to Register Allocation.
Proceedings of the Programming Language Implementation and Logic Programming, 1991

Scheduling Vector Straight Line Code on Vector Processors.
Proceedings of the Code Generation, 1991

