Lukas Sommer

Orcid: 0000-0003-1918-3911

According to our database, Lukas Sommer authored at least 31 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
On Demand Specialization of SYCL Kernels with Specialization Constant Length Allocations (SCLA).
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

Experiences Building an MLIR-Based SYCL Compiler.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024

2023
User-driven Online Kernel Fusion for SYCL.
ACM Trans. Archit. Code Optim., June, 2023

Leveraging MLIR for Better SYCL Compilation (Poster).
Proceedings of the 2023 International Workshop on OpenCL, 2023

Technical Talk: A SYCL Extension for User-Driven Online Kernel Fusion.
Proceedings of the 2023 International Workshop on OpenCL, 2023

2022
Exploiting High-Bandwidth Memory for FPGA-Acceleration of Inference on Sum-Product Networks.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

SPNC: An Open-Source MLIR-Based Compiler for Fast Sum-Product Network Inference on CPUs and GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

2021
Programming Heterogeneous Systems with General and Domain-Specific Frameworks.
PhD thesis, 2021

The TaPaSCo Open-Source Toolflow.
J. Signal Process. Syst., 2021

Efficient Operator Sharing Modulo Scheduling for Sum-Product Network Inference on FPGAs.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2021

SPNC: Fast Sum-Product Network Inference.
Proceedings of the Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021

A Framework for the Automatic Generation of FPGA-based Near-Data Processing Accelerators in Smart Storage Systems.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Optimizing a Hardware Network Stack to Realize an In-Network ML Inference Application.
Proceedings of the IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2021

SPNC: Accelerating Sum-Product Network Inference on CPUs and GPUs.
Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020
Improving Job Launch Rates in the TaPaSCo FPGA Middleware by Hardware/Software-Co-Design.
Proceedings of the 2020 IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers, 2020

Using Parallel Programming Models for Automotive Workloads on Heterogeneous Systems - a Case Study.
Proceedings of the 28th Euromicro International Conference on Parallel, 2020

OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure.
Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

Comparison of Arithmetic Number Formats for Inference in Sum-Product Networks on FPGAs.
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020

OpenMP Device Offloading for Embedded Heterogeneous Platforms - Work-in-Progress.
Proceedings of the 20th International Conference on Embedded Software, 2020

Extending High-Level Synthesis with High-Performance Computing Performance Visualization.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

2019
Exact and Practical Modulo Scheduling for High-Level Synthesis.
ACM Trans. Reconfigurable Technol. Syst., 2019

SpExSim: assessing kernel suitability for C-based high-level hardware synthesis.
J. Supercomput., 2019

High-Throughput Multi-Threaded Sum-Product Network Inference in the Reconfigurable Cloud.
Proceedings of the 2019 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2019

Resource-Efficient Logarithmic Number Scale Arithmetic for SPN Inference on FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2019

SkyCastle: A Resource-Aware Multi-Loop Scheduler for High-Level Synthesis.
Proceedings of the International Conference on Field-Programmable Technology, 2019

DAPHNE - An automotive benchmark suite for parallel programming models on embedded heterogeneous platforms: work-in-progress.
Proceedings of the International Conference on Embedded Software Companion, 2019

Extending LLVM for Lightweight SPMD Vectorization: Using SIMD and Vector Instructions Easily from Any Language.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018
Automatic Mapping of the Sum-Product Network Inference Problem to FPGA-Based Accelerators.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Dependence Graph Preprocessing for Faster Exact Modulo Scheduling in High-Level Synthesis.
Proceedings of the 28th International Conference on Field Programmable Logic and Applications, 2018

2017
Synthesis of interleaved multithreaded accelerators from OpenMP loops.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2017

OpenMP device offloading to FPGA accelerators.
Proceedings of the 28th IEEE International Conference on Application-specific Systems, 2017

