Mehdi Goli

Orcid: 0000-0002-2774-0821

Affiliations:
  • Codeplay Software Ltd., Edinburgh, UK
  • University of the West of Scotland, Paisley, UK (former)
  • Robert Gordon University, Aberdeen, UK (PhD 2015)


According to our database, Mehdi Goli authored at least 34 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
On Demand Specialization of SYCL Kernels with Specialization Constant Length Allocations (SCLA).
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

Experiences Building an MLIR-Based SYCL Compiler.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024

2023
User-driven Online Kernel Fusion for SYCL.
ACM Trans. Archit. Code Optim., June, 2023

Segmentation of Drone Collision Hazards in Airborne RADAR Point Clouds Using PointNet.
CoRR, 2023

Improving performance of SYCL applications on CPU architectures using LLVM-directed compilation flow.
Concurr. Comput. Pract. Exp., 2023

Accelerating Neural Networks Using Open Standard Software on RISC-V.
Proceedings of the High Performance Computing, 2023

Technical Talk: A SYCL Extension for User-Driven Online Kernel Fusion.
Proceedings of the 2023 International Workshop on OpenCL, 2023

Porting SYCL accelerated neural network frameworks to edge devices.
Proceedings of the 2023 International Workshop on OpenCL, 2023

Building a Reusable and Extensible Automatic Compiler Infrastructure for Reconfigurable Devices.
Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

A Performance Analysis of Leading Many-Core Technologies for Cellular Automata Execution.
Proceedings of the Euro-Par 2023: Parallel Processing Workshops - Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28, 2023

2022
Towards performance portability of AI graphs using SYCL.
Proceedings of the IEEE/ACM International Workshop on Performance, 2022

Towards performance portability of AI models using SYCL-DNN.
Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

Benchmarking a Proof-of-Concept Performance Portable SYCL-based Fast Fourier Transformation Library.
Proceedings of the IWOCL'22: International Workshop on OpenCL, Bristol, United Kingdom, May 10, 2022

2021
Performance portability through machine learning guided kernel selection in SYCL libraries.
Parallel Comput., 2021

Achieving Near-Native Runtime Performance and Cross-Platform Performance Portability for Random Number Generation Through SYCL Interoperability.
Proceedings of the Accelerator Programming Using Directives - 8th International Workshop, 2021

oneAPI Open-Source Math Library Interface.
Proceedings of the International Workshop on Performance, 2021

Toward Performance Portability of Highly Parametrizable TRSM Algorithm Using SYCL.
Proceedings of the IWOCL'21: International Workshop on OpenCL, Munich, Germany, April 2021

2020
Programming Heterogeneous Parallel Machines Using Refactoring and Monte-Carlo Tree Search.
Int. J. Parallel Program., 2020

Towards Cross-Platform Performance Portability of DNN Models using SYCL.
Proceedings of the IEEE/ACM International Workshop on Performance, 2020

2019
Cross-Platform Performance Portability Using Highly Parametrized SYCL Kernels.
CoRR, 2019

2018
Formalised Composition and Interaction for Heterogeneous Structured Parallelism.
Int. J. Parallel Program., 2018

TensorFlow Acceleration on ARM Hikey Board.
Proceedings of the International Workshop on OpenCL, 2018

2017
Autonomic Coordination of Skeleton-Based Applications Over CPU/GPU Multi-Core Architectures.
Int. J. Parallel Program., 2017

SYCL-BLAS: Combining Expression Trees and Kernel Fusion on Heterogeneous Systems.
Proceedings of the Parallel Computing is Everywhere, 2017

Accelerated Machine Learning Using TensorFlow and SYCL on OpenCL Devices.
Proceedings of the 5th International Workshop on OpenCL, 2017

SYCL-BLAS: Leveraging Expression Trees for Linear Algebra.
Proceedings of the 5th International Workshop on OpenCL, 2017

2016
VisionCPP: A SYCL-based Computer Vision Framework.
Proceedings of the 4th International Workshop on OpenCL, 2016

2015
Autonomic behavioural framework for structural parallelism over heterogeneous multi-core systems.
PhD thesis, 2015

2014
Parallel patterns for heterogeneous CPU/GPU architectures: Structured parallelism from cluster to cloud.
Future Gener. Comput. Syst., 2014

N-body computations using skeletal frameworks on multicore CPU/graphics processing unit architectures: an empirical performance evaluation.
Concurr. Comput. Pract. Exp., 2014

2013
Heterogeneous Algorithmic Skeletons for Fast Flow with Seamless Coordination over Hybrid Architectures.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

Mapping parallel programs to heterogeneous CPU/GPU architectures using a Monte Carlo Tree Search.
Proceedings of the IEEE Congress on Evolutionary Computation, 2013

2012
A new vertical fragmentation algorithm based on ant collective behavior in distributed database systems.
Knowl. Inf. Syst., 2012

Streaming Dynamic Coarse-Grained CPU/GPU Workloads with Heterogeneous Pipelines in FastFlow.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

