Nikolaos Bellas

ORCID: 0000-0002-9522-9136

According to our database, Nikolaos Bellas authored at least 70 papers between 1996 and 2024.

Bibliography

2024
Accelerating Machine Learning Inference on GPUs with SYCL.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

2023
Reconfigurable System-on-Chip Architectures for Robust Visual SLAM on Humanoid Robots.
ACM Trans. Embed. Comput. Syst., March, 2023

Low Power Hardware Architecture for Sampling-free Bayesian Neural Networks inference.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

2022
The Impact of CPU Voltage Margins on Power-Constrained Execution.
IEEE Trans. Sustain. Comput., 2022

Dynamic Management of CPU Resources Towards Energy Efficient and Profitable Datacentre Operation.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2022

FPGA Roofline modeling and its Application to Visual SLAM.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

FPGA Accelerators for Robust Visual SLAM on Humanoid Robots.
Proceedings of the FPGA '22: The 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Virtual Event, USA, 27 February 2022, 2022

2021
Artificial neural networks for online error detection.
CoRR, 2021

MapVisual: A Visualization Tool for Memory Access Patterns.
CoRR, 2021

Architectures for SLAM and Augmented Reality Computing.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

FPGA Architectures for Approximate Dense SLAM Computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Exploring the potential of context-aware dynamic CPU undervolting.
Proceedings of the CF '21: Computing Frontiers Conference, 2021

2020
Dynamic Undervolting to Improve Energy Efficiency on Multicore X86 CPUs.
IEEE Trans. Parallel Distributed Syst., 2020

Increasing the Profit of Cloud Providers through DRAM Operation at Reduced Margins.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019
Comparative Performance Analysis of Vulkan Implementations of Computational Applications.
Proceedings of the International Workshop on OpenCL, 2019

Exploiting CPU Voltage Margins to Increase the Profit of Cloud Infrastructure Providers.
Proceedings of the 19th IEEE/ACM International Symposium on Cluster, 2019

2018
Exploring the Effects of Code Optimizations on CPU Frequency Margins.
Proceedings of the High Performance Computing, 2018

A Framework for Evaluating Software on Reduced Margins Hardware.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018


2017
Significance-Aware Program Execution on Unreliable Hardware.
ACM Trans. Archit. Code Optim., 2017

Edge and Cloud Provider Cost Minimization by Exploiting Extended Voltage and Frequency Margins.
Proceedings of the Parallel Computing is Everywhere, 2017

A programming model and runtime system for approximation-aware heterogeneous computing.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

2016
OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures.
J. Signal Process. Syst., 2016

Exploiting Significance of Computations for Energy-Constrained Approximate Computing.
Int. J. Parallel Program., 2016

SoCLog: A real-time, automatically generated logging and profiling mechanism for FPGA-based Systems On Chip.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Towards automatic significance analysis for approximate computing.
Proceedings of the 2016 International Symposium on Code Generation and Optimization, 2016

2015
Enhancing Design Space Exploration by Extending CPU/GPU Specifications onto FPGAs.
ACM Trans. Embed. Comput. Syst., 2015

A programming model and runtime system for significance-aware energy-efficient computing.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Energy Minimization on Heterogeneous Systems through Approximate Computing.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Exploring Automatically Generated Platforms in High Performance FPGAs.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

A significance-driven programming framework for energy-constrained approximate computing.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

2014
Energy Efficiency through Significance-Based Computing.
Computer, 2014

A Grammar Induction Method for Clustering of Operations in Complex FPGA Designs.
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014

GemFI: A Fault Injection Tool for Studying the Behavior of Applications on Unreliable Substrates.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

On the characterization of OpenCL dwarfs on fixed and reconfigurable platforms.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

2013
On the Portability of the OpenCL Dwarfs on Fixed and Reconfigurable Parallel Platforms.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

2012
Shortening Design Time through Multiplatform Simulations with a Portable OpenCL Golden-model: The LDPC Decoder Case.
Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

2011
Implementation of the AVS video decoder on a heterogeneous dual-core SIMD processor.
IEEE Trans. Consumer Electron., 2011

AVS video decoder on multicore systems: Optimizations and tradeoffs.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Massively parallel programming models used as hardware description languages: The OpenCL case.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011

GLOpenCL: OpenCL support on hardware- and software-managed cache multicores.
Proceedings of the High Performance Embedded Architectures and Compilers, 2011

Implementation and Performance Analysis of SEAL Encryption on FPGA, GPU and Multi-core Processors.
Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011

Synthesis of Platform Architectures from OpenCL Programs.
Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011

Implementation and Performance Comparison of the Motion Compensation Kernel of the AVS Video Decoder on FPGA, GPU and Multicore Processors.
Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011

Significance driven computation on next-generation unreliable platforms.
Proceedings of the 48th Design Automation Conference, 2011

2010
Special issue on embedded vision.
Comput. Vis. Image Underst., 2010

Fisheye lens distortion correction on multicore and hardware accelerator platforms.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Mapping and optimization of the AVS video decoder on a high performance chip multiprocessor.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

2009
Implementation of a wide-angle lens distortion correction algorithm on the cell broadband engine.
Proceedings of the 23rd international conference on Supercomputing, 2009

Proteus: An architectural synthesis tool based on the stream programming paradigm.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Real-Time Fisheye Lens Distortion Correction Using Automatically Generated Streaming Accelerators.
Proceedings of the FCCM 2009, 2009

2008
Presynthesis Area Estimation of Reconfigurable Streaming Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2008

2007
Mapping streaming architectures on reconfigurable platforms.
SIGARCH Comput. Archit. News, 2007

An Architectural Framework for Automated Streaming Kernel Selection.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
FPGA implementation of a license plate recognition SoC using automatically generated streaming accelerators.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

An Image Processing Pipeline with Digital Compensation of Low Cost Optics for Mobile Telephony.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Pre-Synthesis Area Estimation of Reconfigurable Streaming Accelerators.
Proceedings of the 2006 International Conference on Field Programmable Logic and Applications (FPL), 2006

Pre-synthesis Queue Size Estimation of Streaming Data Flow Graphs.
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006), 2006

Template-Based Generation of Streaming Accelerators from a High Level Presentation.
Proceedings of the 14th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006), 2006

Reconfigurable Streaming Architectures for Embedded Smart Cameras.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006

2005
A Low-Power VLSI Architecture for Intra Prediction in H.264.
Proceedings of the Advances in Informatics, 2005

2003
A programmable, high performance vector array unit used for real-time motion estimation.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

2000
Architectural and compiler techniques for energy reduction in high-performance microprocessors.
IEEE Trans. Very Large Scale Integr. Syst., 2000

Using dynamic cache management techniques to reduce energy in general purpose processors.
IEEE Trans. Very Large Scale Integr. Syst., 2000

1999
Architectural and Compiler Techniques for Energy Reduction in High-Performance Microprocessors
PhD thesis, 1999

Using dynamic cache management techniques to reduce energy in a high-performance processor.
Proceedings of the 1999 International Symposium on Low Power Electronics and Design, 1999

An analytical, transistor-level energy model for SRAM-based caches.
Proceedings of the 1999 International Symposium on Circuits and Systems, ISCAS 1999, Orlando, Florida, USA, May 30, 1999

Energy and Performance Improvements in Microprocessor Design Using a Loop Cache.
Proceedings of the IEEE International Conference On Computer Design, 1999

1998
Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors.
Proceedings of the 1998 International Symposium on Low Power Electronics and Design, 1998

1996
Algorithm-Based Error Detection Schemes for Iterative Solution of Partial Differential Equations.
IEEE Trans. Computers, 1996

