Ana Lucia Varbanescu

ORCID: 0000-0002-4932-1900

Affiliations:
  • University of Twente, Enschede, The Netherlands
  • Delft University of Technology, The Netherlands


According to our database, Ana Lucia Varbanescu authored at least 106 papers between 2006 and 2024.

Bibliography

2024
Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies.
CoRR, 2024

Lessons Learned Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame.
CoRR, 2024

Migrating CUDA to SYCL: A HEP Case Study with ROOT RDataFrame.
Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

2023
Finding Morton-Like Layouts for Multi-Dimensional Arrays Using Evolutionary Algorithms.
CoRR, 2023

Reduced Simulations for High-Energy Physics, a Middle Ground for Data-Driven Physics Research.
CoRR, 2023

Graph-Optimizer: Towards Predictable Large-Scale Graph Processing Workloads.
Proceedings of the Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, 2023

Systematically Exploring High-Performance Representations of Vector Fields Through Compile-Time Composition.
Proceedings of the 2023 ACM/SPEC International Conference on Performance Engineering, 2023

Graph Greenifier: Towards Sustainable and Energy-Aware Massive Graph Processing in the Computing Continuum.
Proceedings of the Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, 2023

Estimating the Energy Consumption of Applications in the Computing Continuum with iFogSim.
Proceedings of the High Performance Computing, 2023

Performance Engineering for Graduate Students: a View from Amsterdam.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Analyzing Digital Services Across the Compute Continuum Using iFogSim.
Proceedings of the 29th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, 2023

MassiveClicks: A Massively-Parallel Framework for Efficient Click Models Training.
Proceedings of the Euro-Par 2023: Parallel Processing Workshops - Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28, 2023

The Graph-Massivizer Approach Toward a European Sustainable Data Center Digital Twin.
Proceedings of the 47th IEEE Annual Computers, Software, and Applications Conference, 2023

2022
Future Computer Systems and Networking Research in the Netherlands: A Manifesto.
CoRR, 2022

ParClick: A Scalable Algorithm for EM-based Click Models.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

Isolating GPU Architectural Features Using Parallelism-Aware Microbenchmarks.
Proceedings of the ICPE '22: ACM/SPEC International Conference on Performance Engineering, Beijing, China, April 9, 2022

The Cost of Reinforcement Learning for Game Engines: The AZ-Hive Case-study.
Proceedings of the ICPE '22: ACM/SPEC International Conference on Performance Engineering, Beijing, China, April 9, 2022

Building a Fine-Grained Analytical Performance Model for Complex Scientific Simulations.
Proceedings of the Parallel Processing and Applied Mathematics, 2022

Modelling Performance Loss due to Thread Imbalance in Stochastic Variable-Length SIMT Workloads.
Proceedings of the 30th International Symposium on Modeling, 2022

Design-Space Exploration for Decision-Support Software.
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 2022

Heterogeneous GPU and FPGA computing: a VexCL case-study.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Efficient trimming for strongly connected components calculation.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

2021
Analytical Performance Estimation for Large-Scale Reconfigurable Dataflow Platforms.
ACM Trans. Reconfigurable Technol. Syst., 2021

The future is big graphs: a community view on graph processing systems.
Commun. ACM, 2021

Mimicking the Human Approach in the Game of Hive.
Proceedings of the IEEE Symposium Series on Computational Intelligence, 2021

2020
Designing and building application-centric parallel memories.
Concurr. Comput. Pract. Exp., 2020

A Sampling-Based Tool for Scaling Graph Datasets.
Proceedings of the ICPE '20: ACM/SPEC International Conference on Performance Engineering, 2020

DDLBench: Towards a Scalable Benchmarking Infrastructure for Distributed Deep Learning.
Proceedings of the Fourth IEEE/ACM Workshop on Deep Learning on Supercomputers, 2020

μ-Genie: A Framework for Memory-Aware Spatial Processor Architecture Co-Design Exploration.
Proceedings of the 23rd Euromicro Conference on Digital System Design, 2020

2019
Scalability model for the LOFAR direction independent pipeline.
Astron. Comput., 2019

2018
HLS Support for Polymorphic Parallel Memories.
Proceedings of the IFIP/IEEE International Conference on Very Large Scale Integration, 2018

Building High-Performance, Easy-to-Use Polymorphic Parallel Memories with HLS.
Proceedings of the VLSI-SoC: Design and Engineering of Electronics Systems Based on New Computing Paradigms, 2018

A Beginner's Guide to Estimating and Improving Performance Portability.
Proceedings of the High Performance Computing, 2018

Mix-and-Match: A Model-Driven Runtime Optimisation Strategy for BFS on GPUs.
Proceedings of the 8th IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms, 2018

EXTRA: an open platform for reconfigurable architectures.
Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

MAX-PolyMem: High-Bandwidth Polymorphic Parallel Memories for DFEs.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Performance Estimation for Exascale Reconfigurable Dataflow Platforms.
Proceedings of the International Conference on Field-Programmable Technology, 2018

Performance Prediction for Large-Scale Heterogeneous Platforms.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

Towards Application-Centric Parallel Memories.
Proceedings of the Euro-Par 2018: Parallel Processing Workshops, 2018

Exploring HPC and Big Data Convergence: A Graph Processing Study on Intel Knights Landing.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
Using Graph Properties to Speed-up GPU-based Graph Traversal: A Model-driven Approach.
CoRR, 2017

A Performance-centric Approach for Complex Decision Support.
Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, 2017

A NoC-based custom FPGA configuration memory architecture for ultra-fast micro-reconfiguration.
Proceedings of the International Conference on Field Programmable Technology, 2017

2016
Workload Partitioning for Accelerating Applications on Heterogeneous Platforms.
IEEE Trans. Parallel Distributed Syst., 2016

The landscape of GPGPU performance modeling tools.
Parallel Comput., 2016

Dynamic Load Balancing for High-Performance Graph Processing on Hybrid CPU-GPU Platforms.
Proceedings of the 6th Workshop on Irregular Applications: Architecture and Algorithms, 2016

EXTRA: Towards the exploitation of eXascale technology for reconfigurable architectures.
Proceedings of the 11th International Symposium on Reconfigurable Communication-centric Systems-on-Chip, 2016

A Tool for Bottleneck Analysis and Performance Prediction for GPU-Accelerated Applications.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Heterogeneous computing with accelerators: an overview with examples.
Proceedings of the 2016 Forum on Specification and Design Languages, 2016

Synthetic Graph Generation for Systematic Exploration of Graph Structural Properties.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Speed-Up Computational Finance Simulations with OpenCL on Intel Xeon Phi.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Towards the Next Generation of Large-Scale Network Archives.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Using colored petri nets for GPGPU performance modeling.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Design and Experimental Evaluation of Distributed Heterogeneous Graph-Processing Systems.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

2015
Evaluating vector data type usage in OpenCL kernels.
Concurr. Comput. Pract. Exp., 2015

Can Portability Improve Performance?: An Empirical Study of Parallel Graph Analytics.
Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015

Computing the Pseudo-Inverse of a Graph's Laplacian Using GPUs.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Matchmaking Applications and Partitioning Strategies for Efficient Execution on Heterogeneous Platforms.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Quantifying the Performance Impact of Graph Structure on Neighbour Iteration Strategies for PageRank.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

Towards Community Detection on Heterogeneous Platforms.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

FiNS: A Framework for Accelerating Nested Simulations on Heterogeneous Platforms.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

EXTRA: Towards an Efficient Open Platform for Reconfigurable High Performance Computing.
Proceedings of the 18th IEEE International Conference on Computational Science and Engineering, 2015

Fast packet forwarding engine based on software circuits.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Improving Application Performance by Efficiently Utilizing Heterogeneous Many-core Platforms.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly.
ACM Trans. Archit. Code Optim., 2014

Aristotle: A performance impact indicator for the OpenCL kernels using local memory.
Sci. Program., 2014

COFFEE: an Optimizing Compiler for Finite Element Local Assembly.
CoRR, 2014

Benchmarking graph-processing platforms: a vision.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

Test-driving Intel Xeon Phi.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics.
Proceedings of the Big Data Benchmarking - 5th International Workshop, 2014

Parallel Computation of Non-Bonded Interactions in Drug Discovery: Nvidia GPUs vs. Intel Xeon Phi.
Proceedings of the International Work-Conference on Bioinformatics and Biomedical Engineering, 2014

How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Improving performance by matching imbalanced workloads with heterogeneous platforms.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Grover: Looking for Performance Improvement by Disabling Local Memory Usage in OpenCL Kernels.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Look before You Leap: Using the Right Hardware Resources to Accelerate Applications.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Optimizing a Calibration Software for Radio Astronomy.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

An Empirical Evaluation of GPGPU Performance Models.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

KMA: A Dynamic Memory Manager for OpenCL.
Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

2013
An application-centric evaluation of OpenCL on multi-core CPUs.
Parallel Comput., 2013

An Empirical Study of Intel Xeon Phi.
CoRR, 2013

Performance Traps in OpenCL for CPUs.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

ELMO: A User-Friendly API to Enable Local Memory in OpenCL Kernels.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

Topic 9: Parallel and Distributed Programming - (Introduction).
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Glinda: a framework for accelerating imbalanced applications on heterogeneous platforms.
Proceedings of the Computing Frontiers Conference, 2013

Sesame: A User-Transparent Optimizing Framework for Many-Core Processors.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Parallel application characterization with quantitative metrics.
Concurr. Comput. Pract. Exp., 2012

Radio Astronomy Beam Forming on Many-Core Architectures.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Performance Gaps between OpenMP and OpenCL for Multi-core CPUs.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Accelerating Cost Aggregation for Real-Time Stereo Matching.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

2011
Towards an Effective Unified Programming Model for Many-Cores.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

OCL-BodyScan: A Case Study for Application-centric Programming of Many-Core Processors.
Proceedings of the International Conference on Parallel Processing, 2011

A Comprehensive Performance Comparison of CUDA and OpenCL.
Proceedings of the International Conference on Parallel Processing, 2011

An Auto-tuning Solution to Data Streams Clustering in OpenCL.
Proceedings of the 14th IEEE International Conference on Computational Science and Engineering, 2011

2010
On the effective parallel programming of multi-core processors.
PhD thesis, 2010

Performance Impact of Task Mapping on the Cell BE Multicore Processor.
Proceedings of the Computer Architecture, 2010

2009
Building high-resolution sky images using the Cell/B.E.
Sci. Program., 2009

Evaluating application mapping scenarios on the Cell/B.E.
Concurr. Comput. Pract. Exp., 2009

Introduction to Mastering Cell BE and GPU Execution Platforms.
Proceedings of the Embedded Computer Systems: Architectures, 2009

Evaluating multi-core platforms for HPC data-intensive kernels.
Proceedings of the 6th Conference on Computing Frontiers, 2009

2008
Radioastronomy Image Synthesis on the Cell/B.E.
Proceedings of the Euro-Par 2008, 2008

2007
Multicore Surprises: Lessons Learned from Optimizing Sweep3D on the Cell Broadband Engine.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

An Effective Strategy for Porting C++ Applications on Cell.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Digital Media Indexing on the Cell Processor.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

2006
SP@CE - An SP-Based Programming Model for Consumer Electronics Streaming Applications.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

PAM-SoC: A Toolchain for Predicting MPSoC Performance.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006
