Ana Lucia Varbanescu

According to our database, Ana Lucia Varbanescu authored at least 76 papers between 2006 and 2018.

Bibliography

2018
HLS Support for Polymorphic Parallel Memories.
Proceedings of the IFIP/IEEE International Conference on Very Large Scale Integration, 2018

Building High-Performance, Easy-to-Use Polymorphic Parallel Memories with HLS.
Proceedings of the VLSI-SoC: Design and Engineering of Electronics Systems Based on New Computing Paradigms, 2018

A Beginner's Guide to Estimating and Improving Performance Portability.
Proceedings of the High Performance Computing, 2018

Mix-and-Match: A Model-Driven Runtime Optimisation Strategy for BFS on GPUs.
Proceedings of the 8th IEEE/ACM Workshop on Irregular Applications: Architectures and Algorithms, 2018

EXTRA: an open platform for reconfigurable architectures.
Proceedings of the 18th International Conference on Embedded Computer Systems: Architectures, 2018

MAX-PolyMem: High-Bandwidth Polymorphic Parallel Memories for DFEs.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Performance Estimation for Exascale Reconfigurable Dataflow Platforms.
Proceedings of the International Conference on Field-Programmable Technology, 2018

Performance Prediction for Large-Scale Heterogeneous Platforms.
Proceedings of the 26th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2018

Towards Application-Centric Parallel Memories.
Proceedings of the Euro-Par 2018: Parallel Processing Workshops, 2018

Exploring HPC and Big Data Convergence: A Graph Processing Study on Intel Knights Landing.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
Using Graph Properties to Speed-up GPU-based Graph Traversal: A Model-driven Approach.
CoRR, 2017

A Performance-centric Approach for Complex Decision Support.
Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering, 2017

A NoC-based custom FPGA configuration memory architecture for ultra-fast micro-reconfiguration.
Proceedings of the International Conference on Field Programmable Technology, 2017

2016
Workload Partitioning for Accelerating Applications on Heterogeneous Platforms.
IEEE Trans. Parallel Distrib. Syst., 2016

The landscape of GPGPU performance modeling tools.
Parallel Computing, 2016

Dynamic Load Balancing for High-Performance Graph Processing on Hybrid CPU-GPU Platforms.
Proceedings of the 6th Workshop on Irregular Applications: Architecture and Algorithms, 2016

EXTRA: Towards the exploitation of eXascale technology for reconfigurable architectures.
Proceedings of the 11th International Symposium on Reconfigurable Communication-centric Systems-on-Chip, 2016

A Tool for Bottleneck Analysis and Performance Prediction for GPU-Accelerated Applications.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Heterogeneous computing with accelerators: an overview with examples.
Proceedings of the 2016 Forum on Specification and Design Languages, 2016

Synthetic Graph Generation for Systematic Exploration of Graph Structural Properties.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Speed-Up Computational Finance Simulations with OpenCL on Intel Xeon Phi.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Towards the Next Generation of Large-Scale Network Archives.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

Using colored petri nets for GPGPU performance modeling.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Design and Experimental Evaluation of Distributed Heterogeneous Graph-Processing Systems.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

2015
Evaluating vector data type usage in OpenCL kernels.
Concurrency and Computation: Practice and Experience, 2015

Can Portability Improve Performance?: An Empirical Study of Parallel Graph Analytics.
Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015

Computing the Pseudo-Inverse of a Graph's Laplacian Using GPUs.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Matchmaking Applications and Partitioning Strategies for Efficient Execution on Heterogeneous Platforms.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Quantifying the Performance Impact of Graph Structure on Neighbour Iteration Strategies for PageRank.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

Towards Community Detection on Heterogeneous Platforms.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

FiNS: A Framework for Accelerating Nested Simulations on Heterogeneous Platforms.
Proceedings of the Euro-Par 2015: Parallel Processing Workshops, 2015

EXTRA: Towards an Efficient Open Platform for Reconfigurable High Performance Computing.
Proceedings of the 18th IEEE International Conference on Computational Science and Engineering, 2015

Fast packet forwarding engine based on software circuits.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Improving Application Performance by Efficiently Utilizing Heterogeneous Many-core Platforms.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

An Empirical Performance Evaluation of GPU-Enabled Graph-Processing Systems.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly.
TACO, 2014

Aristotle: A performance impact indicator for the OpenCL kernels using local memory.
Scientific Programming, 2014

COFFEE: an Optimizing Compiler for Finite Element Local Assembly.
CoRR, 2014

Benchmarking graph-processing platforms: a vision.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

Test-driving Intel Xeon Phi.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

Towards Benchmarking IaaS and PaaS Clouds for Graph Analytics.
Proceedings of the Big Data Benchmarking - 5th International Workshop, 2014

Parallel Computation of Non-Bonded Interactions in Drug Discovery: Nvidia GPUs vs. Intel Xeon Phi.
Proceedings of the International Work-Conference on Bioinformatics and Biomedical Engineering, 2014

How Well Do Graph-Processing Platforms Perform? An Empirical Performance Evaluation and Analysis.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Improving performance by matching imbalanced workloads with heterogeneous platforms.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Grover: Looking for Performance Improvement by Disabling Local Memory Usage in OpenCL Kernels.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Look before You Leap: Using the Right Hardware Resources to Accelerate Applications.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Optimizing a Calibration Software for Radio Astronomy.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

An Empirical Evaluation of GPGPU Performance Models.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

KMA: A Dynamic Memory Manager for OpenCL.
Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014

2013
An application-centric evaluation of OpenCL on multi-core CPUs.
Parallel Computing, 2013

An Empirical Study of Intel Xeon Phi.
CoRR, 2013

Performance Traps in OpenCL for CPUs.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

ELMO: A User-Friendly API to Enable Local Memory in OpenCL Kernels.
Proceedings of the 21st Euromicro International Conference on Parallel, 2013

Topic 9: Parallel and Distributed Programming - (Introduction).
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Glinda: a framework for accelerating imbalanced applications on heterogeneous platforms.
Proceedings of the Computing Frontiers Conference, 2013

Sesame: A User-Transparent Optimizing Framework for Many-Core Processors.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Parallel application characterization with quantitative metrics.
Concurrency and Computation: Practice and Experience, 2012

Radio Astronomy Beam Forming on Many-Core Architectures.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Performance Gaps between OpenMP and OpenCL for Multi-core CPUs.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Accelerating Cost Aggregation for Real-Time Stereo Matching.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

2011
Towards an Effective Unified Programming Model for Many-Cores.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

OCL-BodyScan: A Case Study for Application-centric Programming of Many-Core Processors.
Proceedings of the International Conference on Parallel Processing, 2011

A Comprehensive Performance Comparison of CUDA and OpenCL.
Proceedings of the International Conference on Parallel Processing, 2011

An Auto-tuning Solution to Data Streams Clustering in OpenCL.
Proceedings of the 14th IEEE International Conference on Computational Science and Engineering, 2011

2010
On the effective parallel programming of multi-core processors.
PhD thesis, 2010

Performance Impact of Task Mapping on the Cell BE Multicore Processor.
Proceedings of the Computer Architecture, 2010

2009
Building high-resolution sky images using the Cell/B.E.
Scientific Programming, 2009

Evaluating application mapping scenarios on the Cell/B.E.
Concurrency and Computation: Practice and Experience, 2009

Introduction to Mastering Cell BE and GPU Execution Platforms.
Proceedings of the Embedded Computer Systems: Architectures, 2009

Evaluating multi-core platforms for HPC data-intensive kernels.
Proceedings of the 6th Conference on Computing Frontiers, 2009

2008
Radioastronomy Image Synthesis on the Cell/B.E.
Proceedings of the Euro-Par 2008, 2008

2007
Multicore Surprises: Lessons Learned from Optimizing Sweep3D on the Cell Broadband Engine.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

An Effective Strategy for Porting C++ Applications on Cell.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Digital Media Indexing on the Cell Processor.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

2006
SP@CE - An SP-Based Programming Model for Consumer Electronics Streaming Applications.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

PAM-SoC: A Toolchain for Predicting MPSoC Performance.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

