Antoniu Pop

Mikel Luján

Dataset, August, 2024

LeanBin: Harnessing Lifting and Recompilation to Debloat Binaries.

Igor Wodiany

Mikel Luján

Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2024

2020

AfterOMPT: An OMPT-Based Tool for Fine-Grained Tracing of Tasks and Loops.

Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

2019

Low-Precision Neural Network Decoding of Polar Codes.

Igor Wodiany

Proceedings of the 20th IEEE International Workshop on Signal Processing Advances in Wireless Communications, 2019

2018

Type Information Elimination from Objects on Architectures with Tagged Pointers Support.

IEEE Trans. Computers, 2018

Leveraging Data-Flow Task Parallelism for Locality-Aware Dynamic Scheduling on Heterogeneous Platforms.

Osman Seckin Simsek

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Automated Analysis of Task-Parallel Execution Behavior Via Artificial Neural Networks.

Richard Neill

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

2017

Fuse: Accurate Multiplexing of Hardware Performance Counters Across Executions.

Richard Neill

ACM Trans. Archit. Code Optim., 2017

Accurate and Complete Hardware Profiling for OpenMP - Multiplexing Hardware Events Across Executions.

Richard Neill

Andrei Sergeevich Terechko

Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

MaxSim: A simulation platform for managed applications.

Proceedings of the 2017 IEEE International Symposium on Performance Analysis of Systems and Software, 2017

Paving the Way Towards a Highly Energy-Efficient and Highly Integrated Compute Node for the Exascale Revolution: The ExaNoDe Approach.

Proceedings of the Euromicro Conference on Digital System Design, 2017

2016

NUMA-aware scheduling and memory allocation for data-flow task-parallel applications.

Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

Language-Centric Performance Analysis of OpenMP Programs with Aftermath.

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Interactive visualization of cross-layer performance anomalies in dynamic task-parallel applications and systems.

Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

Scalable Task Parallelism for NUMA: A Uniform Abstraction for Coordinated Scheduling and Memory Management.

Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015

Effective Barrier Synchronization on Intel Xeon Phi Coprocessor.

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014

Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs.

ACM Trans. Archit. Code Optim., 2014

Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages.

ACM Trans. Archit. Code Optim., 2014

TERAFLUX: Harnessing dataflow in next generation teradevices.

Microprocess. Microsystems, 2014

Automatic Detection of Performance Anomalies in Task-Parallel Programs.

CoRR, 2014

Energy-aware parallelization flow and toolset for C code.

Alexandru Sutii

Proceedings of the 17th International Workshop on Software and Compilers for Embedded Systems, 2014

2013

OpenStream: Expressiveness and data-flow compilation of OpenMP streaming programs.

ACM Trans. Archit. Code Optim., 2013

OpenStream: a data-flow approach to solving the von Neumann bottlenecks.

Proceedings of the International Workshop on Software and Compilers for Embedded Systems, 2013

Correct and Efficient Bounded FIFO Queues.

Proceedings of the 25th International Symposium on Computer Architecture and High Performance Computing, 2013

Correct and efficient work-stealing for weak memory models.

Nhat Minh Lê

Andrei Sergeevich Terechko

Francesco Zappa Nardelli

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

The TERAFLUX Project: Exploiting the DataFlow Paradigm in Next Generation Teradevices.

Proceedings of the 2013 Euromicro Conference on Digital System Design, 2013

EU FP7-288307 Pharaon Project: Parallel and Heterogeneous Architecture for Real-Time Applications.

Miguel Glassee

Daniel Calvo

Eduardo de las Heras

Proceedings of the 2013 Euromicro Conference on Digital System Design, 2013

2012

Automatic Extraction of Coarse-Grained Data-Flow Threads from Imperative Programs.

Feng Li

IEEE Micro, 2012

2011

ACOTES Project: Advanced Compiler Technologies for Embedded Streaming.

Int. J. Parallel Program., 2011

A stream-computing extension to OpenMP.
