Alexandre E. Eichenberger

Proceedings of the 27th International Conference on Extending Database Technology, 2024

2023

Serving Deep Learning Model in Relational Databases.

CoRR, 2023

2021

Intelligent Adaptation of Hardware Knobs for Improving Performance and Power Consumption.

Pradip Bose

Miquel Moretó

IEEE Trans. Computers, 2021

2020

Hybrid CPU/GPU tasks optimized for concurrency in OpenMP.

Alexey Bataev

Leopold Grinberg

John K. O'Brien

IBM J. Res. Dev., 2020

An open-source solution to performance portability for Summit and Sierra supercomputers.

Alexey Bataev

John K. O'Brien

IBM J. Res. Dev., 2020

Compiling ONNX Neural Network Models Using MLIR.

Tung D. Le

Tong Chen

CoRR, 2020

Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions.

Tian Jin

Zhun Liu

Shengjia Yan

Louis-Philippe Morency

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

POSTER: CogR: Exploiting Program Structures for Machine-Learning Based Runtime Solutions.

Hyojin Sung

Tong Chen

Kevin K. O'Brien

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2017

Implementing implicit OpenMP data sharing on GPUs.

Proceedings of the Fourth Workshop on the LLVM Compiler Infrastructure in HPC, 2017

libPRISM: an intelligent adaptation of prefetch and SMT levels.

Pradip Bose

Proceedings of the International Conference on Supercomputing, 2017

Efficient Fork-Join on GPUs Through Warp Specialization.

Arpith Chacko Jacob

Hyojin Sung

Samuel F. Antão

Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

2016

Performance Analysis and Optimization of Clang's OpenMP 4.5 GPU Support.

Proceedings of the 7th International Workshop on Performance Modeling, 2016

Offloading Support for OpenMP in Clang and LLVM.

Samuel F. Antão

Alexey Bataev

Proceedings of the Third Workshop on the LLVM Compiler Infrastructure in HPC, 2016

Early Experiences Porting Three Applications to OpenMP 4.5.

Bronis R. de Supinski

Erik W. Draeger

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

A Proposal to OpenMP for Addressing the CPU Oversubscription Challenge.

Yonghong Yan

Jeff R. Hammond

Chunhua Liao

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

2015

Integrating GPU support for OpenMP offloading directives into Clang.

Samuel Antão

Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, 2015

Performance analysis of OpenMP on a GPU using a CORAL proxy application.

Samuel F. Antão

Proceedings of the 6th International Workshop on Performance Modeling, 2015

Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach.

Ravi Nair

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Towards Task-Parallel Reductions in OpenMP.

Bronis R. de Supinski

Stephen Olivier

Kelvin Li

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

2014

Coordinating GPU threads for OpenMP 4.0 in LLVM.

Samuel Antão

Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, 2014

Author retrospective for optimum modulo schedules for minimum register requirements.

Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

2013

Experimenting with low-overhead OpenMP runtime on IBM Blue Gene/Q.

IBM J. Res. Dev., 2013

OMPT: An OpenMP Tools Application Programming Interface for Performance Analysis.

John M. Mellor-Crummey

Proceedings of the OpenMP in the Era of Low Power Devices and Accelerators, 2013

2012

The Design of OpenMP Thread Affinity.

Lakshminarayanan Renganarayanan

Christian Terboven

Michael Wong

Dieter an Mey

Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

2010

Automatic creation of tile size selection models.

Tomofumi Yuki

Sanjay V. Rajopadhye

Charles Anderson

Lakshminarayanan Renganarayanan

Proceedings of the CGO 2010, 2010

2009

Compact multi-dimensional kernel extraction for register tiling.

Uday Bondhugula

Salem Derisavi

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Exploiting Parallelism with Dependence-Aware Scheduling.

Xiaotong Zhuang

Yangchun Luo

Kathryn M. O'Brien

Proceedings of the PACT 2009, 2009

2008

Hybrid access-specific software cache techniques for the cell BE architecture.

Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2006

Using advanced compiler technology to exploit the performance of the Cell Broadband Engine™ architecture.

IBM Syst. J., 2006

2005

An integrated simdization framework using virtual vectors.

Peng Wu

Amy Wang

Peng Zhao

Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Efficient SIMD Code Generation for Runtime Alignment and Length Conversion.

Peng Wu

Amy Wang

Proceedings of the 3rd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2005), 2005

Optimizing Compiler for the CELL Processor.

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques (PACT 2005), 2005

2004

Vectorization for SIMD architectures with alignment constraints.

Peng Wu

Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation, 2004

2002

An Experimental Study of Algorithms for Weighted Completion Time Scheduling.

Algorithmica, 2002

2001

Scheduling Superblocks with Bound-Based Branch Trade-Offs.

IEEE Trans. Computers, 2001

2000

An integrated approach to accelerate data and predicate computations in hyperblocks.

Suman Maradani

Proceedings of the 33rd Annual IEEE/ACM International Symposium on Microarchitecture, 2000

Lower Bounds on Precedence-Constrained Scheduling for Parallel Processors.

Proceedings of the 2000 International Conference on Parallel Processing, 2000

1999

Algorithms for Total Weighted Completion Time Scheduling.

Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1999

Balance Scheduling: Weighting Branch Tradeoffs in Superblocks.

Proceedings of the 32nd Annual IEEE/ACM International Symposium on Microarchitecture, 1999

1998

Effective Cluster Assignment for Modulo Scheduling.

Erik Nystrom

Proceedings of the 31st Annual IEEE/ACM International Symposium on Microarchitecture, 1998

Efficient Edge Profiling for ILP-Processors.

S. M. Lobo

Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

1997

Efficient Formulation for Optimal Modulo Schedulers.

Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (PLDI), 1997

1996

Modulo scheduling, machine representations, and register-sensitive algorithms.

PhD thesis, 1996

Minimizing Register Requirements of a Modulo Schedule via Optimum Stage Scheduling.

Int. J. Parallel Program., 1996

A Reduced Multipipeline Machine Description that Preserves Scheduling Constraints.

Proceedings of the ACM SIGPLAN'96 Conference on Programming Language Design and Implementation (PLDI), 1996

1995

Stage scheduling: a technique to reduce the register requirements of a modulo schedule.

Proceedings of the 28th Annual International Symposium on Microarchitecture, Ann Arbor, Michigan, USA, November 29, 1995

Optimum Modulo Schedules for Minimum Register Requirements.

Proceedings of the 9th international conference on Supercomputing, 1995

Impact of Load Imbalance on the Design of Software Barriers.

Proceedings of the 1995 International Conference on Parallel Processing, 1995

Modeling load imbalance and fuzzy barriers for scalable shared-memory multiprocessors.

Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS-28), 1995

1994

Minimum register requirements for a modulo schedule.
