Joseph B. Manzano

Fabrizio Ferrandi

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

MAPA: multi-accelerator pattern allocation policy for multi-tenant GPU servers.

Kiran Ranganath

Joshua D. Suetterlein

Shuaiwen Leon Song

Daniel Wong

Proceedings of the International Conference for High Performance Computing, 2021

LC-MEMENTO: A Memory Model for Accelerated Architectures.

Proceedings of the Languages and Compilers for Parallel Computing, 2021

Automated Generation of Integrated Digital and Spiking Neuromorphic Machine Learning Accelerators.

Serena Curzel

Nicolas Bohm Agostini

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Towards Automatic and Agile AI/ML Accelerator Design with End-to-End Synthesis.

Jeff Jun Zhang

Nicolas Bohm Agostini

Gu-Yeon Wei

David Brooks

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020

SODA: a New Synthesis Infrastructure for Agile Hardware Design of Machine Learning Accelerators.

Marco Minutoli

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

On the Marriage of Asynchronous Many Task Runtimes and Big Data: A Glance.

Proceedings of the 27th IEEE International Conference on High Performance Computing, 2020

Invited: Software Defined Accelerators From Learning Tools Environment.

Marco Minutoli

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019

A Parallel Graph Environment for Real-World Data Analytics Workflows.

Daniel G. Chavarría-Miranda

Maurizio Drocco

John Feo

Jesun Sahariar Firoz

Thejaka Amila Kanewala

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

2018

Characterization of the Impact of Soft Errors on Iterative Methods.

Sriram Krishnamoorthy

Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018

Comparative analysis of soft-error detection strategies: a case study with iterative methods.

Sriram Krishnamoorthy

Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

2017

Exploring performance and energy tradeoffs for irregular applications: A case study on the Tilera many-core architecture.

Ajay Panyala

Daniel G. Chavarría-Miranda

Mahantesh Halappanavar

J. Parallel Distributed Comput., 2017

User-transparent Distributed TensorFlow.

CoRR, 2017

Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes.

Proceedings of the Third International Workshop on Extreme Scale Programming Models and Middleware, 2017

Designing Scalable Distributed Memory Models: A Case Study.

Proceedings of the Computing Frontiers Conference, 2017

2016

Algorithm and Architecture Independent Benchmarking with SEAK.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Asynchronous Runtimes in Action: An Introspective Framework for a Next Gen Runtime.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Extending the Roofline Model for Asynchronous Many-Task Runtimes.

Joshua D. Suetterlein

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

Application characterization at scale: lessons learned from developing a distributed open community runtime system for high performance computing.

Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

2015

On the Impact of Execution Models: A Case Study in Computational Chemistry.

Mahantesh Halappanavar

Sriram Krishnamoorthy

Daniel G. Chavarría-Miranda

Abhinav Vishnu

Adolfy Hoisie

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Gregarious Data Re-structuring in a Many Core Architecture.

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Locality aware concurrent start for stencil applications.

Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

Optimizing irregular applications for energy and performance on the Tilera many-core architecture.

Ajay Panyala

Mahantesh Halappanavar

Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Power and performance trade-offs for Space Time Adaptive Processing.

Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015

2014

Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading.

Proceedings of the Languages and Compilers for Parallel Computing, 2014

ACDT: Architected Composite Data Types trading-in unfettered data access for improved execution.

Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

2012

Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer.

IEEE Trans. Parallel Distributed Syst., 2012

The Role of Non-strict Fine-grain Synchronization.

Juergen Ributzka

Guang R. Gao

Proceedings of the Transition of HPC Towards Exascale Computing, 2012

2011

OPELL and PM: A Case Study on Porting Shared Memory Programming Models to Accelerators Architectures.

Proceedings of the Languages and Compilers for Parallel Computing, 2011

The elephant and the mice: the role of non-strict fine-grain synchronization for modern many-core architectures.

Proceedings of the 25th International Conference on Supercomputing, Tucson, AZ, USA, May 31, 2011

2010

A Study of a Software Cache Implementation of the OpenMP Memory Model for Multicore and Manycore Architectures.

Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

2009

TL-DAE: Thread-Level Decoupled Access/Execution for OpenMP on the Cyclops-64 Many-Core Processor.

Ge Gan