Marco A. Z. Alves

Proceedings of the VIII Brazilian Symposium on Computing Systems Engineering, 2018

Freezing Time: A New Approach for Emulating Fast Storage Devices Using VM.

Proceedings of the 26th IEEE International Symposium on Modeling, 2018

An Elastic Multi-Core Allocation Mechanism for Database Systems.

Simone Dominico

Eduardo C. de Almeida

Jorge Augusto Meira

Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

Exploring IoT platform with technologically agnostic processing-in-memory framework.

Paulo Cesar Santos

João Paulo C. de Lima

Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications, 2018

HIPE: HMC instruction predication extension applied on database processing.

Diego G. Tomé

Paulo C. Santos

Eduardo C. de Almeida

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Processing in 3D memories to speed up operations on complex data structures.

Paulo C. Santos

Geraldo F. Oliveira

João Paulo C. de Lima

Antonio C. S. Beck

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Design space exploration for PIM architectures in 3D-stacked memories.

João Paulo C. de Lima

Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

2017

Affinity-Based Thread and Data Mapping in Shared Memory Systems.

Eduardo H. M. Cruz

Israel Koren

ACM Comput. Surv., 2017

Trace-Driven Extension for Noxim Simulator.

Ivan Luiz Pedroso Pires

Luiz Carlos Pessoa Albini

Proceedings of the VII Brazilian Symposium on Computing Systems Engineering, 2017

A generic processing in memory cycle accurate simulator under hybrid memory cube architecture.

Proceedings of the 2017 International Conference on Embedded Computer Systems: Architectures, 2017

Operand size reconfiguration for big data processing in memory.

Eduardo C. de Almeida

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Optimizing memory affinity with a hybrid compiler/OS approach.

Proceedings of the Computing Frontiers Conference, 2017

NIM: An HMC-Based Machine for Neuron Computation.

Proceedings of the Applied Reconfigurable Computing - 13th International Symposium, 2017

2016

Kernel-Based Thread and Data Mapping for Improved Memory Affinity.

Eduardo H. M. Cruz

Anselm Busse

Hans-Ulrich Heiss

IEEE Trans. Parallel Distributed Syst., 2016

A dynamic block-level execution profiler.

Francis B. Moreira

Israel Koren

Parallel Comput., 2016

LAPT: A locality-aware page table for thread and data mapping.

Laércio Lima Pilla

Parallel Comput., 2016

Exploring Cache Size and Core Count Tradeoffs in Systems with Reduced Memory Access Latency.

Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Communication in Shared Memory: Concepts, Definitions, and Efficient Detection.

Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Large vector extensions inside the HMC.

Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

2015

Reconfigurable Vector Extensions inside the DRAM.

Proceedings of the 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip, 2015

Opportunities and Challenges of Performing Vector Operations inside the DRAM.

Proceedings of the 2015 International Symposium on Memory Systems, 2015

HMC and DDR Performance Trade-offs.

Paulo C. Santos

Proceedings of the System Level Design from HW/SW to Memory for Embedded Systems, 2015

SiNUCA: A Validated Micro-Architecture Simulator.

Carlos Villavieja

Francis Birck Moreira

Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Locality and Balance for Communication-Aware Thread Mapping in Multicore Systems.

Mohammad S. Alhakeem

Hans-Ulrich Heiß

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

Saving memory movements through vector processing in the DRAM.

Proceedings of the 2015 International Conference on Compilers, 2015

2014

Dynamic thread mapping of shared memory applications by exploiting cache coherence protocols.

J. Parallel Distributed Comput., 2014

Profiling and Reducing Micro-Architecture Bottlenecks at the Hardware Level.

Francis B. Moreira

Israel Koren

Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

Optimizing Memory Locality Using a Locality-Aware Page Table.

Laércio Lima Pilla

Proceedings of the 26th IEEE International Symposium on Computer Architecture and High Performance Computing, 2014

2013

Energy Efficient Last Level Caches via Last Read/Write Prediction.

Carlos Villavieja

Proceedings of the 25th International Symposium on Computer Architecture and High Performance Computing, 2013

2012

Memory-aware Thread and Data Mapping for Hierarchical Multi-core Platforms.

Alexandre Carissimi

Christiane Pousa Ribeiro

Jean-François Méhaut

Int. J. Netw. Comput., 2012

Energy Savings via Dead Sub-Block Prediction.

Yale N. Patt

Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

2011

High Latency and Contention on Shared L2-Cache for Many-Core Architectures.

Henrique C. Freitas

Antonio Carlos Schneider Beck

Parallel Process. Lett., 2011

Boosting Parallel Applications Performance on Applying DIM Technique in a Multiprocessing Environment.

Mateus B. Rutzig

Int. J. Reconfigurable Comput., 2011

Using Memory Access Traces to Map Threads and Data on Hierarchical Multi-core Platforms.

Alexandre Carissimi

Christiane Pousa Ribeiro

Jean-François Méhaut

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010

Impact of Parallel Workloads on NoC Architecture Design.

Henrique Cota de Freitas

Lucas Mello Schnorr

Proceedings of the 18th Euromicro Conference on Parallel, 2010

TLP and ILP exploitation through a reconfigurable multiprocessor system.

Mateus B. Rutzig

Felipe Lopes Madruga

Antonio Carlos Schneider Beck

Henrique Cota de Freitas

Nicolas Maillard

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Evaluating Thread Placement Based on Memory Access Patterns for Multi-core Processors.

Felipe Lopes Madruga

Eduardo Rocha Rodrigues

Jörg Schneider

Hans-Ulrich Heiss

Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

2009

Performance Evaluation of NoC Architectures for Parallel Workloads.

Henrique C. Freitas

Lucas Mello Schnorr

Proceedings of the Third International Symposium on Networks-on-Chips, 2009

Design of Interleaved Multithreading for Network Processors on Chip.

Henrique C. Freitas

Felipe Lopes Madruga