Maurício Breternitz

Orcid: 0000-0003-1752-6255

Affiliations:
  • Advanced Micro Devices


According to our database, Maurício Breternitz authored at least 49 papers between 1988 and 2023.

Bibliography

2023
ULEEN: A Novel Architecture for Ultra-low-energy Edge Neural Networks.
ACM Trans. Archit. Code Optim., December, 2023

A conditional branch predictor based on weightless neural networks.
Neurocomputing, October, 2023

Dendrite-inspired Computing to Improve Resilience of Neural Networks to Faults in Emerging Memory Technologies.
Proceedings of the IEEE International Conference on Rebooting Computing, 2023

An FPGA-Based Weightless Neural Network for Edge Network Intrusion Detection.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

COIN: Combinational Intelligent Networks.
Proceedings of the 34th IEEE International Conference on Application-specific Systems, 2023

2022
A WiSARD-based conditional branch predictor.
Proceedings of the 30th European Symposium on Artificial Neural Networks, 2022


Distributive Thermometer: A New Unary Encoding for Weightless Neural Networks.
Proceedings of the 30th European Symposium on Artificial Neural Networks, 2022

LogicWiSARD: Memoryless Synthesis of Weightless Neural Networks.
Proceedings of the 33rd IEEE International Conference on Application-specific Systems, 2022

Weightless Neural Networks for Efficient Edge Inference.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
Efficiency and scalability of multi-lane capsule networks (MLCN).
J. Parallel Distributed Comput., 2021

Smart selection of optimizations in dynamic compilers.
Concurr. Comput. Pract. Exp., 2021

2020
Weightless Neural Networks as Memory Segmented Bloom Filters.
Neurocomputing, 2020

A unified model for accelerating unsupervised iterative re-ranking algorithms.
Concurr. Comput. Pract. Exp., 2020

2019
The Multi-Lane Capsule Network.
IEEE Signal Process. Lett., 2019

The Multi-Lane Capsule Network (MLCN).
CoRR, 2019

Memory Efficient Weightless Neural Network using Bloom Filter.
Proceedings of the 27th European Symposium on Artificial Neural Networks, 2019

2018
ComP-net: command processor networking for efficient intra-kernel communications on GPUs.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
GPU triggered networking for intra-kernel communications.
Proceedings of the International Conference for High Performance Computing, 2017

2016
HadoopCL2: Motivating the Design of a Distributed, Heterogeneous Programming System With Machine-Learning Applications.
IEEE Trans. Parallel Distributed Syst., 2016

Extended task queuing: active messages for heterogeneous systems.
Proceedings of the International Conference for High Performance Computing, 2016

PY-PITS: A Scalable Python Runtime System for the Computation of Partially Idempotent Tasks.
Proceedings of the 2016 International Symposium on Computer Architecture and High Performance Computing Workshops, 2016

2014
Adaptive global power optimization for Web servers.
J. Supercomput., 2014

Microcode Compression Using Structured-Constrained Clustering.
Int. J. Parallel Program., 2014

Implementation and evaluation of deep neural networks (DNN) on mainstream heterogeneous systems.
Proceedings of the Asia-Pacific Workshop on Systems, 2014

2013
Image Re-ranking Acceleration on GPUs.
Proceedings of the 25th International Symposium on Computer Architecture and High Performance Computing, 2013

HadoopCL: MapReduce on Distributed Heterogeneous Platforms through Seamless Integration of Hadoop and OpenCL.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012
Cloud Workload Analysis with SWAT.
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

Efficient Image Re-Ranking Computation on GPUs.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

2011
Structure-Constrained Microcode Compression.
Proceedings of the 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

LAR-CC: Large atomic regions with conditional commits.
Proceedings of the CGO 2011, 2011

2010
TAO: two-level atomicity for dynamic binary optimizations.
Proceedings of the CGO 2010, 2010

2008
A Segmented Bloom Filter Algorithm for Efficient Predictors.
Proceedings of the 20th International Symposium on Computer Architecture and High Performance Computing, 2008

2007
Impacts of Multiprocessor Configurations on Workloads in Bioinformatics.
Proceedings of the 19th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2007), 2007

StarDBT: An Efficient Multi-platform Dynamic Binary Translation System.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
Clustering-Based Microcode Compression.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

2005
Enhanced code density of embedded CISC processors with echo technology.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

2004
The Accuracy of Initial Prediction in Two-Phase Dynamic Binary Translators.
Proceedings of the 2nd IEEE / ACM International Symposium on Code Generation and Optimization (CGO 2004), 2004

Continuous Trip Count Profiling for Loop Optimizations in Two-Phase Dynamic Binary Translators.
Proceedings of the 8th Annual Workshop on Interaction between Compilers and Computer Architecture (INTERACT-8 2004), 2004

2003
Compilation, Architectural Support, and Evaluation of SIMD Graphics Pipeline Programs on a General-Purpose CPU.
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques (PACT 2003), 27 September, 2003

1997
Enhanced Compression Techniques to Simplify Program Decompression and Execution.
Proceedings of the 1997 International Conference on Computer Design: VLSI in Computers & Processors, 1997

1996
Design Tradeoffs and Experience with Motorola PowerPC Migration Tool.
Proceedings of the 1996 International Conference on Computer Design (ICCD '96), 1996

Motorola PowerPC Migration Tools - Emulation and Translation.
Proceedings of the Forty-First IEEE Computer Society International Conference: Technologies for the Information Superhighway, 1996

1995
Solutions and debugging for data consistency in multiprocessors with noncoherent caches.
Int. J. Parallel Program., 1995

1994
An Optimal Asynchronous Scheduling Algorithm for Software Cache Consistence.
Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994

1991
Implementation Optimization Techniques for Architecture Synthesis of Application-Specific Processors.
Proceedings of the 24th Annual IEEE/ACM International Symposium on Microarchitecture, 1991

1990
Architecture Synthesis of High-Performance Application-Specific Processors.
Proceedings of the 27th ACM/IEEE Design Automation Conference. Orlando, 1990

1988
Organization of array data for concurrent memory access.
Proceedings of the 21st Annual Workshop and Symposium on Microprogramming and Microarchitecture, 1988, San Diego, California, USA, November 28, 1988

The White Dwarf: A High-Performance Application-Specific Processor.
Proceedings of the 15th Annual International Symposium on Computer Architecture, 1988

