Davide Rossi

Orcid: 0000-0002-0651-5393

Affiliations:
  • University of Bologna, Bologna, Italy


According to our database1, Davide Rossi authored at least 180 papers between 2005 and 2025.

Collaborative distances:

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2025
A Reliable, Time-Predictable Heterogeneous SoC for AI-Enhanced Mixed-Criticality Edge Applications.
IEEE Trans. Circuits Syst. II Express Briefs, November, 2025

Maestro: A 302 GFLOPS/W and 19.8GFLOPS RISC-V Vector-Tensor Architecture for Wearable Ultrasound Edge Computing.
IEEE Trans. Circuits Syst. I Regul. Pap., November, 2025

Co-designing a Programmable RISC-V Accelerator for MPC-based Energy and Thermal Management of Many-Core HPC Processors.
CoRR, October, 2025

Unleashing OpenTitan's Potential: a Silicon-Ready Embedded Secure Element for Root of Trust and Cryptographic Offloading.
ACM Trans. Embed. Comput. Syst., September, 2025

AXI-REALM: Safe, Modular and Lightweight Traffic Monitoring and Regulation for Heterogeneous Mixed-Criticality Systems.
IEEE Trans. Computers, September, 2025

Parallelization is All System Identification Needs: End-to-End Vibration Diagnostics on a Multicore RISC-V Edge Device.
IEEE Internet Things J., July, 2025

A Flexible Template for Edge Generative AI With High-Accuracy Accelerated Softmax and GELU.
IEEE J. Emerg. Sel. Topics Circuits Syst., June, 2025

CVA6S+: A Superscalar RISC-V Core with High-Throughput Memory Architecture.
CoRR, May, 2025

Occamy: A 432-Core Dual-Chiplet Dual-HBM2E 768-DP-GFLOP/s RISC-V System for 8-to-64-bit Dense and Sparse Computing in 12-nm FinFET.
IEEE J. Solid State Circuits, April, 2025

Hybrid Modular Redundancy: Exploring Modular Redundancy Approaches in RISC-V Multi-core Computing Clusters for Reliable Processing in Space.
ACM Trans. Cyber Phys. Syst., January, 2025

Occamy: A 432-Core Dual-Chiplet Dual-HBM2E 768-DP-GFLOP/s RISC-V System for 8-to-64-bit Dense and Sparse Computing in 12nm FinFET.
CoRR, January, 2025

PACE: An Optimal Piecewise Polynomial Approximation Unit for Flexible and Efficient Transformer Non-linearity Acceleration.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2025

HMR-NEureka: Hybrid Modular Redundancy DNN Acceleration in Heterogeneous RISC-V SoCs.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2025

Towards Reliable Systems: A Scalable Approach to AXI4 Transaction Monitoring.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

SoftEx: A Low Power and Flexible Softmax Accelerator with Fast Approximate Exponentiation.
Proceedings of the Design, Automation & Test in Europe Conference, 2025

FractalSync: Lightweight Scalable Global Synchronization of Massive Bulk Synchronous Parallel AI Accelerators.
Proceedings of the 22nd ACM International Conference on Computing Frontiers, 2025

Ramping Up Open-Source RISC-V Cores: Assessing the Energy Efficiency of Superscalar, Out-of-Order Execution.
Proceedings of the 22nd ACM International Conference on Computing Frontiers, 2025

2024
TOP: Towards Open & Predictable Heterogeneous SoCs.
IEEE Trans. Computers, December, 2024

Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality With At-MRAM Neural Engine.
IEEE J. Solid State Circuits, July, 2024

A Heterogeneous RISC-V Based SoC for Secure Nano-UAV Navigation.
IEEE Trans. Circuits Syst. I Regul. Pap., May, 2024

ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation.
Int. J. Parallel Program., April, 2024

Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC With 2-8 b DNN Acceleration and 30%-Boost Adaptive Body Biasing.
IEEE J. Solid State Circuits, January, 2024

Open-Source Heterogeneous SoCs for AI: The PULP Platform Experience.
CoRR, 2024

A Flexible Template for Edge Generative AI with High-Accuracy Accelerated Softmax & GELU.
CoRR, 2024

Culsans: An Efficient Snoop-based Coherency Unit for the CVA6 Open Source RISC-V application processor.
CoRR, 2024

Occamy: A 432-Core 28.1 DP-GFLOP/s/W 83% FPU Utilization Dual-Chiplet, Dual-HBM2E RISC-V-Based Accelerator for Stencil and Sparse Linear Algebra Computations with 8-to-64-bit Floating-Point Support in 12nm FinFET.
Proceedings of the IEEE Symposium on VLSI Technology and Circuits 2024, 2024

TitanCFI: Toward Enforcing Control-Flow Integrity in the Root -of- Trust.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Assessing the Performance of OpenTitan as Cryptographic Accelerator in Secure Open-Hardware System-on-Chips.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

A Gigabit, DMA-enhanced Open-Source Ethernet Controller for Mixed-Criticality Systems.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

QR-PULP: Streamlining QR Decomposition for RISC-V Parallel Ultra-Low-Power Platforms.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

Spatzformer: An Efficient Reconfigurable Dual-Core RISC-V V Cluster for Mixed Scalar-Vector Workloads.
Proceedings of the 35th IEEE International Conference on Application-specific Systems, 2024

2023
RedMule: A mixed-precision matrix-matrix operation engine for flexible and energy-efficient on-chip linear algebra and TinyML training acceleration.
Future Gener. Comput. Syst., December, 2023

CVA6 RISC-V Virtualization: Architecture, Microarchitecture, and Design Space Exploration.
IEEE Trans. Very Large Scale Integr. Syst., November, 2023

Graphene-Based Wireless Agile Interconnects for Massive Heterogeneous Multi-Chip Processors.
IEEE Wirel. Commun., August, 2023

Scalable Hierarchical Instruction Cache for Ultralow-Power Processors Clusters.
IEEE Trans. Very Large Scale Integr. Syst., April, 2023

Energy Efficiency of Opportunistic Refreshing for Gain-Cell Embedded DRAM.
IEEE Trans. Circuits Syst. I Regul. Pap., April, 2023

Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster With 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode.
IEEE Trans. Circuits Syst. I Regul. Pap., 2023

Scalable Hierarchical Instruction Cache for Ultra-Low-Power Processors Clusters.
CoRR, 2023

Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing.
CoRR, 2023

Echoes: a 200 GOPS/W Frequency Domain SoC with FFT Processor and I2S DSP for Flexible Data Acquisition from Microphone Arrays.
CoRR, 2023

DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training.
CoRR, 2023

A 3 TOPS/W RISC-V Parallel Cluster for Inference of Fine-Grain Mixed-Precision Quantized Neural Networks.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2023

A 12.4TOPS/W @ 136GOPS AI-IoT System-on-Chip with 16 RISC-V, 2-to-8b Precision-Scalable DNN Acceleration and 30%-Boost Adaptive Body Biasing.
Proceedings of the IEEE International Solid- State Circuits Conference, 2023

ECHOES: a 200 GOPS/W Frequency Domain SoC with FFT Processor and I<sup>2</sup>S DSP for Flexible Data Acquisition from Microphone Arrays.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

Cyber Security aboard Micro Aerial Vehicles: An OpenTitan-based Visual Communication Use Case.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

Shaheen: An Open, Secure, and Scalable RV64 SoC for Autonomous Nano-UAVs.
Proceedings of the 35th IEEE Hot Chips Symposium, 2023

Siracusa: A Low-Power On-Sensor RISC-V SoC for Extended Reality Visual Processing in 16nm CMOS.
Proceedings of the 49th IEEE European Solid State Circuits Conference, 2023

Reducing Load-Use Dependency-Induced Performance Penalty in the Open-Source RISC-V CVA6 CPU.
Proceedings of the 26th Euromicro Conference on Digital System Design, 2023

HULK-V: a Heterogeneous Ultra-low-power Linux capable RISC-V SoC.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

TransLib: A Library to Explore Transprecision Floating-Point Arithmetic on Multi-Core IoT End-Nodes.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

End-to-End DNN Inference on a Massively Parallel Analog In Memory Computing Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

PATRONoC: Parallel AXI Transport Reducing Overhead for Networks-on-Chip targeting Multi-Accelerator DNN Platforms at the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
A Low-Power Transprecision Floating-Point Cluster for Efficient Near-Sensor Data Analytics.
IEEE Trans. Parallel Distributed Syst., 2022

Vega: A Ten-Core SoC for IoT Endnodes With DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode.
IEEE J. Solid State Circuits, 2022

A Heterogeneous In-Memory Computing Cluster for Flexible End-to-End Inference of Real-World Deep Neural Networks.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2022

ControlPULP: A RISC-V Power Controller for HPC Processors with Parallel Control-Law Computation Acceleration.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2022

Kraken: A Direct Event/Frame-Based Multi-sensor Fusion SoC for Ultra-Efficient Visual Processing in Nano-UAVs.
Proceedings of the 2022 IEEE Hot Chips 34 Symposium, 2022

Darkside: 2.6GFLOPS, 8.7mW Heterogeneous RISC-V Cluster for Extreme-Edge On-Chip DNN Inference and Training.
Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Scale up your In-Memory Accelerator: Leveraging Wireless-on-Chip Communication for AIMC-based CNN Inference.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

2021
Arnold: An eFPGA-Augmented RISC-V SoC for Flexible and Low-Power IoT End Nodes.
IEEE Trans. Very Large Scale Integr. Syst., 2021

Analytical Modeling of Jitter in Bang-Bang CDR Circuits Featuring Phase Interpolation.
IEEE Trans. Very Large Scale Integr. Syst., 2021

A Fully Integrated 5-mW, 0.8-Gbps Energy-Efficient Chip-to-Chip Data Link for Ultralow-Power IoT End-Nodes in 65-nm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2021

Energy-Efficient Hardware-Accelerated Synchronization for Shared-L1-Memory Multiprocessor Clusters.
IEEE Trans. Parallel Distributed Syst., 2021

A 0.5GHz 0.35mW LDO-Powered Constant-Slope Phase Interpolator With 0.22% INL.
IEEE Trans. Circuits Syst. II Express Briefs, 2021

DORY: Automatic End-to-End Deployment of Real-World DNNs on Low-Cost IoT MCUs.
IEEE Trans. Computers, 2021

Vega: A 10-Core SoC for IoT End-Nodes with DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode.
CoRR, 2021

A Fully-Integrated 5mW, 0.8Gbps Energy-Efficient Chip-to-Chip Data Link for Ultra-Low-Power IoT End-Nodes in 65-nm CMOS.
CoRR, 2021

Hardware-In-The Loop Emulation for Agile Co-Design of Parallel Ultra-Low Power IoT Processors.
Proceedings of the 29th IFIP/IEEE International Conference on Very Large Scale Integration, 2021

4.4 A 1.3TOPS/W @ 32GOPS Fully Integrated 10-Core SoC for IoT End-Nodes with 1.7μW Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

A 1.15 TOPS/W, 16-Cores Parallel Ultra-Low Power Cluster with 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode.
Proceedings of the 47th ESSCIRC 2021, 2021


XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Networks on RISC-V based IoT End Nodes.
Proceedings of the 28th IEEE Symposium on Computer Arithmetic, 2021

Streamlining the OpenMP Programming Model on Ultra-Low-Power Multi-core MCUs.
Proceedings of the Architecture of Computing Systems - 34th International Conference, 2021

End-to-end 100-TOPS/W Inference With Analog In-Memory Computing: Are We There Yet?
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020
Always-On 674μ W@4GOP/s Error Resilient Binary Neural Networks With Aggressive SRAM Voltage Scaling on a 22-nm IoT End-Node.
IEEE Trans. Circuits Syst., 2020

Modular Design and Optimization of Biomedical Applications for Ultralow Power Heterogeneous Platforms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

A custom processor for protocol-independent packet parsing.
Microprocess. Microsystems, 2020

Performance-aware predictive-model-based on-chip body-bias regulation strategy for an ULP multi-core cluster in 28 nm UTBB FD-SOI.
Integr., 2020

Exploring NEURAghe: A Customizable Template for APSoC-Based CNN Inference at the Edge.
IEEE Embed. Syst. Lett., 2020

Impact of Memory Voltage Scaling on Accuracy and Resilience of Deep Learning Based Edge Devices.
IEEE Des. Test, 2020

XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Network on RISC-V based IoT End Nodes.
CoRR, 2020

Graphene-based Wireless Agile Interconnects for Massive Heterogeneous Multi-chip Processors.
CoRR, 2020

A transprecision floating-point cluster for efficient near-sensor data analytics.
CoRR, 2020

Performance-Aware Predictive-Model-Based On-Chip Body-Bias Regulation Strategy for an ULP Multi-Core Cluster in 28nm UTBB FD-SOI.
CoRR, 2020

Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node.
CoRR, 2020

Flexible Software-Defined Packet Processing Using Low-Area Hardware.
IEEE Access, 2020

A Mixed-Precision RISC-V Processor for Extreme-Edge DNN Inference.
Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI, 2020

An Energy-Efficient Low-Voltage Swing Transceiver for mW-Range IoT End-Nodes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

TRANSPIRE: An energy-efficient TRANSprecision floating-point Programmable archItectuRE.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Energy-Efficient Two-level Instruction Cache Design for an Ultra-Low-Power Multi-core Cluster.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors Through ISA Extensions.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Enabling mixed-precision quantized neural networks in extreme-edge devices.
Proceedings of the 17th ACM International Conference on Computing Frontiers, 2020

Neuro-PULP: A Paradigm Shift Towards Fully Programmable Platforms for Neural Interfaces.
Proceedings of the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2020

2019
An Energy-Efficient Integrated Programmable Array Accelerator and Compilation Flow for Near-Sensor Ultralow Power Processing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

BioWolf: A Sub-10-mW 8-Channel Advanced Brain-Computer Interface Platform With a Nine-Core Processor and BLE Connectivity.
IEEE Trans. Biomed. Circuits Syst., 2019

Online Learning and Classification of EMG-Based Gestures on a Parallel Ultra-Low Power Platform Using Hyperdimensional Computing.
IEEE Trans. Biomed. Circuits Syst., 2019

Mr.Wolf: An Energy-Precision Scalable Parallel Ultra Low Power SoC for IoT Edge Processing.
IEEE J. Solid State Circuits, 2019

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019

PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors.
CoRR, 2019

An Explicitly Parallel Architecture for Packet Processing in Software Defined Networks.
Proceedings of the 2019 IEEE Nordic Circuits and Systems Conference, 2019

PULP-NN: A Computing Library for Quantized Neural Network inference at the edge on RISC-V Based Parallel Ultra Low Power Clusters.
Proceedings of the 26th IEEE International Conference on Electronics, Circuits and Systems, 2019

A PULP-based Parallel Power Controller for Future Exascale Systems.
Proceedings of the 26th IEEE International Conference on Electronics, Circuits and Systems, 2019

Reducing Crossbar Costs in the Match-Action Pipeline.
Proceedings of the 20th IEEE International Conference on High Performance Switching and Routing, 2019

Design and Evaluation of SmallFloat SIMD extensions to the RISC-V ISA.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Hardware-Accelerated Energy-Efficient Synchronization and Communication for Ultra-Low-Power Tightly Coupled Clusters.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

DORY: Lightweight memory hierarchy management for deep NN inference on IoT endnodes: work-in-progress.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019

2018
NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs.
ACM Trans. Reconfigurable Technol. Syst., 2018

Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes.
IEEE Trans. Parallel Distributed Syst., 2018

The Quest for Energy-Efficient I$ Design in Ultra-Low-Power Clustered Many-Cores.
IEEE Trans. Multi Scale Comput. Syst., 2018

A Heterogeneous Multicore System on Chip for Energy Efficient Brain Inspired Computing.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

Synergistic HW/SW Approximation Techniques for Ultralow-Power Parallel Computing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

YodaNN: An Architecture for Ultralow Power Binary-Weight CNN Acceleration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Power mitigation of a heterogeneous multicore architecture on FPGA/ASIC by DFS/DVFS techniques.
Microprocess. Microsystems, 2018

A sensor fusion approach for drowsiness detection in wearable ultra-low-power systems.
Inf. Fusion, 2018

Low-latency Packet Parsing in Software Defined Networks.
Proceedings of the 2018 IEEE Nordic Circuits and Systems Conference, 2018

Hyperdrive: A Systolically Scalable Binary-Weight CNN Inference Engine for mW IoT End-Nodes.
Proceedings of the 2018 IEEE Computer Society Annual Symposium on VLSI, 2018

Live Demonstration: Body-Bias Based Performance Monitoring and Compensation for a Near-Threshold Multi-Core Cluster in 28nm FD-SOI Technology.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

A Transprecision Floating-Point Architecture for Energy-Efficient Embedded Computing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Sub-mW multi-Gbps chip-to-chip communication Links for Ultra-Low Power IoT end-nodes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

A Heterogeneous Cluster with Reconfigurable Accelerator for Energy Efficient Near-Sensor Data Analytics.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Compressed Sensing Based Seizure Detection for an Ultra Low Power Multi-core Architecture.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

A Fully Programmable eFPGA-Augmented SoC for Smart-Power Applications.
Proceedings of the 25th IEEE International Conference on Electronics, Circuits and Systems, 2018

Mr. Wolf: A 1 GFLOP/s Energy-Proportional Parallel Ultra Low Power SoC for IOT Edge Processing.
Proceedings of the 44th IEEE European Solid State Circuits Conference, 2018

A transprecision floating-point platform for ultra-low power computing.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Energy proportionality in near-threshold computing servers and cloud data centers: Consolidating or Not?
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

PULP-HD: accelerating brain-inspired high-dimensional computing on a parallel ultra-low power platform.
Proceedings of the 55th Annual Design Automation Conference, 2018

Always-ON visual node with a hardware-software event-based binarized neural network inference engine.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

An Explicitly Parallel Architecture for Packet Parsing in Software Defined Networks.
Proceedings of the 29th IEEE International Conference on Application-specific Systems, 2018

GAP-8: A RISC-V SoC for AI at the Edge of the IoT.
Proceedings of the 29th IEEE International Conference on Application-specific Systems, 2018

2017
Near-Threshold RISC-V Core With DSP Extensions for Scalable IoT Endpoint Devices.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Logic-Base Interconnect Design for Near Memory Computing in the Smart Memory Cube.
IEEE Trans. Very Large Scale Integr. Syst., 2017

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

Energy-Efficient Near-Threshold Parallel Computing: The PULPv2 Cluster.
IEEE Micro, 2017

Increasing the energy efficiency of microcontroller platforms with low-design margin co-processors.
Microprocess. Microsystems, 2017

A Sub-mW IoT-Endnode for Always-On Visual Monitoring and Smart Triggering.
IEEE Internet Things J., 2017

A Self-Aware Architecture for PVT Compensation and Power Nap in Near Threshold Processors.
IEEE Des. Test, 2017

Slow and steady wins the race? A comparison of ultra-low-power RISC-V cores for Internet-of-Things applications.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

μDMA: An autonomous I/O subsystem for IoT end-nodes.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

Temperature and process-aware performance monitoring and compensation for an ULP multi-core cluster in 28nm UTBB FD-SOI technology.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

A wearable EEG-based drowsiness detection system with blink duration and alpha waves analysis.
Proceedings of the 8th International IEEE/EMBS Conference on Neural Engineering, 2017

A 142MOPS/mW integrated programmable array accelerator for smart visual processing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Efficient mapping of CDFG onto coarse-grained reconfigurable array architectures.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
PULP: A Ultra-Low Power Parallel Accelerator for Energy-Efficient and Flexible Embedded Vision.
J. Signal Process. Syst., 2016

Power, Area, and Performance Optimization of Standard Cell Memory Arrays Through Controlled Placement.
ACM Trans. Design Autom. Electr. Syst., 2016

YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

A heterogeneous multi-core system-on-chip for energy efficient brain inspired vision.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Energy-efficient design of an always-on smart visual trigger.
Proceedings of the IEEE International Smart Cities Conference, 2016

Always-on motion detection with application-level error control on a near-threshold approximate computing platform.
Proceedings of the 2016 IEEE International Conference on Electronics, Circuits and Systems, 2016

A 2 MS/s 10A Hall current sensor SoC with digital compressive sensing encoder in 0.16 µm BCD.
Proceedings of the ESSCIRC Conference 2016: 42<sup>nd</sup> European Solid-State Circuits Conference, 2016

Towards near-threshold server processors.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Enabling the heterogeneous accelerator model on ultra-low power microcontroller platforms.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

193 MOPS/mW @ 162 MOPS, 0.32V to 1.15V voltage range multi-core accelerator for energy efficient parallel and sequential digital processing.
Proceedings of the 2016 IEEE Symposium in Low-Power and High-Speed Chips, 2016

Scalable EEG seizure detection on an ultra low power multi-core architecture.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2016

Design and Evaluation of a Processing-in-Memory Architecture for the Smart Memory Cube.
Proceedings of the Architecture of Computing Systems - ARCS 2016, 2016

2015
A Modular Shared L2 Memory Design for 3-D Integration.
IEEE Trans. Very Large Scale Integr. Syst., 2015

Synergistic Architecture and Programming Model Support for Approximate Micropower Computing.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

PULP: A parallel ultra low power platform for next generation IoT applications.
Proceedings of the 2015 IEEE Hot Chips 27 Symposium (HCS), 2015

Reducing energy consumption in microcontroller-based platforms with low design margin co-processors.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

High performance AXI-4.0 based interconnect for extensible smart memory cubes.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Exploring multi-banked shared-L1 program cache on ultra-low power, tightly coupled processor clusters.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Controlled placement of standard cell memory arrays for high density and low power in 28nm FD-SOI.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014
Multicore Signal Processing Platform With Heterogeneous Configurable Hardware Accelerators.
IEEE Trans. Very Large Scale Integr. Syst., 2014

Energy-efficient vision on the PULP platform for ultra-low power parallel computing.
Proceedings of the 2014 IEEE Workshop on Signal Processing Systems, 2014

Customizing an open source processor to fit in an ultra-low power cluster with a shared L1 memory.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

Hybrid memory architecture for voltage scaling in ultra-low power multi-core biomedical processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Ultra-low-latency lightweight DMA for tightly coupled multi-core clusters.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2013
Application Space Exploration of a Heterogeneous Run-Time Configurable Digital Signal Processor.
IEEE Trans. Very Large Scale Integr. Syst., 2013

A variation tolerant architecture for ultra low power multi-processor cluster.
Proceedings of the 2013 23rd International Workshop on Power and Timing Modeling, 2013

Exploiting body biasing for leakage reduction: A case study.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2013

2011
The MORPHEUS Heterogeneous Dynamically Reconfigurable Platform.
Int. J. Parallel Program., 2011

2010
A Heterogeneous Digital Signal Processor for Dynamically Reconfigurable Computing.
IEEE J. Solid State Circuits, 2010

A coarse-grain reconfigurable architecture for multimedia applications supporting subword and floating-point calculations.
J. Syst. Archit., 2010

2009
A multi-core signal processor for heterogeneous reconfigurable computing.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2009

RTL-to-layout implementation of an embedded coarse grained architecture for dynamically reconfigurable computing in systems-on-chip.
Proceedings of the 2008 IEEE International Symposium on System-on-Chip, 2009

A heterogeneous digital signal processor implementation for dynamically reconfigurable computing.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2009

2008
Design space exploration of an open-source, IP-reusable, scalable floating-point engine for embedded applications.
J. Syst. Archit., 2008

Implementation of a floating-point matrix-vector multiplication on a reconfigurable architecture.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2005
A FPGA Implementation of An Open-Source Floating-Point Computation System.
Proceedings of the 2005 International Symposium on System-on-Chip, 2005


  Loading...