Luca Benini

ORCID: 0000-0001-8068-3806

Affiliations:
  • University of Bologna, Italy
  • ETH Zurich, Switzerland
  • Università di Bologna, Italy (former)


According to our database, Luca Benini authored at least 1,397 papers between 1994 and 2024.

Awards

ACM Fellow

ACM Fellow 2016, "For contributions to the design of low power multi-processor systems".

IEEE Fellow

IEEE Fellow 2007, "For contributions to design technologies for low power design of integrated circuits and systems".

Bibliography

2024
Enabling Efficient Hybrid Systolic Computation in Shared-L1-Memory Manycore Clusters.
IEEE Trans. Very Large Scale Integr. Syst., September, 2024

Hier-3D: A Methodology for Physical Hierarchy Exploration of 3-D ICs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2024

Ara2: Exploring Single- and Multi-Core Vector Processing With an Efficient RVV 1.0 Compliant Open-Source Processor.
IEEE Trans. Computers, July, 2024

Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality With At-MRAM Neural Engine.
IEEE J. Solid State Circuits, July, 2024

CV32RT: Enabling Fast Interrupt and Context Switching for RISC-V Microcontrollers.
IEEE Trans. Very Large Scale Integr. Syst., June, 2024

Reducing False Alarms in Wearable Seizure Detection With EEGformer: A Compact Transformer Model for MCUs.
IEEE Trans. Biomed. Circuits Syst., June, 2024

Stargate: Multimodal Sensor Fusion for Autonomous Navigation on Miniaturized UAVs.
IEEE Internet Things J., June, 2024

A Heterogeneous RISC-V Based SoC for Secure Nano-UAV Navigation.
IEEE Trans. Circuits Syst. I Regul. Pap., May, 2024

NanoSLAM: Enabling Fully Onboard SLAM for Tiny Robots.
IEEE Internet Things J., April, 2024

ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation.
Int. J. Parallel Program., April, 2024

A High-Performance, Energy-Efficient Modular DMA Engine Architecture.
IEEE Trans. Computers, January, 2024

Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC With 2-8 b DNN Acceleration and 30%-Boost Adaptive Body Biasing.
IEEE J. Solid State Circuits, January, 2024

Self-Sustaining Ultrawideband Positioning System for Event-Driven Indoor Localization.
IEEE Internet Things J., January, 2024

A Muscle Pennation Angle Estimation Framework From Raw Ultrasound Data for Wearable Biomedical Instrumentation.
IEEE Trans. Instrum. Meas., 2024

HazardNet: A thermal hazard prediction framework for datacenters.
Future Gener. Comput. Syst., 2024

Culsans: An Efficient Snoop-based Coherency Unit for the CVA6 Open Source RISC-V application processor.
CoRR, 2024

GAP9Shield: A 150GOPS AI-capable Ultra-low Power Module for Vision and Ranging Applications on Nano-drones.
CoRR, 2024

Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs.
CoRR, 2024

Design and Experimental Investigation of Trikarenos: A Fault-Tolerant 28nm RISC-V-based SoC.
CoRR, 2024

Spatzformer: An Efficient Reconfigurable Dual-Core RISC-V V Cluster for Mixed Scalar-Vector Workloads.
CoRR, 2024

Ultra-Lightweight Collaborative Mapping for Robot Swarms.
CoRR, 2024

Compressed Latent Replays for Lightweight Continual Learning on Spiking Neural Networks.
CoRR, 2024

BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring.
CoRR, 2024

Basilisk: An End-to-End Open-Source Linux-Capable RISC-V SoC in 130nm CMOS.
CoRR, 2024

Occamy: A 432-Core 28.1 DP-GFLOP/s/W 83% FPU Utilization Dual-Chiplet, Dual-HBM2E RISC-V-based Accelerator for Stencil and Sparse Linear Algebra Computations with 8-to-64-bit Floating-Point Support in 12nm FinFET.
CoRR, 2024

Low Latency Visual Inertial Odometry with On-Sensor Accelerated Optical Flow for Resource-Constrained UAVs.
CoRR, 2024

GAPses: Versatile smart glasses for comfortable and fully-dry acquisition and parallel ultra-low-power processing of EEG and EOG.
CoRR, 2024

SentryCore: A RISC-V Co-Processor System for Safe, Real-Time Control Applications.
CoRR, 2024

Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform.
CoRR, 2024

xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems.
CoRR, 2024

Modeling and Controlling Many-Core HPC Processors: an Alternative to PID and Moving Average Algorithms.
CoRR, 2024

A Passive and Asynchronous Wake-up Receiver for Acoustic Underwater Communication.
CoRR, 2024

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models.
CoRR, 2024

Insights from Basilisk: Are Open-Source EDA Tools Ready for a Multi-Million-Gate, Linux-Booting RV64 SoC Design?
CoRR, 2024

Basilisk: Achieving Competitive Performance with Open EDA Tools on an Open-Source Linux-Capable RISC-V SoC.
CoRR, 2024

Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems.
CoRR, 2024

SARIS: Accelerating Stencil Computations on Energy-Efficient RISC-V Compute Clusters with Indirect Stream Registers.
CoRR, 2024

Optimizing the Deployment of Tiny Transformers on Low-Power MCUs.
CoRR, 2024

Foundation Models for Structural Health Monitoring.
CoRR, 2024

BatDeck: Advancing Nano-drone Navigation with Low-power Ultrasound-based Obstacle Avoidance.
CoRR, 2024

Combining Local and Global Perception for Autonomous Navigation on Nano-UAVs.
CoRR, 2024

Boosting keyword spotting through on-device learnable user speech characteristics.
CoRR, 2024

SzCORE: A Seizure Community Open-source Research Evaluation framework for the validation of EEG-based automated seizure detection algorithms.
CoRR, 2024

A Noisy Beat is Worth 16 Words: a Tiny Transformer for Low-Power Arrhythmia Classification on Microcontrollers.
CoRR, 2024

TOP: Towards Open & Predictable Heterogeneous SoCs.
CoRR, 2024

Data-Driven Power Modeling and Monitoring via Hardware Performance Counters Tracking.
CoRR, 2024

An Extreme-Edge TCN-Based Low-Latency Collision-Avoidance Safety System for Industrial Machinery.
IEEE Access, 2024

Flexible and Fully Quantized Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems.
IEEE Access, 2024

Stargate: Multimodal Sensor Fusion for Autonomous Navigation on Miniaturized UAVs.
Dataset, 2024

Exploring the Utility of Graph Methods in HPC Thermal Modeling.
Proceedings of the Companion of the 15th ACM/SPEC International Conference on Performance Engineering, 2024

OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024

A Precision-Optimized Fixed-Point Near-Memory Digital Processing Unit for Analog In-Memory Computing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

3D Partitioning with Pipeline Optimization for Low-Latency Memory Access in Many-Core SoCs.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

Fully Onboard Low-Power Localization with Semantic Sensor Fusion on a Nano-UAV using Floor Plans.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

TeraPool-SDR: An 1.89TOPS 1024 RV-Cores 4MiB Shared-L1 Cluster for Next-Generation Open-Source Software-Defined Radios.
Proceedings of the Great Lakes Symposium on VLSI 2024, 2024

Near-Memory Parallel Indexing and Coalescing: Enabling Highly Efficient Indirect Access for SpMV.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

12 mJ Per Class On-Device Online Few-Shot Class-Incremental Learning.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Work in Progress: Linear Transformers for TinyML.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Zero-Shot Classification Using Hyperdimensional Computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems Through Polling-Free and Retry-Free Operation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

MX: Enhancing RISC-V's Vector ISA for Ultra-Low Overhead, Energy-Efficient Matrix Multiplication.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

PELS: A Lightweight and Flexible Peripheral Event Linking System for Ultra-Low Power IoT Processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Optimizing Offload Performance in Heterogeneous MPSoCs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

AXI-REALM: A Lightweight and Modular Interconnect Extension for Traffic Regulation and Monitoring of Heterogeneous Real-Time SoCs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

NARS: Neuromorphic Acceleration through Register-Streaming Extensions on RISC-V Cores.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

A Gigabit, DMA-enhanced Open-Source Ethernet Controller for Mixed-Criticality Systems.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

Driving Towards Safety: Online PPG-based Drowsiness Detection with TCNs.
Proceedings of the 6th IEEE International Conference on AI Circuits and Systems, 2024

On-Device Domain Learning for Keyword Spotting on Low-Power Extreme Edge Embedded Systems.
Proceedings of the 6th IEEE International Conference on AI Circuits and Systems, 2024

2023
Robust and Efficient Depth-Based Obstacle Avoidance for Autonomous Miniaturized UAVs.
IEEE Trans. Robotics, December, 2023

Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra.
IEEE Trans. Parallel Distributed Syst., December, 2023

MemPool: A Scalable Manycore Architecture With a Low-Latency Shared L1 Memory.
IEEE Trans. Computers, December, 2023

Directly-trained Spiking Neural Networks for Deep Reinforcement Learning: Energy efficient implementation of event-based obstacle avoidance on a neuromorphic accelerator.
Neurocomputing, December, 2023

RedMule: A mixed-precision matrix-matrix operation engine for flexible and energy-efficient on-chip linear algebra and TinyML training acceleration.
Future Gener. Comput. Syst., December, 2023

7 μJ/inference end-to-end gesture recognition from dynamic vision sensor data using ternarized hybrid convolutional neural networks.
Future Gener. Comput. Syst., December, 2023

Reduced precision floating-point optimization for Deep Neural Network On-Device Learning on microcontrollers.
Future Gener. Comput. Syst., December, 2023

FlooNoC: A Multi-Tb/s Wide NoC for Heterogeneous AXI4 Traffic.
IEEE Des. Test, December, 2023

CVA6 RISC-V Virtualization: Architecture, Microarchitecture, and Design Space Exploration.
IEEE Trans. Very Large Scale Integr. Syst., November, 2023

Yun: An Open-Source, 64-Bit RISC-V-Based Vector Processor With Multi-Precision Integer and Floating-Point Support in 65-nm CMOS.
IEEE Trans. Circuits Syst. II Express Briefs, October, 2023

Cheshire: A Lightweight, Linux-Capable RISC-V Host Platform for Domain-Specific Accelerator Plug-In.
IEEE Trans. Circuits Syst. II Express Briefs, October, 2023

Dataset of the HazardNet: A Thermal Hazard Prediction Framework for Datacenters.
Dataset, October, 2023

Systematic Prevention of On-Core Timing Channels by Full Temporal Partitioning.
IEEE Trans. Computers, May, 2023

Scalable Hierarchical Instruction Cache for Ultralow-Power Processors Clusters.
IEEE Trans. Very Large Scale Integr. Syst., April, 2023

A neuro-vector-symbolic architecture for solving Raven's progressive matrices.
Nat. Mac. Intell., April, 2023

Energy-Efficient, Precise UWB-Based 3-D Localization of Sensor Nodes With a Nano-UAV.
IEEE Internet Things J., April, 2023

RUAD: Unsupervised anomaly detection in HPC systems.
Future Gener. Comput. Syst., April, 2023

Towards the Future Generation of Railway Localization Exploiting RTK and GNSS.
Dataset, April, 2023

Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge.
IEEE Trans. Computers, March, 2023

Reducing the Energy Consumption of sEMG-Based Gesture Recognition at the Edge Using Transformers and Dynamic Inference.
Sensors, February, 2023

Raw data related to In-memory factorization of holographic perceptual representations.
Dataset, February, 2023

Toward the Future Generation of Railway Localization Exploiting RTK and GNSS.
IEEE Trans. Instrum. Meas., 2023

DNN Is Not All You Need: Parallelizing Non-neural ML Algorithms on Ultra-low-power IoT Processors.
ACM Trans. Embed. Comput. Syst., 2023

Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster With 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode.
IEEE Trans. Circuits Syst. I Regul. Pap., 2023

TCN-CUTIE: A 1,036-TOp/s/W, 2.72-µJ/Inference, 12.2-mW All-Digital Ternary Accelerator in 22-nm FDX Technology.
IEEE Micro, 2023

TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing.
CoRR, 2023

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures.
CoRR, 2023

Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication.
CoRR, 2023

Ara2: Exploring Single- and Multi-Core Vector Processing with an Efficient RVV1.0 Compliant Open-Source Processor.
CoRR, 2023

RapidChiplet: A Toolchain for Rapid Design Space Exploration of Chiplet Architectures.
CoRR, 2023

Ultra-Efficient On-Device Object Detection on AI-Integrated Smart Glasses with TinyissimoYOLO.
CoRR, 2023

Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices.
CoRR, 2023

Spatz: Clustering Compact RISC-V-Based Vector Units to Maximize Computing Efficiency.
CoRR, 2023

A Wearable Ultra-Low-Power sEMG-Triggered Ultrasound System for Long-Term Muscle Activity Monitoring.
CoRR, 2023

Fully Onboard SLAM for Distributed Mapping with a Swarm of Nano-Drones.
CoRR, 2023

Scalable Hierarchical Instruction Cache for Ultra-Low-Power Processors Clusters.
CoRR, 2023

Flexible and Fully Quantized Ultra-Lightweight TinyissimoYOLO for Ultra-Low-Power Edge Systems.
CoRR, 2023

Design of an energy aware petaflops class high performance cluster based on power architecture.
CoRR, 2023

A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms.
CoRR, 2023

FlooNoC: A Multi-Tbps Wide NoC for Heterogeneous AXI4 Traffic.
CoRR, 2023

Marsellus: A Heterogeneous RISC-V AI-IoT End-Node SoC with 2-to-8b DNN Acceleration and 30%-Boost Adaptive Body Biasing.
CoRR, 2023

Echoes: a 200 GOPS/W Frequency Domain SoC with FFT Processor and I2S DSP for Flexible Data Acquisition from Microphone Arrays.
CoRR, 2023

DARKSIDE: A Heterogeneous RISC-V Compute Cluster for Extreme-Edge On-Chip DNN Inference and Training.
CoRR, 2023

Factorizers for Distributed Sparse Block Codes.
CoRR, 2023

Hybrid Modular Redundancy: Exploring Modular Redundancy Approaches in RISC-V Multi-Core Computing Clusters for Reliable Processing in Space.
CoRR, 2023

Experimenting with Emerging ARM and RISC-V Systems for Decentralised Machine Learning.
CoRR, 2023

A Self-Sustainable and Micro-Second Time Synchronized Multi-Node Wireless System for Aerodynamic Monitoring on Wind Turbines.
IEEE Access, 2023

Securing Tiny Transformer-Based Computer Vision Models: Evaluating Real-World Patch Attacks.
Proceedings of the 9th IEEE World Forum on Internet of Things, 2023

Improving Data-Scarce Image Classification Through Multimodal Synthetic Data Pretraining.
Proceedings of the IEEE Sensors Applications Symposium, 2023

Fast Shared-Memory Barrier Synchronization for a 1024-Cores RISC-V Many-Core Cluster.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2023

MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Solving Raven's Progressive Matrices via a Neuro-vector-symbolic Architecture.
Proceedings of the 17th International Workshop on Neural-Symbolic Learning and Reasoning, 2023

AutoCC: Automatic Discovery of Covert Channels in Time-Shared Hardware.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Towards a RISC-V Open Platform for Next-generation Automotive ECUs.
Proceedings of the 12th Mediterranean Conference on Embedded Computing, 2023

Event-based Low-Power and Low-Latency Regression Method for Hand Kinematics from Surface EMG.
Proceedings of the 9th International Workshop on Advances in Sensors and Interfaces, 2023

Towards Robust and Efficient On-board Mapping for Autonomous Miniaturized UAVs.
Proceedings of the 9th International Workshop on Advances in Sensors and Interfaces, 2023

Design and Evaluation of a LoRa Controlled Rugged Multisensor Unit for Induced Rockfall Experiments.
Proceedings of the 9th International Workshop on Advances in Sensors and Interfaces, 2023

A Fast and Accurate Optical Flow Camera for Resource-Constrained Edge Applications.
Proceedings of the 9th International Workshop on Advances in Sensors and Interfaces, 2023

ColibriUAV: An Ultra-Fast, Energy-Efficient Neuromorphic Edge Processing UAV-Platform with Event-Based and Frame-Based Cameras.
Proceedings of the 9th International Workshop on Advances in Sensors and Interfaces, 2023

From Nano-Drones to Cars - A RISC-V Open Platform for next-generation Vehicles.
Proceedings of the 9th International Workshop on Advances in Sensors and Interfaces, 2023

A 3 TOPS/W RISC-V Parallel Cluster for Inference of Fine-Grain Mixed-Precision Quantized Neural Networks.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2023

A 12.4TOPS/W @ 136GOPS AI-IoT System-on-Chip with 16 RISC-V, 2-to-8b Precision-Scalable DNN Acceleration and 30%-Boost Adaptive Body Biasing.
Proceedings of the IEEE International Solid-State Circuits Conference, 2023

Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2023

ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2023

ECHOES: a 200 GOPS/W Frequency Domain SoC with FFT Processor and I²S DSP for Flexible Data Acquisition from Microphone Arrays.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

ColibriES: A Milliwatts RISC-V Based Embedded System Leveraging Neuromorphic and Neural Networks Hardware Accelerators for Low-Latency Closed-loop Control Applications.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2023

A Relative Infrastructure-less Localization Algorithm for Decentralized and Autonomous Swarm Formation.
IROS, 2023

LocalViT: Analyzing Locality in Vision Transformers.
IROS, 2023

Quantitative Evaluation of a Multi-Modal Camera Setup for Fusing Event Data with RGB Images.
Proceedings of the 2023 IEEE SENSORS, Vienna, Austria, October 29 - Nov. 1, 2023, 2023

Deep Neural Network Architecture Search for Accurate Visual Pose Estimation aboard Nano-UAVs.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Learning continuous piecewise non-linear activation functions for deep neural networks.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Trikarenos: A Fault-Tolerant RISC-V-based Microcontroller for CubeSats in 28nm.
Proceedings of the 30th IEEE International Conference on Electronics, Circuits and Systems, 2023

MinPool: A 16-core NUMA-L1 Memory RISC-V Processor Cluster for Always-on Image Processing in 65nm CMOS.
Proceedings of the 30th IEEE International Conference on Electronics, Circuits and Systems, 2023

Multi-sensory Anti-collision Design for Autonomous Nano-swarm Exploration.
Proceedings of the 30th IEEE International Conference on Electronics, Circuits and Systems, 2023

Shaheen: An Open, Secure, and Scalable RV64 SoC for Autonomous Nano-UAVs.
Proceedings of the 35th IEEE Hot Chips Symposium, 2023

PULP Fiction No More - Dependable PULP Systems for Space.
Proceedings of the IEEE European Test Symposium, 2023

Siracusa: A Low-Power On-Sensor RISC-V SoC for Extended Reality Visual Processing in 16nm CMOS.
Proceedings of the 49th IEEE European Solid State Circuits Conference, 2023

Reducing Load-Use Dependency-Induced Performance Penalty in the Open-Source RISC-V CVA6 CPU.
Proceedings of the 26th Euromicro Conference on Digital System Design, 2023

Land & Localize: An Infrastructure-free and Scalable Nano-Drones Swarm with UWB-based Localization.
Proceedings of the 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things, 2023

AXI-Pack: Near-Memory Bus Packing for Bandwidth-Efficient Irregular Workloads.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

HULK-V: a Heterogeneous Ultra-low-power Linux capable RISC-V SoC.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

MemPool Meets Systolic: Flexible Systolic Computation in a Large Shared-Memory Processor Cluster.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Fully On-board Low-Power Localization with Multizone Time-of-Flight Sensors on Nano-UAVs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

TransLib: A Library to Explore Transprecision Floating-Point Arithmetic on Multi-Core IoT End-Nodes.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Bio-inspired Autonomous Exploration Policies with CNN-based Object Detection on Nano-drones.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023


An Ultra-Low-Power Serial Implementation for Sigmoid and Tanh Using CORDIC Algorithm.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Energy-efficient Wearable-to-Mobile Offload of ML Inference for PPG-based Heart-Rate Estimation.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

End-to-End DNN Inference on a Massively Parallel Analog In Memory Computing Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Efficient Parallelization of 5G-PUSCH on a Scalable RISC-V Many-Core Processor.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Specialization meets Flexibility: a Heterogeneous Architecture for High-Efficiency, High-flexibility AR/VR Processing.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

PATRONoC: Parallel AXI Transport Reducing Overhead for Networks-on-Chip targeting Multi-Accelerator DNN Platforms at the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Sparse Hamming Graph: A Customizable Network-on-Chip Topology.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

HTVM: Efficient Neural Network Deployment On Heterogeneous TinyML Platforms.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Neuromorphic Optical Flow and Real-time Implementation with Event Cameras.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

BioGAP: a 10-Core FP-capable Ultra-Low Power IoT Processor, with Medical-Grade AFE and BLE Connectivity for Wearable Biosignal Processing.
Proceedings of the IEEE International Conference on Omni-layer Intelligent Systems, 2023

Experimenting with Emerging RISC-V Systems for Decentralised Machine Learning.
Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

WIP: Automatic DNN Deployment on Heterogeneous Platforms: the GAP9 Case Study.
Proceedings of the International Conference on Compilers, 2023

Online Unsupervised Arm Posture Adaptation for sEMG-based Gesture Recognition on a Parallel Ultra-Low-Power Microcontroller.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2023

Enhancing Performance, Calibration Time and Efficiency in Brain-Machine Interfaces through Transfer Learning and Wearable EEG Technology.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2023

Skilog: A Smart Sensor System for Performance Analysis and Biofeedback in Ski Jumping.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2023

An Adaptive Dynamic Mixing Model for sEMG Real-Time ICA on an Ultra-Low Power Processor.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2023

EpiDeNet: An Energy-Efficient Approach to Seizure Detection for Embedded Systems.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2023

Towards a Novel Ultrasound System Based on Low-Frequency Feature Extraction From a Fully-Printed Flexible Transducer.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2023

Free Bits: Latency Optimization of Mixed-Precision Quantized Neural Networks on the Edge.
Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

SALSA: Simulated Annealing based Loop-Ordering Scheduler for DNN Accelerators.
Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

Embedded neuromorphic attention model leveraging a novel low-power heterogeneous platform.
Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

2022
A Construction Kit for Efficient Low Power Neural Network Accelerator Designs.
ACM Trans. Embed. Comput. Syst., September, 2022

Dataflow Driven Partitioning of Machine Learning Applications for Optimal Energy Use in Batteryless Systems.
ACM Trans. Embed. Comput. Syst., September, 2022

HEROv2 Software Development Kit (SDK) Docker Image.
Dataset, May, 2022

A Low-Power Transprecision Floating-Point Cluster for Efficient Near-Sensor Data Analytics.
IEEE Trans. Parallel Distributed Syst., 2022

HEROv2: Full-Stack Open-Source Research Platform for Heterogeneous Computing.
IEEE Trans. Parallel Distributed Syst., 2022

Leveraging Tactile Sensors for Low Latency Embedded Smart Hands for Prosthetic and Robotic Applications.
IEEE Trans. Instrum. Meas., 2022

Trimming Feature Extraction and Inference for MCU-Based Edge NILM: A Systematic Approach.
IEEE Trans. Ind. Informatics, 2022

Vau Da Muntanialas: Energy-Efficient Multi-Die Scalable Acceleration of RNN Inference.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks.
IEEE Trans. Circuits Syst. I Regul. Pap., 2022

Optimizing Random Forest-Based Inference on RISC-V MCUs at the Extreme Edge.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration With Better-Than-Binary Energy Efficiency.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

An Open-Source Platform for High-Performance Non-Coherent On-Chip Communication.
IEEE Trans. Computers, 2022

An open platform for efficient drone-to-sensor wireless ranging and data harvesting.
Sustain. Comput. Informatics Syst., 2022

Fly, Wake-up, Find: UAV-based Energy-efficient Localization for Distributed Sensor Nodes.
Sustain. Comput. Informatics Syst., 2022

Traffic Load Estimation from Structural Health Monitoring sensors using supervised learning.
Sustain. Comput. Informatics Syst., 2022

Efficient Low-Frequency SSVEP Detection with Wearable EEG Using Normalized Canonical Correlation Analysis.
Sensors, 2022

Vega: A Ten-Core SoC for IoT Endnodes With DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode.
IEEE J. Solid State Circuits, 2022

Fully Onboard AI-Powered Human-Drone Pose Estimation on Ultralow-Power Autonomous Flying Nano-UAVs.
IEEE Internet Things J., 2022

Exploring Scalable, Distributed Real-Time Anomaly Detection for Bridge Health Monitoring.
IEEE Internet Things J., 2022

Embedding Temporal Convolutional Networks for Energy-efficient PPG-based Heart Rate Monitoring.
ACM Trans. Comput. Heal., 2022

A Heterogeneous In-Memory Computing Cluster for Flexible End-to-End Inference of Real-World Deep Neural Networks.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2022

Self-sustaining Ultra-wideband Positioning System for Event-driven Indoor Localization.
CoRR, 2022

CONVOLVE: Smart and seamless design of smart edge processors.
CoRR, 2022

TCN-CUTIE: A 1036 TOp/s/W, 2.72 uJ/Inference, 12.2 mW All-Digital Ternary Accelerator in 22 nm FDX Technology.
CoRR, 2022

In-memory factorization of holographic perceptual representations.
CoRR, 2022

Aerosense: A Self-Sustainable And Long-Range Bluetooth Wireless Sensor Node for Aerodynamic and Aeroacoustic Monitoring on Wind Turbines.
CoRR, 2022

WideVision: A Low-Power, Multi-Protocol Wireless Vision Platform for Distributed Surveillance.
Proceedings of the 18th International Conference on Wireless and Mobile Computing, 2022

An Optimized Heart Rate Detection System Based on Low-Power Microcontroller Platforms for Biosignal Processing.
Proceedings of the Advances in System-Integrated Intelligence, 2022

Rule-Based Thermal Anomaly Detection for Tier-0 HPC Systems.
Proceedings of the High Performance Computing. ISC High Performance 2022 International Workshops - Hamburg, Germany, May 29, 2022

Monte Cimone: Paving the Road for the First Generation of RISC-V High-Performance Computers.
Proceedings of the 35th IEEE International System-on-Chip Conference, 2022

Automatic Extraction of Muscle Fascicle Pennation Angle from Raw Ultrasound Data.
Proceedings of the IEEE Sensors Applications Symposium, 2022

Towards the Future Generation of Railway Localization and Signaling Exploiting sub-meter RTK GNSS.
Proceedings of the IEEE Sensors Applications Symposium, 2022

ControlPULP: A RISC-V Power Controller for HPC Processors with Parallel Control-Law Computation Acceleration.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2022

PULP-TrainLib: Enabling On-Device Training for RISC-V Multi-core MCUs Through Performance-Driven Autotuning.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2022

A Data-Driven Approach to Lightweight DVFS-Aware Counter-Based Power Modeling for Heterogeneous Platforms.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2022

PULP: Extreme Energy Efficiency for Extreme Edge AI Acceleration.
Proceedings of the 11th Mediterranean Conference on Embedded Computing, 2022

On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Soft Tiles: Capturing Physical Implementation Flexibility for Tightly-Coupled Parallel Processing Clusters.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2022

Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained Deep Neural Networks.
Proceedings of the ISLPED '22: ACM/IEEE International Symposium on Low Power Electronics and Design, Boston, MA, USA, August 1, 2022

Hier-3D: A Hierarchical Physical Design Methodology for Face-to-Face-Bonded 3D ICs.
Proceedings of the ISLPED '22: ACM/IEEE International Symposium on Low Power Electronics and Design, Boston, MA, USA, August 1, 2022

Parallelizing Optical Flow Estimation on an Ultra-Low Power RISC-V Cluster for Nano-UAV Navigation.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2022

Demo Abstract: Towards Reliable Obstacle Avoidance for Nano-UAVs.
Proceedings of the 21st ACM/IEEE International Conference on Information Processing in Sensor Networks, 2022

Spatz: A Compact Vector Processing Unit for High-Performance and Energy-Efficient Shared-L1 Clusters.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

Towards a Multi-Pixel Time-of-Flight Indoor Navigation System for Nano-Drone Applications.
Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, 2022

Kraken: A Direct Event/Frame-Based Multi-sensor Fusion SoC for Ultra-Efficient Visual Processing in Nano-UAVs.
Proceedings of the 2022 IEEE Hot Chips 34 Symposium, 2022

Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes.
Proceedings of the 13th IEEE International Green and Sustainable Computing Conference, 2022

ViT-LR: Pushing the Envelope for Transformer-Based on-Device Embedded Continual Learning.
Proceedings of the 13th IEEE International Green and Sustainable Computing Conference, 2022

Machine Learning Methodologies to Support HPC Systems Operations: Anomaly Detection.
Proceedings of the Euro-Par 2022: Parallel Processing Workshops, 2022

Analysing Supercomputer Nodes Behaviour with the Latent Representation of Deep Learning Models.
Proceedings of the Euro-Par 2022: Parallel Processing, 2022

In-memory Realization of In-situ Few-shot Continual Learning with a Dynamically Evolving Explicit Memory.
Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

Darkside: 2.6GFLOPS, 8.7mW Heterogeneous RISC-V Cluster for Extreme-Edge On-Chip DNN Inference and Training.
Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

A 283 pJ/b 240 Mb/s Floating-Point Baseband Accelerator for Massive MU-MIMO in 22FDX.
Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

AEPUS: a tool for the Automated Extraction of Pennation angles in Ultrasound images with low Signal-to-noise ratio for plane-wave imaging.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

Energy-Efficient Tree-Based EEG Artifact Detection.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

A Wireless System for EEG Acquisition and Processing in an Earbud Form Factor with 600 Hours Battery Lifetime.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

BioWolf16: a 16-channel, 24-bit, 4kSPS Ultra-Low Power Platform for Wearable Clinical-grade Bio-potential Parallel Processing and Streaming.
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022

RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Ternarized TCN for μJ/Inference Gesture Recognition from DVS Event Frames.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

SNE: an Energy-Proportional Digital Accelerator for Sparse Event-Based Convolutions.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

A RDMA Interface for Ultra-Fast Ultrasound Data-Streaming over an Optical Link.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

MemPool-3D: Boosting Performance and Efficiency of Shared-L1 Memory Many-Core Clusters with 3D Integration.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Bioformers: Embedding Transformers for Ultra-Low Power sEMG-based Gesture Recognition.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Training Quantised Neural Networks with STE Variants: the Additive Noise Annealing Algorithm.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Constrained Few-shot Class-incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A 1036 TOp/s/W, 12.2 mW, 2.72 μJ/Inference All Digital TNN Accelerator in 22 nm FDX Technology for TinyML Applications.
Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2022

Semi-supervised anomaly detection on a Tier-0 HPC system.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

Reducing neural architecture search spaces with training-free statistics and computational graph clustering.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

Meet Monte Cimone: exploring RISC-V high performance compute clusters.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

Multi-level anomaly prediction in Tier-0 datacenter: a deep learning approach.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

Modeling the Thermal and Power Control Subsystem in HPC Processors.
Proceedings of the IEEE Conference on Control Technology and Applications, 2022

sEMG Neural Spikes Reconstruction for Gesture Recognition on a Low-Power Multicore Processor.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2022

A High SNR, Low-latency Dry EMG Acquisition System for Unobtrusive HMI Devices.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2022

EEGformer: Transformer-Based Epilepsy Detection on Raw EEG Traces for Low-Channel-Count Wearable Continuous Monitoring Devices.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2022

Improving PPG-based Heart-Rate Monitoring with Synthetically Generated Data.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2022

A "New Ara" for Vector Computing: An Open Source Highly Efficient RISC-V V 1.0 Vector Processor Design.
Proceedings of the 33rd IEEE International Conference on Application-specific Systems, 2022

MiniFloat-NN and ExSdotp: An ISA Extension and a Modular Open Hardware Unit for Low-Precision Training on RISC-V Cores.
Proceedings of the 29th IEEE Symposium on Computer Arithmetic, 2022

An Energy-Efficient Spiking Neural Network for Finger Velocity Decoding for Implantable Brain-Machine Interface.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

Tiny-PULP-Dronets: Squeezing Neural Networks for Faster and Lighter Inference on Multi-Tasking Autonomous Nano-Drones.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

Towards On-device Domain Adaptation for Noise-Robust Keyword Spotting.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

Adversarially-Trained Tiny Autoencoders for Near-Sensor Continuous Structural Health Monitoring.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

Scale up your In-Memory Accelerator: Leveraging Wireless-on-Chip Communication for AIMC-based CNN Inference.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

2021
Efficient image dataset classification difficulty estimation for predicting deep-learning accuracy.
Vis. Comput., 2021

Arnold: An eFPGA-Augmented RISC-V SoC for Flexible and Low-Power IoT End Nodes.
IEEE Trans. Very Large Scale Integr. Syst., 2021

RNN-Based Radio Resource Management on Multicore RISC-V Accelerator Architectures.
IEEE Trans. Very Large Scale Integr. Syst., 2021

A Fully Integrated 5-mW, 0.8-Gbps Energy-Efficient Chip-to-Chip Data Link for Ultralow-Power IoT End-Nodes in 65-nm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2021

FPnew: An Open-Source Multiformat Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing.
IEEE Trans. Very Large Scale Integr. Syst., 2021

An SRAM-Based Multibit In-Memory Matrix-Vector Multiplier With a Precision That Scales Linearly in Area, Time, and Power.
IEEE Trans. Very Large Scale Integr. Syst., 2021

HPC Cooling: A Flexible Modeling Tool for Effective Design and Management.
IEEE Trans. Sustain. Comput., 2021

Energy-Efficient Hardware-Accelerated Synchronization for Shared-L1-Memory Multiprocessor Clusters.
IEEE Trans. Parallel Distributed Syst., 2021

LightSpeed: A Compact, High-Speed Optical-Link-Based 3D Optoacoustic Imager.
IEEE Trans. Medical Imaging, 2021

An Ensemble of Hyperdimensional Classifiers: Hardware-Friendly Short-Latency Seizure Detection With Automatic iEEG Electrode Selection.
IEEE J. Biomed. Health Informatics, 2021

RTK-LoRa: High-Precision, Long-Range, and Energy-Efficient Localization for Mobile IoT Devices.
IEEE Trans. Instrum. Meas., 2021

Energy-Efficient PRBS Impedance Spectroscopy on a Digital Versatile Platform.
IEEE Trans. Instrum. Meas., 2021

XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Networks on RISC-V Based IoT End Nodes.
IEEE Trans. Emerg. Top. Comput., 2021

The Predictable Execution Model in Practice: Compiling Real Applications for COTS Hardware.
ACM Trans. Embed. Comput. Syst., 2021

Energy Efficient In-Memory Hyperdimensional Encoding for Spatio-Temporal Signal Processing.
IEEE Trans. Circuits Syst. II Express Briefs, 2021

A 0.5GHz 0.35mW LDO-Powered Constant-Slope Phase Interpolator With 0.22% INL.
IEEE Trans. Circuits Syst. II Express Briefs, 2021

A 5 μW Standard Cell Memory-Based Configurable Hyperdimensional Computing Accelerator for Always-on Smart Sensing.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Automated Design Space Exploration for Optimized Deployment of DNN on Arm Cortex-A CPUs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

Snitch: A Tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads.
IEEE Trans. Computers, 2021

Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores.
IEEE Trans. Computers, 2021

HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs.
IEEE Trans. Computers, 2021

Efficient Pipelined Execution of CNNs Based on In-Memory Computing and Graph Homomorphism Verification.
IEEE Trans. Computers, 2021

COUNTDOWN: A Run-Time Library for Performance-Neutral Energy Saving in MPI Applications.
IEEE Trans. Computers, 2021

Guest Editorial: IEEE TC Special Issue On Smart Edge Computing and IoT.
IEEE Trans. Computers, 2021

Sub-100 μW Multispectral Riemannian Classification for EEG-Based Brain-Machine Interfaces.
IEEE Trans. Biomed. Circuits Syst., 2021

Energy-Positive Activity Recognition - From Kinetic Energy Harvesting to Smart Self-Sustainable Wearable Devices.
IEEE Trans. Biomed. Circuits Syst., 2021

Q-PPG: Energy-Efficient PPG-Based Heart Rate Monitoring on Wearable Devices.
IEEE Trans. Biomed. Circuits Syst., 2021

Robustifying the Deployment of tinyML Models for Autonomous Mini-Vehicles.
Sensors, 2021

RF-Powered Low-Energy Sensor Nodes for Predictive Maintenance in Electromagnetically Harsh Industrial Environments.
Sensors, 2021

Manticore: A 4096-Core RISC-V Chiplet Architecture for Ultraefficient Floating-Point Computing.
IEEE Micro, 2021

TinyRadarNN: Combining Spatial and Temporal Convolutional Neural Networks for Embedded Gesture Recognition With Short Range Radars.
IEEE Internet Things J., 2021

Embedded Streaming Principal Components Analysis for Network Load Reduction in Structural Health Monitoring.
IEEE Internet Things J., 2021

Accelerating Inference of Convolutional Neural Networks Using In-memory Computing.
Frontiers Comput. Neurosci., 2021

A TinyML Platform for On-Device Continual Learning With Quantized Latent Replays.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2021

Improving Autonomous Nano-Drones Performance via Automated End-to-End Optimization and Deployment of DNNs.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2021

Efficient Transform Algorithms for Parallel Ultra-Low-Power IoT End Nodes.
IEEE Embed. Syst. Lett., 2021

Improving Memory Utilization in Convolutional Neural Network Accelerators.
IEEE Embed. Syst. Lett., 2021

Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Codesign.
IEEE Des. Test, 2021

Guest Editors' Introduction: Machine Intelligence at the Edge.
IEEE Des. Test, 2021

Vega: A 10-Core SoC for IoT End-Nodes with DNN Acceleration and Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode.
CoRR, 2021

A Fully-Integrated 5mW, 0.8Gbps Energy-Efficient Chip-to-Chip Data Link for Ultra-Low-Power IoT End-Nodes in 65-nm CMOS.
CoRR, 2021

Memory-Aware Partitioning of Machine Learning Applications for Optimal Energy Use in Batteryless Systems.
CoRR, 2021

SmartHand: Towards Embedded Smart Hands for Prosthetic and Robotic Applications.
CoRR, 2021

Structural Health Monitoring system with Narrowband IoT and MEMS sensors.
CoRR, 2021

Implementing CNN Layers on the Manticore Cluster-Based Many-Core Architecture.
CoRR, 2021

Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design.
CoRR, 2021

Fully Onboard AI-powered Human-Drone Pose Estimation on Ultra-low Power Autonomous Flying Nano-UAVs.
CoRR, 2021

DiG: enabling out-of-band scalable high-resolution monitoring for data-center analytics, automation and control (extended).
Clust. Comput., 2021

Near-channel classifier: symbiotic communication and classification in high-dimensional space.
Brain Informatics, 2021

A Sub-mW Dual-Engine ML Inference System-on-Chip for Complete End-to-End Face-Analysis at the Edge.
Proceedings of the 2021 Symposium on VLSI Circuits, Kyoto, Japan, June 13-19, 2021

Hardware-In-The-Loop Emulation for Agile Co-Design of Parallel Ultra-Low Power IoT Processors.
Proceedings of the 29th IFIP/IEEE International Conference on Very Large Scale Integration, 2021

Low-Overhead Early-Stopping Policies for Efficient Random Forests Inference on Microcontrollers.
Proceedings of the VLSI-SoC: Technology Advancement on SoC Design, 2021

Adaptive Random Forests for Energy-Efficient Inference on Microcontrollers.
Proceedings of the 29th IFIP/IEEE International Conference on Very Large Scale Integration, 2021

RVfplib: A Fast and Compact Open-Source Floating-Point Emulation Library for Tiny RISC-V Processors.
Proceedings of the Embedded Computer Systems: Architectures, Modeling, and Simulation, 2021

Battery-Less Face Recognition at the Extreme Edge.
Proceedings of the 19th IEEE International New Circuits and Systems Conference, 2021

4.4 A 1.3TOPS/W @ 32GOPS Fully Integrated 10-Core SoC for IoT End-Nodes with 1.7μW Cognitive Wake-Up From MRAM-Based State-Retentive Sleep Mode.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

F1: Striking the Balance Between Energy Efficiency & Flexibility: General-Purpose vs Special-Purpose ML Processors.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

Session 9 Overview: ML Processors From Cloud to Edge Machine Learning Subcommittee.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

SE2: Going Remote: Challenges and Opportunities to Remote Learning, Work, and Collaboration.
Proceedings of the IEEE International Solid-State Circuits Conference, 2021

TCN Mapping Optimization for Ultra-Low Power Time-Series Edge Inference.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021

Mixed-Precision Quantization and Parallel Implementation of Multispectral Riemannian Classification for Brain-Machine Interfaces.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Robust and Energy-Efficient PPG-Based Heart-Rate Monitoring.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Robustifying the Deployment of tinyML Models for Autonomous Mini-Vehicles.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

H-Watch: An Open, Connected Platform for AI-Enhanced COVID19 Infection Symptoms Monitoring and Contact Tracing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Low-Power License Plate Detection and Recognition on a RISC-V Multi-Core MCU-Based Vision System.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

Tiny-FPU: Low-Cost Floating-Point Support for Small RISC-V MCU Cores.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

ChewBaccaNN: A Flexible 223 TOPS/W BNN Accelerator.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

A RISC-V in-network accelerator for flexible high-performance low-power packet processing.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

WindNode: A Long-Lasting And Long-Range Bluetooth Wireless Sensor Node for Pressure and Acoustic Monitoring on Wind Turbines.
Proceedings of the 4th IEEE International Conference on Industrial Cyber-Physical Systems, 2021

GVSoC: A Highly Configurable, Fast and Accurate Full-Platform Simulator for RISC-V based IoT Processors.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

Banshee: A Fast LLVM-Based RISC-V Binary Translator.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Towards Always-on Event-based Cameras for Long-lasting Battery-operated Smart Sensor Nodes.
Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, 2021

Model-based vs. Data-driven Approaches for Anomaly Detection in Structural Health Monitoring: a Case Study.
Proceedings of the IEEE International Instrumentation and Measurement Technology Conference, 2021

A 1.15 TOPS/W, 16-Cores Parallel Ultra-Low Power Cluster with 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode.
Proceedings of the 47th ESSCIRC 2021, 2021

A 10-core SoC with 20 Fine-Grain Power Domains for Energy-Proportional Data-Parallel Processing over a Wide Voltage and Temperature Range.
Proceedings of the 47th ESSCIRC 2021, 2021

UStEMG: an Ultrasound Transparent Tattoo-based sEMG System for Unobtrusive Parallel Acquisitions of Muscle Electro-mechanics.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

Microarchitectural Timing Channels and their Prevention on an Open-Source 64-bit RISC-V Core.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Fünfliber-Drone: A Modular Open-Platform 18-grams Autonomous Nano-Drone.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

FlyDVS: An Event-Driven Wireless Ultra-Low Power Visual Sensor Node.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Analyzing Memory Interference of FPGA Accelerators on Multicore Hosts in Heterogeneous Reconfigurable SoCs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

MemPool: A Shared-L1 Memory Many-Core Cluster with a Low-Latency Interconnect.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

RISC-V for Real-time MCUs - Software Optimization and Microarchitectural Gap Analysis.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Prediction of Thermal Hazards in a Real Datacenter Room Using Temporal Convolutional Networks.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for Temporal Convolutional Networks.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

sEMG-based Regression of Hand Kinematics with Temporal Convolutional Networks on a Low-Power Edge Microcontroller.
Proceedings of the 2021 IEEE International Conference on Omni-Layer Intelligent Systems, 2021

A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes.
Proceedings of the 2021 IEEE International Conference on Omni-Layer Intelligent Systems, 2021

Towards Long-term Non-invasive Monitoring for Epilepsy via Wearable EEG Devices.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, BioCAS 2021, 2021

To Buffer, or Not to Buffer? A Case Study on FFT Accelerators for Ultra-Low-Power Multicore Clusters.
Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

Streamlining the OpenMP Programming Model on Ultra-Low-Power Multi-core MCUs.
Proceedings of the Architecture of Computing Systems - 34th International Conference, 2021

End-to-end 100-TOPS/W Inference With Analog In-Memory Computing: Are We There Yet?
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

Automated Tuning of End-to-end Neural Flight Controllers for Autonomous Nano-drones.
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

ECG-TCN: Wearable Cardiac Arrhythmia Detection with a Temporal Convolutional Network.
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020
Ara: A 1-GHz+ Scalable and Energy-Efficient RISC-V Vector Processor With Multiprecision Floating-Point Support in 22-nm FD-SOI.
IEEE Trans. Very Large Scale Integr. Syst., 2020

Countdown Slack: A Run-Time Library to Reduce Energy Footprint in Large-Scale MPI Applications.
IEEE Trans. Parallel Distributed Syst., 2020

Bonseyes AI Pipeline - Bringing AI to You: End-to-end integration of data, algorithms, and deployment tools.
ACM Trans. Internet Things, 2020

BrightNet: A Deep CNN for OLED-Based Point of Care Immunofluorescent Diagnostic Systems.
IEEE Trans. Instrum. Meas., 2020

NB-IoT Versus LoRaWAN: An Experimental Evaluation for Industrial Applications.
IEEE Trans. Ind. Informatics, 2020

Thermal Model Identification of Computing Nodes in High-Performance Computing Systems.
IEEE Trans. Ind. Electron., 2020

CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams.
IEEE Trans. Circuits Syst. Video Technol., 2020

Always-On 674μW@4GOP/s Error Resilient Binary Neural Networks With Aggressive SRAM Voltage Scaling on a 22-nm IoT End-Node.
IEEE Trans. Circuits Syst., 2020

CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices.
IEEE Trans. Circuits Syst. II Express Briefs, 2020

FlexFloat: A Software Library for Transprecision Computing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Robust Identification of Thermal Models for In-Production High-Performance-Computing Clusters With Machine Learning-Based Data Selection.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Modular Design and Optimization of Biomedical Applications for Ultralow Power Heterogeneous Platforms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Hyperdimensional Computing With Local Binary Patterns: One-Shot Learning of Seizure Onset and Identification of Ictogenic Brain Regions Using Short-Time iEEG Recordings.
IEEE Trans. Biomed. Eng., 2020

Robust Real-Time Embedded EMG Recognition Framework Using Temporal Convolutional Networks on a Multicore IoT Processor.
IEEE Trans. Biomed. Circuits Syst., 2020

A2Event: A Micro-Watt Programmable Frequency-Time Detector for Always-On Energy-Neutral Sensing.
Sustain. Comput. Informatics Syst., 2020

FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things.
IEEE Internet Things J., 2020

pAElla: Edge AI-Based Real-Time Malware Detection in Data Centers.
IEEE Internet Things J., 2020

Performance-aware predictive-model-based on-chip body-bias regulation strategy for an ULP multi-core cluster in 28 nm UTBB FD-SOI.
Integr., 2020

Binarization Methods for Motor-Imagery Brain-Computer Interface Classification.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2020

Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2020

XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Network on RISC-V based IoT End Nodes.
CoRR, 2020

PsPIN: A high-performance low-power architecture for flexible in-network compute.
CoRR, 2020

Robust High-dimensional Memory-augmented Neural Networks.
CoRR, 2020

A transprecision floating-point cluster for efficient near-sensor data analytics.
CoRR, 2020

Manticore: A 4096-core RISC-V Chiplet Architecture for Ultra-efficient Floating-point Computing.
CoRR, 2020

Performance-Aware Predictive-Model-Based On-Chip Body-Bias Regulation Strategy for an ULP Multi-Core Cluster in 28nm UTBB FD-SOI.
CoRR, 2020

Always-On 674uW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node.
CoRR, 2020

FPnew: An Open-Source Multi-Format Floating-Point Unit Architecture for Energy-Proportional Transprecision Computing.
CoRR, 2020

Robust navigation with tinyML for autonomous mini-vehicles.
CoRR, 2020

Automated Design Space Exploration for optimised Deployment of DNN on Arm Cortex-A CPUs.
CoRR, 2020

Prevention of Microarchitectural Covert Channels on an Open-Source 64-bit RISC-V Core.
CoRR, 2020

Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads.
CoRR, 2020

RPR: Random Partition Relaxation for Training Binary and Ternary Weight Neural Networks.
CoRR, 2020

A Flexible, Low-Power Platform for UAV-Based Data Collection From Remote Sensors.
IEEE Access, 2020

EEG-TCNet: An Accurate Temporal Convolutional Network for Embedded Motor-Imagery Brain-Machine Interfaces.
Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, 2020

Q-EEGNet: an Energy-Efficient 8-bit Quantized Parallel EEGNet Implementation for Edge Motor-Imagery Brain-Machine Interfaces.
Proceedings of the IEEE International Conference on Smart Computing, 2020

Memory-Latency-Accuracy Trade-Offs for Continual Learning on a RISC-V Extreme-Edge Node.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2020

LLHD: a multi-level intermediate representation for hardware description languages.
Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2020

Leveraging Automated Mixed-Low-Precision Quantization for Tiny Edge Microcontrollers.
Proceedings of the IoT Streams for Data-Driven Predictive Maintenance and IoT, Edge, and Mobile for Embedded Machine Learning, 2020

Explainable Deep Learning for Medical Time Series Data.
Proceedings of the Wireless Mobile Communication and Healthcare, 2020

Memory-Driven Mixed Low Precision Quantization for Enabling Deep Network Inference on Microcontrollers.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

An Accurate EEGNet-based Motor-Imagery Brain-Computer Interface for Low-Power Edge Computing.
Proceedings of the 2020 IEEE International Symposium on Medical Measurements and Applications, 2020

A Synergistic Approach to Predictable Compilation and Scheduling on Commodity Multi-Cores.
Proceedings of the 21st ACM SIGPLAN/SIGBED International Conference on Languages, 2020

A Mixed-Precision RISC-V Processor for Extreme-Edge DNN Inference.
Proceedings of the 2020 IEEE Computer Society Annual Symposium on VLSI, 2020

The AMPERE Project: A Model-driven development framework for highly Parallel and EneRgy-Efficient computation supporting multi-criteria optimization.
Proceedings of the 23rd IEEE International Symposium on Real-Time Distributed Computing, 2020

Integrating event-based dynamic vision sensors with sparse hyperdimensional computing: a low-power accelerator with online learning capability.
Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

Sound event detection with binary neural networks on tightly power-constrained IoT devices.
Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

A Feature Reduction Strategy For Enabling Lightweight Non-Intrusive Load Monitoring On Edge Devices.
Proceedings of the 29th IEEE International Symposium on Industrial Electronics, 2020

An Energy-Efficient Low-Voltage Swing Transceiver for mW-Range IoT End-Nodes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

Live Demonstration: Exploiting Body-Biasing for Static Corner Trimming and Maximum Energy Efficiency Operation in 22nm FDX Technology.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2020

An Energy-efficient Localization System for Imprecisely Positioned Sensor Nodes with Flying UAVs.
Proceedings of the 18th IEEE International Conference on Industrial Informatics, 2020

Towards a compact, high-speed optical link-based 3D optoacoustic imager.
Proceedings of the 2020 IEEE Sensors, Rotterdam, The Netherlands, October 25-28, 2020

Ultra-High Frequency (500 MHz) Capacitance Spectroscopy for Nanobiosensing.
Proceedings of the 2020 IEEE Sensors, Rotterdam, The Netherlands, October 25-28, 2020

Ultra-low energy pest detection for smart agriculture.
Proceedings of the 2020 IEEE Sensors, Rotterdam, The Netherlands, October 25-28, 2020

A Low Power and Smart Power Unit for Kinetic Self-Sustainable Wearable Devices.
Proceedings of the 27th IEEE International Conference on Electronics, Circuits and Systems, 2020

Energy-Efficient Adaptive Machine Learning on IoT End-Nodes With Class-Dependent Confidence.
Proceedings of the 27th IEEE International Conference on Electronics, Circuits and Systems, 2020

An Open-Source Scalable Thermal and Power Controller for HPC Processors.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

Enhancing Structural Health Monitoring with Vehicle Identification and Tracking.
Proceedings of the 2020 IEEE International Instrumentation and Measurement Technology Conference, 2020

A 4096-core RISC-V Chiplet Architecture for Ultra-efficient Floating-point Computing.
Proceedings of the IEEE Hot Chips 32 Symposium, 2020

Predicting Hard Disk Failures in Data Centers Using Temporal Convolutional Neural Networks.
Proceedings of the Euro-Par 2020: Parallel Processing Workshops, 2020

Using Low-Power, Low-Cost IoT Processors in Clinical Biosignal Research: an In-depth Feasibility Check.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

TRANSPIRE: An energy-efficient TRANSprecision floating-point Programmable archItectuRE.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

InfiniWolf: Energy Efficient Smart Bracelet for Edge Computing with Dual Source Energy Harvesting.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Energy-Efficient Two-level Instruction Cache Design for an Ultra-Low-Power Multi-core Cluster.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

Compressing Subject-specific Brain-Computer Interface Models into One Model by Superposition in Hyperdimensional Space.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors Through ISA Extensions.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Extending the RISC-V ISA for Efficient RNN-based 5G Radio Resource Management.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

XwattPilot: A Full-stack Cloud System Enabling Agile Development of Transprecision Software for Low-power SoCs.
Proceedings of the 2020 IEEE Symposium in Low-Power and High-Speed Chips, 2020

Design of an open-source bridge between non-coherent burst-based and coherent cache-line-based memory systems.
Proceedings of the 17th ACM International Conference on Computing Frontiers, 2020

Combining learning and optimization for transprecision computing.
Proceedings of the 17th ACM International Conference on Computing Frontiers, 2020

Mixed-data-model heterogeneous compilation and OpenMP offloading.
Proceedings of the CC '20: 29th International Conference on Compiler Construction, 2020

BYOC: A "Bring Your Own Core" Framework for Heterogeneous-ISA Research.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

Temporal Variability Analysis in sEMG Hand Grasp Recognition using Temporal Convolutional Networks.
Proceedings of the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2020

Neuro-PULP: A Paradigm Shift Towards Fully Programmable Platforms for Neural Interfaces.
Proceedings of the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2020

Evolvable Hyperdimensional Computing: Unsupervised Regeneration of Associative Memory to Recover Faulty Components.
Proceedings of the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2020

Binary Models for Motor-Imagery Brain-Computer Interfaces: Sparse Random Projection and Binarized SVM.
Proceedings of the 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2020

2019
StreamDrive: a Dynamic Dataflow Framework for Clustered Embedded Architectures.
J. Signal Process. Syst., 2019

Extending the Lifetime of Nano-Blimps via Dynamic Motor Control.
J. Signal Process. Syst., 2019

The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-Ready 1.7-GHz 64-Bit RISC-V Core in 22-nm FDSOI Technology.
IEEE Trans. Very Large Scale Integr. Syst., 2019

Self-Sustaining Acoustic Sensor With Programmable Pattern Recognition for Underwater Monitoring.
IEEE Trans. Instrum. Meas., 2019

FPGA Implementation of a Kalman-Based Motion Estimator for Levitated Nanoparticles.
IEEE Trans. Instrum. Meas., 2019

A Broadband Multi-Mode Compressive Sensing Current Sensor SoC in 0.16 µm CMOS.
IEEE Trans. Circuits Syst. I Regul. Pap., 2019

An Energy-Efficient Integrated Programmable Array Accelerator and Compilation Flow for Near-Sensor Ultralow Power Processing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Exploring Shared Virtual Memory for FPGA Accelerators with a Configurable IOMMU.
IEEE Trans. Computers, 2019

A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets.
IEEE Trans. Computers, 2019

BioWolf: A Sub-10-mW 8-Channel Advanced Brain-Computer Interface Platform With a Nine-Core Processor and BLE Connectivity.
IEEE Trans. Biomed. Circuits Syst., 2019

Online Learning and Classification of EMG-Based Gestures on a Parallel Ultra-Low Power Platform Using Hyperdimensional Computing.
IEEE Trans. Biomed. Circuits Syst., 2019

Slotted ALOHA on LoRaWAN-Design, Analysis, and Deployment.
Sensors, 2019

SmarTEG: An Autonomous Wireless Sensor Node for High Accuracy Accelerometer-Based Monitoring.
Sensors, 2019

Ultrasound as a Tool to Study Muscle-Tendon Functions during Locomotion: A Systematic Review of Applications.
Sensors, 2019

Efficient Biosignal Processing Using Hyperdimensional Computing: Network Templates for Combined Learning and Classification of ExG Signals.
Proc. IEEE, 2019

Combining PREM compilation and static scheduling for high-performance and predictable MPSoC execution.
Parallel Comput., 2019

The ANTAREX domain specific language for high performance computing.
Microprocess. Microsystems, 2019

Mr.Wolf: An Energy-Precision Scalable Parallel Ultra Low Power SoC for IoT Edge Processing.
IEEE J. Solid State Circuits, 2019

Hardware Optimizations of Dense Binary Hyperdimensional Computing: Rematerialization of Hypervectors, Binarized Bundling, and Combinational Associative Memory.
ACM J. Emerg. Technol. Comput. Syst., 2019

A Minimally Invasive Low-Power Platform for Real-Time Brain Computer Interaction Based on Canonical Correlation Analysis.
IEEE Internet Things J., 2019

A 64-mW DNN-Based Visual Navigation Engine for Autonomous Nano-Drones.
IEEE Internet Things J., 2019

Energy and power awareness in hardware schedulers for energy harvesting IoT SoCs.
Integr., 2019

Pricing schemes for energy-efficient HPC systems: Design and exploration.
Int. J. High Perform. Comput. Appl., 2019

EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2019

A semisupervised autoencoder-based approach for anomaly detection in high performance computing systems.
Eng. Appl. Artif. Intell., 2019

HR-SAR-Net: A Deep Neural Network for Urban Scene Segmentation from High-Resolution SAR Data.
CoRR, 2019

PULP-NN: Accelerating Quantized Neural Networks on Parallel Ultra-Low-Power RISC-V Processors.
CoRR, 2019

5 Parallel Prism: A topology for pipelined implementations of convolutional neural networks using computational memory.
CoRR, 2019

In-memory hyperdimensional computing.
CoRR, 2019

Ara: A 1 GHz+ Scalable and Energy-Efficient RISC-V Vector Processor with Multi-Precision Floating Point Support in 22 nm FD-SOI.
CoRR, 2019

Additive Noise Annealing and Approximation Properties of Quantized Neural Networks.
CoRR, 2019

Demo Abstract: Pible: Battery-Free Mote for Perpetual Indoor BLE Applications.
CoRR, 2019

The Cost of Application-Class Processing: Energy and Performance Analysis of a Linux-ready 1.7GHz 64bit RISC-V Core in 22nm FDSOI Technology.
CoRR, 2019

Optimally Scheduling CNN Convolutions for Efficient Memory Access.
CoRR, 2019

The ANTAREX Domain Specific Language for High Performance Computing.
CoRR, 2019

Adaptive EMG-based hand gesture recognition using hyperdimensional computing.
CoRR, 2019

Self-Sustainable Smart Ring for Long-Term Monitoring of Blood Oxygenation.
IEEE Access, 2019

EmbedUWB: Low Power Embedded High-Precision and Low Latency UWB Localization.
Proceedings of the 5th IEEE World Forum on Internet of Things, 2019

FANNCortexM: An Open Source Toolkit for Deployment of Multi-layer Neural Networks on ARM Cortex-M Family Microcontrollers: Performance Analysis with Stress Detection.
Proceedings of the 5th IEEE World Forum on Internet of Things, 2019

A 0.80pJ/flop, 1.24Tflop/sW 8-to-64 bit Transprecision Floating-Point Unit for a 64 bit RISC-V Processor in 22nm FD-SOI.
Proceedings of the 27th IFIP/IEEE International Conference on Very Large Scale Integration, 2019

Network-accelerated non-contiguous memory transfers.
Proceedings of the International Conference for High Performance Computing, 2019

An Energy-Efficient IoT node for HMI applications based on an ultra-low power Multicore Processor.
Proceedings of the IEEE Sensors Applications Symposium, 2019

Prediction of Time-to-Solution in Material Science Simulations Using Deep Learning.
Proceedings of the Platform for Advanced Scientific Computing Conference, 2019

Constrained deep neural network architecture search for IoT devices accounting for hardware calibration.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Ultra Low-Power Drowsiness Detection System with BioWolf.
Proceedings of the 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER), 2019

A multi-protocol system for configurable data streaming on IoT healthcare devices.
Proceedings of the IEEE 8th International Workshop on Advances in Sensors and Interfaces, 2019

LoRa vs. LoRa: In-Field Evaluation and Comparison For Long-Lifetime Sensor Nodes.
Proceedings of the IEEE 8th International Workshop on Advances in Sensors and Interfaces, 2019

A RISC-V Based Open Hardware Platform for Always-On Wearable Smart Sensing.
Proceedings of the IEEE 8th International Workshop on Advances in Sensors and Interfaces, 2019

Secure Near-Sensor Analytics: the PULP approach.
Proceedings of the IEEE 8th International Workshop on Advances in Sensors and Interfaces, 2019

NETWIS: A Scalable and Robust Body Sensor Network For Biomedical Application.
Proceedings of the IEEE 8th International Workshop on Advances in Sensors and Interfaces, 2019

NTX: A 260 Gflop/sW Streaming Accelerator for Oblivious Floating-Point Algorithms in 22 nm FD-SOI.
Proceedings of the 2019 International SoC Design Conference, 2019

Towards a Wearable Interface for Food Quality Grading Through ERP Analysis.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

Experimental Evaluation on NB-IoT and LoRaWAN for Industrial and IoT Applications.
Proceedings of the 17th IEEE International Conference on Industrial Informatics, 2019

Low Power Embedded Gesture Recognition Using Novel Short-Range Radar Sensors.
Proceedings of the 2019 IEEE SENSORS, Montreal, QC, Canada, October 27-30, 2019, 2019

Paving the Way Toward Energy-Aware and Automated Datacentre.
Proceedings of the 48th International Conference on Parallel Processing, 2019

The Floating Point Trinity: A Multi-modal Approach to Extreme Energy-Efficiency and Performance.
Proceedings of the 26th IEEE International Conference on Electronics, Circuits and Systems, 2019

PULP-NN: A Computing Library for Quantized Neural Network inference at the edge on RISC-V Based Parallel Ultra Low Power Clusters.
Proceedings of the 26th IEEE International Conference on Electronics, Circuits and Systems, 2019

A PULP-based Parallel Power Controller for Future Exascale Systems.
Proceedings of the 26th IEEE International Conference on Electronics, Circuits and Systems, 2019

Thermal Characterization of a Tier0 Datacenter Room in Normal and Thermal Emergency Conditions.
Proceedings of the High Performance Computing in Science and Engineering, 2019

EdgeEye: A Long-Range Energy-Efficient Vision Node For Long-Term Edge Computing.
Proceedings of the Tenth International Green and Sustainable Computing Conference, 2019

On-line Testing for Autonomous Systems driven by RISC-V Processor Design Verification.
Proceedings of the 2019 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2019

An Open Source and Open Hardware Deep Learning-Powered Visual Navigation Engine for Autonomous Nano-UAVs.
Proceedings of the 15th International Conference on Distributed Computing in Sensor Systems, 2019

Design and Evaluation of SmallFloat SIMD extensions to the RISC-V ISA.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

NTX: An Energy-efficient Streaming Accelerator for Floating-point Generalized Reduction Workloads in 22 nm FD-SOI.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Applications of Computation-In-Memory Architectures based on Memristive Devices.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Hardware-Accelerated Energy-Efficient Synchronization and Communication for Ultra-Low-Power Tightly Coupled Clusters.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Taming Data Caches for Predictable Execution on GPU-based SoCs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Laelaps: An Energy-Efficient Seizure Detection Algorithm from Long-term Human iEEG Recordings without False Alarms.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

DORY: Lightweight memory hierarchy management for deep NN inference on IoT endnodes: work-in-progress.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019

Optimization and deployment of CNNs at the edge: the ALOHA experience.
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

Embedding principal component analysis for data reduction in structural health monitoring on low-cost IoT gateways.
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

Analysis of Contraction Effort Level in EMG-Based Gesture Recognition Using Hyperdimensional Computing.
Proceedings of the 2019 IEEE Biomedical Circuits and Systems Conference, 2019

An Energy Optimized JPEG Encoder for Parallel Ultra-Low-Power Processing-Platforms.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2019

Frequency Assignment in High Performance Computing Systems.
Proceedings of the AI*IA 2019 - Advances in Artificial Intelligence, 2019

Hyperdimensional Computing-based Multimodality Emotion Recognition with Physiological Signals.
Proceedings of the IEEE International Conference on Artificial Intelligence Circuits and Systems, 2019

Extended Bit-Plane Compression for Convolutional Neural Network Accelerators.
Proceedings of the IEEE International Conference on Artificial Intelligence Circuits and Systems, 2019

Online Anomaly Detection in HPC Systems.
Proceedings of the IEEE International Conference on Artificial Intelligence Circuits and Systems, 2019

Anomaly Detection Using Autoencoders in High Performance Computing Systems.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs.
ACM Trans. Reconfigurable Technol. Syst., 2018

Quantifying the Impact of Variability and Heterogeneity on the Energy Efficiency for a Next-Generation Ultra-Green Supercomputer.
IEEE Trans. Parallel Distributed Syst., 2018

Neurostream: Scalable and Energy Efficient Deep Learning with Smart Memory Cubes.
IEEE Trans. Parallel Distributed Syst., 2018

The Quest for Energy-Efficient I$ Design in Ultra-Low-Power Clustered Many-Cores.
IEEE Trans. Multi Scale Comput. Syst., 2018

Towards Edge-Aware Spatio-Temporal Filtering in Real-Time.
IEEE Trans. Image Process., 2018

Design and Evaluation of a Low-Power Sensor Device for Induced Rockfall Experiments.
IEEE Trans. Instrum. Meas., 2018

Runtime Support for Multiple Offload-Based Programming Models on Clustered Manycore Accelerators.
IEEE Trans. Emerg. Top. Comput., 2018

Energy-Aware Bio-Signal Compressed Sensing Reconstruction on the WBSN-Gateway.
IEEE Trans. Emerg. Top. Comput., 2018

Efficient, Long-Term Logging of Rich Data Sensors Using Transient Sensor Nodes.
ACM Trans. Embed. Comput. Syst., 2018

A Heterogeneous Multicore System on Chip for Energy Efficient Brain Inspired Computing.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

Synergistic HW/SW Approximation Techniques for Ultralow-Power Parallel Computing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

YodaNN: An Architecture for Ultralow Power Binary-Weight CNN Acceleration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Scheduling-based power capping in high performance computing systems.
Sustain. Comput. Informatics Syst., 2018

Lightweight IO virtualization on MPU enabled microcontrollers.
SIGBED Rev., 2018

On-Demand LoRa: Asynchronous TDMA for Energy Efficient and Low Latency Communication in IoT.
Sensors, 2018

Leveraging Energy Harvesting and Wake-Up Receivers for Long-Term Wireless Sensor Networks.
Sensors, 2018

Long-short range communication network leveraging LoRa™ and wake-up receiver.
Microprocess. Microsystems, 2018

A Multi-Sensor and Parallel Processing SoC for Miniaturized Medical Instrumentation.
IEEE J. Solid State Circuits, 2018

A 0.45-0.7 V 1-6 Gb/s 0.29-0.58 pJ/b Source-Synchronous Transceiver Using Near-Threshold Operation.
IEEE J. Solid State Circuits, 2018

Optimizing memory bandwidth exploitation for OpenVX applications on embedded many-core accelerators.
J. Real Time Image Process., 2018

An Energy Efficient E-Skin Embedded System for Real-Time Tactile Data Decoding.
J. Low Power Electron., 2018

A sensor fusion approach for drowsiness detection in wearable ultra-low-power systems.
Inf. Fusion, 2018

Hardware Transactional Memory Exploration in Coherence-Free Many-Core Architectures.
Int. J. Parallel Program., 2018

A 2.2-µW Cognitive Always-On Wake-Up Circuit for Event-Driven Duty-Cycling of IoT Sensor Nodes.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2018

Exploring Embedding Methods in Binary Hyperdimensional Computing: A Case Study for Motor-Imagery based Brain-Computer Interfaces.
CoRR, 2018

NTX: An Energy-efficient Streaming Accelerator for Floating-point Generalized Reduction Workloads in 22nm FD-SOI.
CoRR, 2018

Robust online identification of thermal models for in-production HPC clusters with machine learning-based data selection.
CoRR, 2018

On the Feasibility of FPGA Acceleration of Molecular Dynamics Simulations.
CoRR, 2018

COUNTDOWN - three, two, one, low power! A Run-time Library for Energy Saving in MPI Communication Primitives.
CoRR, 2018

Dwarf in a Giant: Enabling Scalable, High-Resolution HPC Energy Monitoring for Real-Time Profiling and Analytics.
CoRR, 2018

Ultra Low Power Deep-Learning-powered Autonomous Nano Drones.
CoRR, 2018

KRATOS: An Open Source Hardware-Software Platform for Rapid Research in LPWANs.
Proceedings of the 14th International Conference on Wireless and Mobile Computing, 2018

On-Demand TDMA for Energy Efficient Data Collection with LoRa and Wake-up Receiver.
Proceedings of the 14th International Conference on Wireless and Mobile Computing, 2018

An Open-Source Verification Framework for Open-Source Cores: A RISC-V Case Study.
Proceedings of the IFIP/IEEE International Conference on Very Large Scale Integration, 2018

BinaryEye: A 20 kfps Streaming Camera System on FPGA with Real-Time On-Device Image Recognition Using Binary Neural Networks.
Proceedings of the 13th IEEE International Symposium on Industrial Embedded Systems, 2018

Pible: battery-free mote for perpetual indoor BLE applications: demo abstract.
Proceedings of the 5th Conference on Systems for Built Environments, 2018

Pible: battery-free mote for perpetual indoor BLE applications.
Proceedings of the 5th Conference on Systems for Built Environments, 2018

On the Cost of Freedom From Interference in Heterogeneous SoCs.
Proceedings of the 21st International Workshop on Software and Compilers for Embedded Systems, 2018

Combining microbial fuel cell and ultra-low power event-driven audio detector for zero-power sensing in underwater monitoring.
Proceedings of the 2018 IEEE Sensors Applications Symposium, 2018

An accurate system for optimal state estimation of a levitated nanoparticle.
Proceedings of the 2018 IEEE Sensors Applications Symposium, 2018

Combining PREM compilation and ILP scheduling for high-performance and predictable MPSoC execution.
Proceedings of the 9th International Workshop on Programming Models and Applications for Multicores and Manycores, 2018

Torpor: A Power-Aware HW Scheduler for Energy Harvesting IoT SoCs.
Proceedings of the 28th International Symposium on Power and Timing Modeling, 2018

Hyperdrive: A Systolically Scalable Binary-Weight CNN Inference Engine for mW IoT End-Nodes.
Proceedings of the 2018 IEEE Computer Society Annual Symposium on VLSI, 2018

Design Automation for Binarized Neural Networks: A Quantum Leap Opportunity?
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

An EMG Gesture Recognition System with Flexible High-Density Sensors and Brain-Inspired High-Dimensional Classifier.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Live Demonstration: Body-Bias Based Performance Monitoring and Compensation for a Near-Threshold Multi-Core Cluster in 28nm FD-SOI Technology.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

A Transprecision Floating-Point Architecture for Energy-Efficient Embedded Computing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Smart Wearable Wristband for EMG based Gesture Recognition Powered by Solar Energy Harvester.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

An 826 MOPS, 210uW/MHz Unum ALU in 65 nm.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Modal Analysis of Structures with Low-cost Embedded Systems.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Hydra: An Accelerator for Real-Time Edge-Aware Permeability Filtering in 65nm CMOS.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Sub-mW multi-Gbps chip-to-chip communication Links for Ultra-Low Power IoT end-nodes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

Combining LoRa and RTK to achieve a high precision self-sustaining geo-localization system: poster abstract.
Proceedings of the 17th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2018

A Scalable Framework for Online Power Modelling of High-Performance Computing Nodes in Production.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

Architecture-aware design and implementation of CNN algorithms for embedded inference: the ALOHA project.
Proceedings of the 30th International Conference on Microelectronics, 2018

Nanowatt Wake-Up Radios: Discrete-Components and Integrated Architectures.
Proceedings of the 25th IEEE International Conference on Electronics, Circuits and Systems, 2018

Scalable and Efficient Virtual Memory Sharing in Heterogeneous SoCs with TLB Prefetching and MMU-Aware DMA Engine.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Rat Cortical Layers Classification extracting Evoked Local Field Potential Images with Implanted Multi-Electrode Sensor.
Proceedings of the 20th IEEE International Conference on e-Health Networking, 2018

A Self-Sustaining Micro-Watt Programmable Smart Audio Sensor for Always-On Sensing.
Proceedings of the Ninth International Green and Sustainable Computing Conference, 2018

Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features.
Proceedings of the 26th European Signal Processing Conference, 2018

Slotted ALOHA Overlay on LoRaWAN - A Distributed Synchronization Approach.
Proceedings of the 16th IEEE International Conference on Embedded and Ubiquitous Computing, 2018

An accurate low-cost Crackmeter with LoRaWAN communication and energy harvesting capability.
Proceedings of the 23rd IEEE International Conference on Emerging Technologies and Factory Automation, 2018

ALOHA: an architectural-aware framework for deep learning at the edge.
Proceedings of the Workshop on INTelligent Embedded Systems Architectures and Applications, 2018

Mr. Wolf: A 1 GFLOP/s Energy-Proportional Parallel Ultra Low Power SoC for IOT Edge Processing.
Proceedings of the 44th IEEE European Solid State Circuits Conference, 2018

ANTAREX: A DSL-Based Approach to Adaptively Optimizing and Enforcing Extra-Functional Properties in High Performance Computing.
Proceedings of the 21st Euromicro Conference on Digital System Design, 2018

A transprecision floating-point platform for ultra-low power computing.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

High speed ASIC implementations of leakage-resilient cryptography.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Energy proportionality in near-threshold computing servers and cloud data centers: Consolidating or Not?
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

HePREM: Enabling predictable GPU execution on heterogeneous SoC.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

PULP-HD: accelerating brain-inspired high-dimensional computing on a parallel ultra-low power platform.
Proceedings of the 55th Annual Design Automation Conference, 2018

XNORBIN: A 95 TOp/s/W hardware accelerator for binary convolutional neural networks.
Proceedings of the 2018 IEEE Symposium in Low-Power and High-Speed Chips, 2018

Quantized NNs as the definitive solution for inference on low-power ARM MCUs?: work-in-progress.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2018

Chipmunk: A systolically scalable 0.9 mm², 3.08Gop/s/mW @ 1.2 mW accelerator for near-sensor recurrent neural network inference.
Proceedings of the 2018 IEEE Custom Integrated Circuits Conference, 2018

Always-ON visual node with a hardware-software event-based binarized neural network inference engine.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

QUENN: QUantization engine for low-power neural networks.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

Thermal image-based CNN's for ultra-low power people recognition.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

A Wearable Device for Brain-Machine Interaction with Augmented Reality Head-Mounted Display.
Proceedings of the 13th EAI International Conference on Body Area Networks, 2018

A Cost-Effective Embedded Platform for Scalable Multichannel Biopotential Acquisition.
Proceedings of the 13th EAI International Conference on Body Area Networks, 2018

Embedded Classification of Local Field Potentials Recorded from Rat Barrel Cortex with Implanted Multi-Electrode Array.
Proceedings of the 2018 IEEE Biomedical Circuits and Systems Conference, 2018

One-shot Learning for iEEG Seizure Detection Using End-to-end Binary Operations: Local Binary Patterns with Hyperdimensional Computing.
Proceedings of the 2018 IEEE Biomedical Circuits and Systems Conference, 2018

GAP-8: A RISC-V SoC for AI at the Edge of the IoT.
Proceedings of the 29th IEEE International Conference on Application-specific Systems, 2018

A LoRaWAN Wireless Sensor Network for Data Center Temperature Monitoring.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2018

Evaluation of NTP/PTP fine-grain synchronization performance in HPC clusters.
Proceedings of the 2nd Workshop on AutotuniNg and aDaptivity AppRoaches for Energy efficient HPC Systems, 2018

HERO: an open-source research platform for HW/SW exploration of heterogeneous manycore systems.
Proceedings of the 2nd Workshop on AutotuniNg and aDaptivity AppRoaches for Energy efficient HPC Systems, 2018

COUNTDOWN: a run-time library for application-agnostic energy saving in MPI communication primitives.
Proceedings of the 2nd Workshop on AutotuniNg and aDaptivity AppRoaches for Energy efficient HPC Systems, 2018

2017
Near-Threshold RISC-V Core With DSP Extensions for Scalable IoT Endpoint Devices.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Logic-Base Interconnect Design for Near Memory Computing in the Smart Memory Cube.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Lightweight Virtual Memory Support for Zero-Copy Sharing of Pointer-Rich Data Structures in Heterogeneous Embedded SoCs.
IEEE Trans. Parallel Distributed Syst., 2017

A Generic Framework for Modeling MAC Protocols in Wireless Sensor Networks.
IEEE/ACM Trans. Netw., 2017

Accelerated Visual Context Classification on a Low-Power Smartwatch.
IEEE Trans. Hum. Mach. Syst., 2017

Efficient Virtual Memory Sharing via On-Accelerator Page Table Walking in Heterogeneous Embedded SoCs.
ACM Trans. Embed. Comput. Syst., 2017

Origami: A 803-GOp/s/W Convolutional Network Accelerator.
IEEE Trans. Circuits Syst. Video Technol., 2017

An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

Kinetic AC/DC Converter for Electromagnetic Energy Harvesting in Autonomous Wearable Devices.
IEEE Trans. Circuits Syst. II Express Briefs, 2017

Smart Energy-Efficient Clock Synthesizer for Duty-Cycled Sensor SoCs in 65 nm/28nm CMOS.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

WARM: Workload-Aware Reliability Management in Linux/Android.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

A Synchronization-Based Hybrid-Memory Multi-Core Architecture for Energy-Efficient Biomedical Signal Processing.
IEEE Trans. Computers, 2017

Energy Analysis of Decoders for Rakeness-Based Compressed Sensing of ECG Signals.
IEEE Trans. Biomed. Circuits Syst., 2017

Efficient Sample Delay Calculation for 2-D and 3-D Ultrasound Imaging.
IEEE Trans. Biomed. Circuits Syst., 2017

A Prosthetic Hand Body Area Controller Based on Efficient Pattern Recognition Control Strategies.
Sensors, 2017

Energy-Efficient Context Aware Power Management with Asynchronous Protocol for Body Sensor Network.
Mob. Networks Appl., 2017

Energy-Efficient Near-Threshold Parallel Computing: The PULPv2 Cluster.
IEEE Micro, 2017

Zeroing for HW-efficient compressed sensing architectures targeting data compression in wireless sensor networks.
Microprocess. Microsystems, 2017

Increasing the energy efficiency of microcontroller platforms with low-design margin co-processors.
Microprocess. Microsystems, 2017

An Extended Shared Logarithmic Unit for Nonlinear Function Kernel Acceleration in a 65-nm CMOS Multicore Cluster.
IEEE J. Solid State Circuits, 2017

A Sub-mW IoT-Endnode for Always-On Visual Monitoring and Smart Triggering.
IEEE Internet Things J., 2017

Leakage Bounds for Gaussian Side Channels.
IACR Cryptol. ePrint Arch., 2017

A Hybrid Instruction Prefetching Mechanism for Ultra Low-Power Multicore Clusters.
IEEE Embed. Syst. Lett., 2017

A Self-Aware Architecture for PVT Compensation and Power Nap in Near Threshold Processors.
IEEE Des. Test, 2017

HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA.
CoRR, 2017

An 826 MOPS, 210 uW/MHz Unum ALU in 65 nm.
CoRR, 2017

Chipmunk: A Systolically Scalable 0.9 mm², 3.08 Gop/s/mW @ 1.2 mW Accelerator for Near-Sensor Recurrent Neural Network Inference.
CoRR, 2017

Soft-to-Hard Vector Quantization for End-to-End Learned Compression of Images and Neural Networks.
CoRR, 2017

Networks on Chips: 15 Years Later.
Computer, 2017

Modeling and Evaluation of Application-Aware Dynamic Thermal Control in HPC Nodes.
Proceedings of the VLSI-SoC: Opportunities and Challenges Beyond the Internet of Things, 2017

Prediction horizon vs. efficiency of optimal dynamic thermal control policies in HPC nodes.
Proceedings of the 2017 IFIP/IEEE International Conference on Very Large Scale Integration, 2017

On the Accuracy of Near-Optimal CPU-Based Path Planning for UAVs.
Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems, 2017

The ANTAREX tool flow for monitoring and autotuning energy efficient HPC systems.
Proceedings of the 2017 International Conference on Embedded Computer Systems: Architectures, 2017

Slow and steady wins the race? A comparison of ultra-low-power RISC-V cores for Internet-of-Things applications.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

μDMA: An autonomous I/O subsystem for IoT end-nodes.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

Temperature and process-aware performance monitoring and compensation for an ULP multi-core cluster in 28nm UTBB FD-SOI technology.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

Approximate DIV and SQRT instructions for the RISC-V ISA: An efficiency vs. accuracy analysis.
Proceedings of the 27th International Symposium on Power and Timing Modeling, 2017

Energy Saving and Thermal Management Opportunities in a Workload-Aware MPI Runtime for a Scientific HPC Computing Node.
Proceedings of the Parallel Computing is Everywhere, 2017

Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Towards a Novel HMI Paradigm Based on Mixed EEG and Indoor Localization Platforms.
Proceedings of the New Generation of CAS, 2017

Energy Efficient System for Tactile Data Decoding Using an Ultra-Low Power Parallel Platform.
Proceedings of the New Generation of CAS, 2017

A wearable EEG-based drowsiness detection system with blink duration and alpha waves analysis.
Proceedings of the 8th International IEEE/EMBS Conference on Neural Engineering, 2017

Target following on nano-scale Unmanned Aerial Vehicles.
Proceedings of the 7th IEEE International Workshop on Advances in Sensors and Interfaces, 2017

DeepEmote: Towards multi-layer neural networks in a low power wearable multi-sensors bracelet.
Proceedings of the 7th IEEE International Workshop on Advances in Sensors and Interfaces, 2017

Plenty of room at the bottom? Micropower deep learning for cognitive cyber physical systems.
Proceedings of the 7th IEEE International Workshop on Advances in Sensors and Interfaces, 2017

A sub-10mW real-time implementation for EMG hand gesture recognition based on a multi-core biomedical SoC.
Proceedings of the 7th IEEE International Workshop on Advances in Sensors and Interfaces, 2017

LightProbe: A 64-channel programmable ultrasound transducer head with an integrated front-end and a 26.4 Gb/s optical link.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

A 142MOPS/mW integrated programmable array accelerator for smart visual processing.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

A wide tuning-range ADFLL for mW-SoCs with dithering-enhanced accuracy in 65 nm CMOS.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2017

Design of an Energy Aware Petaflops Class High Performance Cluster Based on Power Architecture.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

CAS-CNN: A deep convolutional neural network for image compression artifact suppression.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Long-term monitoring of small-sized birds using a miniaturized bluetooth-low-energy sensor node.
Proceedings of the 2017 IEEE SENSORS, Glasgow, United Kingdom, October 29, 2017

CBinfer: Change-Based Inference for Convolutional Neural Networks on Video Data.
Proceedings of the 11th International Conference on Distributed Smart Cameras, 2017

GPU-Accelerated Real-Time Path Planning and the Predictable Execution Model.
Proceedings of the International Conference on Computational Science, 2017

Multi-core data analytics SoC with a flexible 1.76 Gbit/s AES-XTS cryptographic accelerator in 65 nm CMOS.
Proceedings of the Fourth Workshop on Cryptography and Security in Computing Systems, 2017

Deep structured features for semantic segmentation.
Proceedings of the 25th European Signal Processing Conference, 2017

Impact of temporal subsampling on accuracy and performance in practical video classification.
Proceedings of the 25th European Signal Processing Conference, 2017

A multi-sensor and parallel processing SoC for wearable and implantable telemetry systems.
Proceedings of the 43rd IEEE European Solid State Circuits Conference, 2017

Paving the Way Towards a Highly Energy-Efficient and Highly Integrated Compute Node for the Exascale Revolution: The ExaNoDe Approach.
Proceedings of the Euromicro Conference on Digital System Design, 2017

Towards a Mobile Health Platform with Parallel Processing and Multi-sensor Capabilities.
Proceedings of the Euromicro Conference on Digital System Design, 2017

Ultra low-power visual odometry for nano-scale unmanned aerial vehicles.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

WULoRa: An energy efficient IoT end-node for energy harvesting and heterogeneous communication.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

A scan-chain based state retention methodology for IoT processors operating on intermittent energy.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

GPUguard: Towards supporting a predictable execution model for heterogeneous SoC.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Continuous learning of HPC infrastructure models using big data analytics and in-memory processing tools.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

An Ultra-Low Power Address-Event Sensor Interface for Energy-Proportional Time-to-Information Extraction.
Proceedings of the 54th Annual Design Automation Conference, 2017

Stream Drive: A Dynamic Dataflow Framework For Clustered Embedded Architectures.
Proceedings of the Computing Frontiers Conference, 2017

Self-Sustainability in Nano Unmanned Aerial Vehicles: A Blimp Case Study.
Proceedings of the Computing Frontiers Conference, 2017

A 2.1 μW event-driven wake-up circuit based on a level-crossing ADC for pattern recognition in healthcare.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2017

Efficient mapping of CDFG onto coarse-grained reconfigurable array architectures.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

Optimal Tiling Strategy for Memory Bandwidth Reduction for CNNs.
Proceedings of the Advanced Concepts for Intelligent Vision Systems, 2017

Benefits in Relaxing the Power Capping Constraint.
Proceedings of the 1st Workshop on AutotuniNg and aDaptivity AppRoaches for Energy efficient HPC Systems, 2017

2016
He-P2012: Performance and Energy Exploration of Architecturally Heterogeneous Many-Cores.
J. Signal Process. Syst., 2016

PULP: A Ultra-Low Power Parallel Accelerator for Energy-Efficient and Flexible Embedded Vision.
J. Signal Process. Syst., 2016

Ekho: A 30.3W, 10k-Channel Fully Digital Integrated 3-D Beamformer for Medical Ultrasound Imaging Achieving 298M Focal Points per Second.
IEEE Trans. Very Large Scale Integr. Syst., 2016

A Constraint Programming Scheduler for Heterogeneous High-Performance Computing Machines.
IEEE Trans. Parallel Distributed Syst., 2016

Power, Area, and Performance Optimization of Standard Cell Memory Arrays Through Controlled Placement.
ACM Trans. Design Autom. Electr. Syst., 2016

Design, Implementation, and Performance Evaluation of a Flexible Low-Latency Nanowatt Wake-Up Radio Receiver.
IEEE Trans. Ind. Informatics, 2016

Integrated Energy-Aware Management of Supercomputer Hybrid Cooling Systems.
IEEE Trans. Ind. Informatics, 2016

VirtualSoC: A Research Tool for Modern MPSoCs.
ACM Trans. Embed. Comput. Syst., 2016

Hybrid ASIC/FPGA System for Fully Automatic Stereo-to-Multiview Conversion Using IDW.
IEEE Trans. Circuits Syst. Video Technol., 2016

Thermal Analysis and Interpolation Techniques for a Logic + WideIO Stacked DRAM Test Chip.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Hibernus++: A Self-Calibrating and Adaptive System for Transiently-Powered Embedded Devices.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Graceful Performance Modulation for Power-Neutral Transient Computing Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

InfiniTime: Multi-sensor wearable bracelet with human body harvesting.
Sustain. Comput. Informatics Syst., 2016

Variability Mitigation in Nanometer CMOS Integrated Systems: A Survey of Techniques From Circuits to Software.
Proc. IEEE, 2016

Controlling NUMA effects in embedded manycore applications with lightweight nested parallelism support.
Parallel Comput., 2016

Associative Memristive Memory for Approximate Computing in GPUs.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2016

CIRCA-GPUs: Increasing Instruction Reuse Through Inexact Computing in GP-GPUs.
IEEE Des. Test, 2016

Computationally Efficient Target Classification in Multispectral Image Data with Deep Neural Networks.
CoRR, 2016

A dual-band wake-up radio for ultra-low power Wireless Sensor Networks.
Proceedings of the IEEE Topical Conference on Wireless Sensors and Sensor Networks, 2016

Predictive Modeling for Job Power Consumption in HPC Systems.
Proceedings of the High Performance Computing - 31st International Conference, 2016

A Dual Processor Energy-Efficient Platform with Multi-core Accelerator for Smart Sensing.
Proceedings of the Sensor Systems and Software - 7th International Conference, S-Cube 2016, 2016

SHelmet: An Intelligent Self-sustaining Multi Sensors Smart Helmet for Bikers.
Proceedings of the Sensor Systems and Software - 7th International Conference, S-Cube 2016, 2016

SNW-MAC: An Asynchronous Protocol Leveraging Wake-Up Receivers for Data Gathering in Star Networks.
Proceedings of the Sensor Systems and Software - 7th International Conference, S-Cube 2016, 2016

A high-efficiency runtime reconfigurable IP for CNN acceleration on a mid-range all-programmable SoC.
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2016

YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

4.6 A 65nm CMOS 6.4-to-29.2pJ/FLOP@0.8V shared logarithmic floating point unit for acceleration of nonlinear function kernels in a tightly coupled processor cluster.
Proceedings of the 2016 IEEE International Solid-State Circuits Conference, 2016

A heterogeneous multi-core system-on-chip for energy efficient brain inspired vision.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Autonomous smartwatch with flexible sensors for accurate and continuous mapping of skin temperature.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Energy-efficient design of an always-on smart visual trigger.
Proceedings of the IEEE International Smart Cities Conference, 2016

Poster Abstract: KinetiSee - A Perpetual Wearable Camera Acquisition System with a Kinetic Harvester.
Proceedings of the 15th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2016

Poster Abstract: An Ultra-Low Power Wake up Radio with Addressing and Retransmission Capabilities for Advanced Energy Efficient MAC Protocols.
Proceedings of the 15th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2016

Poster Abstract: MagoNode++ - A Wake-Up-Radio-Enabled Wireless Sensor Mote for Energy-Neutral Applications.
Proceedings of the 15th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2016

Poster Abstract: Wake-Up Receivers for Energy Efficient and Low Latency Communication.
Proceedings of the 15th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2016

A contactless three-phase autonomous power meter.
Proceedings of the 2016 IEEE SENSORS, Orlando, FL, USA, October 30 - November 3, 2016, 2016

Evaluation of synchronization protocols for fine-grain HPC sensor data time-stamping and collection.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

Cooling-aware node-level task allocation for next-generation green HPC systems.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

Thermal model identification of supercomputing nodes in production environment.
Proceedings of the IECON 2016, 2016

Hyperdimensional biosignal processing: A case study for EMG-based hand gesture recognition.
Proceedings of the IEEE International Conference on Rebooting Computing, 2016

Always-on motion detection with application-level error control on a near-threshold approximate computing platform.
Proceedings of the 2016 IEEE International Conference on Electronics, Circuits and Systems, 2016

VarDroid: Online Variability Emulation in Android/Linux Platforms.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

Analytical and Experimental Evaluation of Wake-Up Receivers Based Protocols.
Proceedings of the 2016 IEEE Global Communications Conference, 2016

Mobile Ultrasound Imaging on Heterogeneous Multi-Core Platforms.
Proceedings of the 14th ACM/IEEE Symposium on Embedded Systems for Real-Time Multimedia, 2016

A 2 MS/s 10A Hall current sensor SoC with digital compressive sensing encoder in 0.16 µm BCD.
Proceedings of the ESSCIRC Conference 2016: 42nd European Solid-State Circuits Conference, 2016

Context Change Detection for an Ultra-Low Power Low-Resolution Ego-Vision Imager.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

DARDIS: Distributed And Randomized DIspatching and Scheduling.
Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016

A Low Latency and Energy Efficient Communication Architecture for Heterogeneous Long-Short Range Communication.
Proceedings of the 2016 Euromicro Conference on Digital System Design, 2016

Autotuning and adaptivity approach for energy efficient Exascale HPC systems: The ANTAREX approach.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

High-efficiency logarithmic number unit design based on an improved cotransformation scheme.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Towards near-threshold server processors.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Low-power multichannel spectro-temporal feature extraction circuit for audio pattern wake-up.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

A power-efficient 3-D on-chip interconnect for multi-core accelerators with stacked L2 cache.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Dynamic energy burst scaling for transiently powered systems.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

An optimized task-based runtime system for resource-constrained parallel accelerators.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Quantifying the benefits of compressed sensing on a WBSN-based real-time biosignal monitor.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Enabling the heterogeneous accelerator model on ultra-low power microcontroller platforms.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

193 MOPS/mW @ 162 MOPS, 0.32V to 1.15V voltage range multi-core accelerator for energy efficient parallel and sequential digital processing.
Proceedings of the 2016 IEEE Symposium in Low-Power and High-Speed Chips, 2016

The ANTAREX approach to autotuning and adaptivity for energy efficient HPC systems.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

An energy-efficient parallel algorithm for real-time near-optimal UAV path planning.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Curbing the roofline: a scalable and flexible architecture for CNNs on FPGA.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

PHIDIAS: ultra-low-power holistic design for smart bio-signals computing platforms.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Sub-PicoJoule per operation scalable computing: why, when, how?
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Enabling OpenVX support in mW-scale parallel accelerators.
Proceedings of the 2016 International Conference on Compilers, 2016

Application of compressed sensing to ECG signals: Decoder-side benefits of the rakeness approach.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2016

Sampling modulation: An energy efficient novel feature extraction for biosignal processing.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2016

Scalable EEG seizure detection on an ultra low power multi-core architecture.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2016

Accuracy and Performance Trade-Offs of Logarithmic Number Units in Multi-Core Clusters.
Proceedings of the 23rd IEEE Symposium on Computer Arithmetic, 2016

Design and Evaluation of a Processing-in-Memory Architecture for the Smart Memory Cube.
Proceedings of the Architecture of Computing Systems - ARCS 2016, 2016

A Contactless, Energy-Neutral Power Meter for Smart City Applications.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2016

Long-Range Radio for Underground Sensors in Geothermal Energy Systems.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2016

2015
Cost-Effective Design of Mesh-of-Tree Interconnect for Multicore Clusters With 3-D Stacked L2 Scratchpad Memory.
IEEE Trans. Very Large Scale Integr. Syst., 2015

A Modular Shared L2 Memory Design for 3-D Integration.
IEEE Trans. Very Large Scale Integr. Syst., 2015

GPU Acceleration for Simulating Massively Parallel Many-Core Platforms.
IEEE Trans. Parallel Distributed Syst., 2015

Simplifying Many-Core-Based Heterogeneous SoC Programming With Offload Directives.
IEEE Trans. Ind. Informatics, 2015

Guaranteed Computational Resprinting via Model-Predictive Control.
ACM Trans. Embed. Comput. Syst., 2015

3D CV Descriptor on Parallel Heterogeneous Platforms.
ACM Trans. Embed. Comput. Syst., 2015

A Reconfigurable 5-to-14 bit SAR ADC for Battery-Powered Medical Instrumentation.
IEEE Trans. Circuits Syst. I Regul. Pap., 2015

A Low-Power Architecture for Punctured Compressed Sensing and Estimation in Wireless Sensor-Nodes.
IEEE Trans. Circuits Syst. I Regul. Pap., 2015

Energy-Efficiency Analysis of Analog and Digital Compressive Sensing in Wireless Sensors.
IEEE Trans. Circuits Syst. I Regul. Pap., 2015

Architecture Support for Tightly-Coupled Multi-Core Clusters with Shared-Memory HW Accelerators.
IEEE Trans. Computers, 2015

A Versatile Embedded Platform for EMG Acquisition and Gesture Recognition.
IEEE Trans. Biomed. Circuits Syst., 2015

Aging-Aware Compilation for GP-GPUs.
ACM Trans. Archit. Code Optim., 2015

Sub-Sampling Framework Comparison for Low-Power Data Gathering: A Comparative Analysis.
Sensors, 2015

Temperature variation aware multi-scale delay, power and thermal analysis at RT and gate level.
Integr., 2015

Hibernus: Sustaining Computation During Intermittent Supply for Energy-Harvesting Systems.
IEEE Embed. Syst. Lett., 2015

A 2.4 GHz-868 MHz dual-band wake-up radio for wireless sensor network and IoT.
Proceedings of the 11th IEEE International Conference on Wireless and Mobile Computing, 2015

Tailoring instruction-set extensions for an ultra-low power tightly-coupled cluster of OpenRISC cores.
Proceedings of the 2015 IFIP/IEEE International Conference on Very Large Scale Integration, 2015

Automatic multiview synthesis - Prototype demo.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

Automatic multiview synthesis - Towards a mobile system on a chip.
Proceedings of the 2015 Visual Communications and Image Processing, 2015

A framework for optimizing OpenVX applications performance on embedded manycore accelerators.
Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems, 2015

Self-powered wireless sensor nodes for monitoring radioactivity in contaminated areas using unmanned aerial vehicles.
Proceedings of the IEEE Sensors Applications Symposium, 2015

Non-intrusive Zigbee power meter for load monitoring in smart buildings.
Proceedings of the IEEE Sensors Applications Symposium, 2015

Experimental evaluation of a sEMG-based human-robot interface for human-like grasping tasks.
Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Long-Term ECG monitoring with zeroing Compressed Sensing approach.
Proceedings of the Nordic Circuits and Systems Conference, 2015

An Energy Neutral Wearable Camera with EPD Display.
Proceedings of the 2015 workshop on Wearable Systems and Applications, 2015

ADRENALINE: An OpenVX Environment to Optimize Embedded Vision Applications on Many-core Accelerators.
Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2015

Enabling Scalable and Fine-Grained Nested Parallelism on Embedded Many-cores.
Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2015

Energy-Aware Bio-signal Compressed Sensing Reconstruction: FOCUSS on the WBSN-Gateway.
Proceedings of the IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2015

Towards Internet of Things for event-driven low-power gas sensing using carbon nanotubes.
Proceedings of the 6th International Workshop on Advances in Sensors and Interfaces, 2015

Synergistic Architecture and Programming Model Support for Approximate Micropower Computing.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

3.8 A 0.45-to-0.7V 1-to-6Gb/S 0.29-to-0.58pJ/b source-synchronous transceiver using automatic phase calibration in 65nm CMOS.
Proceedings of the 2015 IEEE International Solid-State Circuits Conference, 2015

Message from the general chairs.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Ultra-Low Power Context Recognition Fusing Sensor Data from an Energy-Neutral Smart Watch.
Proceedings of the Internet of Things. IoT Infrastructures, 2015

Beyond duty cycling: Wake-up radio with selective awakenings for long-lived wireless sensing systems.
Proceedings of the 2015 IEEE Conference on Computer Communications, 2015

Hybrid EMG classifier based on HMM and SVM for hand gesture recognition in prosthetics.
Proceedings of the IEEE International Conference on Industrial Technology, 2015

Exploring architectural heterogeneity in intelligent vision systems.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

PULP: A parallel ultra low power platform for next generation IoT applications.
Proceedings of the 2015 IEEE Hot Chips 27 Symposium (HCS), 2015

Playing with Fire: Transactional Memory Revisited for Error-Resilient and Energy-Efficient MPSoC Execution.
Proceedings of the 25th edition on Great Lakes Symposium on VLSI, GLVLSI 2015, Pittsburgh, PA, USA, May 20, 2015

Origami: A Convolutional Network Accelerator.
Proceedings of the 25th edition on Great Lakes Symposium on VLSI, GLVLSI 2015, Pittsburgh, PA, USA, May 20, 2015

Extending Body Sensor Nodes' Lifetime Using a Wearable Wake-up Radio.
Proceedings of the Future Access Enablers for Ubiquitous and Intelligent Infrastructures, 2015

Context Aware Power Management Enhanced by Radio Wake Up in Body Area Networks.
Proceedings of the 13th IEEE International Conference on Embedded and Ubiquitous Computing, 2015

Digitally controlled feedback for DC offset cancellation in a wearable multichannel EMG platform.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

DRAM or no-DRAM?: exploring linear solver architectures for image domain warping in 28 nm CMOS.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Approximate associative memristive memory for energy-efficient GPUs.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Tackling the bottleneck of delay tables in 3D ultrasound imaging.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Paper, pen and ink: an innovative system and software framework to assist writing rehabilitation.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Reducing energy consumption in microcontroller-based platforms with low design margin co-processors.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Energy-aware cooling for hot-water cooled supercomputers.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

An ultra-low power dual-mode ECG monitor for healthcare and wellness.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

High performance AXI-4.0 based interconnect for extensible smart memory cubes.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

A ultra-low-energy convolution engine for fast brain-inspired vision in multicore clusters.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

Task scheduling strategies to mitigate hardware variability in embedded shared memory clusters.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Accelerating real-time embedded scene labeling with convolutional networks.
Proceedings of the 52nd Annual Design Automation Conference, 2015

ANTAREX - AutoTuning and Adaptivity appRoach for Energy Efficient eXascale HPC Systems.
Proceedings of the 18th IEEE International Conference on Computational Science and Engineering, 2015

Power Capping in High Performance Computing Systems.
Proceedings of the Principles and Practice of Constraint Programming, 2015

Lightweight virtual memory support for many-core accelerators in heterogeneous embedded SoCs.
Proceedings of the 2015 International Conference on Hardware/Software Codesign and System Synthesis, 2015

An Evaluation of Memory Sharing Performance for Heterogeneous Embedded SoCs with Many-Core Accelerators.
Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

Runtime Support for Multiple Offload-Based Programming Models on Embedded Manycore Accelerators.
Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

Exploring multi-banked shared-L1 program cache on ultra-low power, tightly coupled processor clusters.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Multiple Biopotentials Acquisition System for Wearable Applications.
Proceedings of the BIODEVICES 2015, 2015

Controlled placement of standard cell memory arrays for high density and low power in 28nm FD-SOI.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

A Smart LED Light Control System for Environmentally Friendly Buildings.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2015

Sensormind: Virtual Sensing and Complex Event Detection for Internet of Things.
Proceedings of the Applications in Electronics Pervading Industry, Environment and Society, 2015

A CP Scheduler for High-Performance Computers.
Proceedings of the Doctoral Consortium (DC) co-located with the 14th Conference of the Italian Association for Artificial Intelligence (AI*IA 2015), 2015

2014
A Novel Object-Oriented Software Cache for Scratchpad-Based Multi-Core Clusters.
J. Signal Process. Syst., 2014

Ensuring Survivability of Resource-Intensive Sensor Networks Through Ultra-Low Power Overlays.
IEEE Trans. Ind. Informatics, 2014

Compressive Sensing Optimization for Signal Ensembles in WSNs.
IEEE Trans. Ind. Informatics, 2014

Extended Wireless Monitoring Through Intelligent Hybrid Energy Supply.
IEEE Trans. Ind. Electron., 2014

Bias-Compensated Least Squares Identification of Distributed Thermal Models for Many-Core Systems-on-Chip.
IEEE Trans. Circuits Syst. I Regul. Pap., 2014

Application-Adaptive Guardbanding to Mitigate Static and Dynamic Variability.
IEEE Trans. Computers, 2014

At-Speed Distributed Functional Testing to Detect Logic and Delay Faults in NoCs.
IEEE Trans. Computers, 2014

An Effective Gray-Box Identification Procedure for Multicore Thermal Modeling.
IEEE Trans. Computers, 2014

Clamp-and-Forget: A self-sustainable non-invasive wireless sensor node for smart metering applications.
Microelectron. J., 2014

A low power wireless node for contact and contactless heart monitoring.
Microelectron. J., 2014

An ultra-low power resilient multi-core architecture with static and dynamic tolerance to ambient temperature-induced variability.
Microprocess. Microsystems, 2014

Message Passing-Aware Power Management on Many-Core Systems.
J. Low Power Electron., 2014

Sleep power minimisation using adaptive duty-cycling of DC-DC converters in state-retentive systems.
IET Circuits Devices Syst., 2014

Improving Resilience to Timing Errors by Exposing Variability Effects to Software in Tightly-Coupled Processor Clusters.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2014

CROSS cyclic resource-constrained scheduling solver.
Artif. Intell., 2014

An ultra low power high sensitivity wake-up radio receiver with addressing capability.
Proceedings of the IEEE 10th International Conference on Wireless and Mobile Computing, 2014

Optimized active and power-down mode refresh control in 3D-DRAMs.
Proceedings of the 22nd International Conference on Very Large Scale Integration, 2014

Energy-efficient vision on the PULP platform for ultra-low power parallel computing.
Proceedings of the 2014 IEEE Workshop on Signal Processing Systems, 2014

Speculative synchronization for coherence-free embedded NUMA architectures.
Proceedings of the XIVth International Conference on Embedded Computer Systems: Architectures, 2014

Towards EMG control interface for smart garments.
Proceedings of the ISWC'14, 2014

Quantifying the impact of variability on the energy efficiency for a next-generation ultra-green supercomputer.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Approximate compressed sensing: ultra-low power biosignal processing via aggressive voltage scaling on a hybrid memory multi-core processor.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

An architecture for low-power compressed sensing and estimation in wireless sensor nodes.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

A Virtualization Framework for IOMMU-less Many-Core Accelerators.
Proceedings of the 2nd International Workshop on Many-core Embedded Systems, 2014

A high-sensitivity fully passive wake-up radio front-end for wireless sensor nodes.
Proceedings of the IEEE International Conference on Consumer Electronics, 2014

Dynamic variability management in mobile multicore processors under lifetime constraints.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Optimum: Thermal-aware task allocation for heterogeneous many-core devices.
Proceedings of the International Conference on High Performance Computing & Simulation, 2014

InfiniTime: A multi-sensor energy neutral wearable bracelet.
Proceedings of the International Green Computing Conference, 2014

Efficient parallel beamforming for 3D ultrasound imaging.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

Customizing an open source processor to fit in an ultra-low power cluster with a shared L1 memory.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

An On-line Reliability Emulation Framework.
Proceedings of the 12th IEEE International Conference on Embedded and Ubiquitous Computing, 2014

A HLS-Based Toolflow to Design Next-Generation Heterogeneous Many-Core Platforms with Shared Memory.
Proceedings of the 12th IEEE International Conference on Embedded and Ubiquitous Computing, 2014

Energy optimization in 3D MPSoCs with Wide-I/O DRAM using temperature variation aware bank-wise refresh.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Temporal memoization for energy-efficient timing error recovery in GPGPUs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

A Linux-governor based Dynamic Reliability Manager for android mobile devices.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

A multi banked - Multi ported - Non blocking shared L2 cache for MPSoC platforms.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Context aware power management for motion-sensing body area network nodes.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Tightly-coupled hardware support to dynamic parallelism acceleration in embedded shared memory clusters.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

A tightly-coupled hardware controller to improve scalability and programmability of shared-memory heterogeneous clusters.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Hybrid memory architecture for voltage scaling in ultra-low power multi-core biomedical processors.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Thermal analysis and model identification techniques for a logic + WIDEIO stacked DRAM test chip.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Unveiling Eurora - Thermal and power characterization of the most energy-efficient supercomputer in the world.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

Optimizing memory bandwidth in OpenVX graph execution on embedded many-core accelerators.
Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing, 2014

0, 1, 2, many - A classroom occupancy monitoring system for smart public buildings.
Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing, 2014

Rakeness-based compressed sensing on ultra-low power multi-core biomedical processors.
Proceedings of the 2014 Conference on Design and Architectures for Signal and Image Processing, 2014

An Approximate Computing Technique for Reducing the Complexity of a Direct-Solver for Sparse Linear Systems in Real-Time Video Processing.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Energy-Efficient GPGPU Architectures via Collaborative Compilation and Memristive Memory-Based Computing.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Gesture Recognition in Ego-centric Videos Using Dense Trajectories and Hand Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014

Brain-Inspired Classroom Occupancy Monitoring on a Low-Power Mobile Platform.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014

Supporting localized OpenVX kernel execution for efficient computer vision application development on STHORM many-core platform.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

Ultra-low-latency lightweight DMA for tightly coupled multi-core clusters.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

Analysis of Robust Implementation of an EMG Pattern Recognition based Control.
Proceedings of the BIOSIGNALS 2014, 2014

Assessing the area/power/performance tradeoffs for an integrated fully-digital, large-scale 3D-ultrasound beamformer.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2014

EMG-based hand gesture recognition with flexible analog front end.
Proceedings of the IEEE Biomedical Circuits and Systems Conference, 2014

Exploring DMA-assisted prefetching strategies for software caches on multicore clusters.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

SIR10US: A tightly coupled elliptic-curve cryptography co-processor for the OpenRISC.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

He-P2012: Architectural heterogeneity exploration on a scalable many-core platform.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

2013
Aging-Aware Energy-Efficient Workload Allocation for Mobile Multimedia Platforms.
IEEE Trans. Parallel Distributed Syst., 2013

Thermal and Energy Management of High-Performance Multicores: Distributed and Self-Calibrating Model-Predictive Controller.
IEEE Trans. Parallel Distributed Syst., 2013

Designing best effort networks-on-chip to meet hard latency constraints.
ACM Trans. Embed. Comput. Syst., 2013

Spatial Memoization: Concurrent Instruction Reuse to Correct Timing Errors in SIMD Architectures.
IEEE Trans. Circuits Syst. II Express Briefs, 2013

Exploration and Optimization of 3-D Integrated DRAM Subsystems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Computing Accurate Performance Bounds for Best Effort Networks-on-Chip.
IEEE Trans. Computers, 2013

Robust Scheduling of Task Graphs under Execution Time Uncertainty.
IEEE Trans. Computers, 2013

An integrated, programming model-driven framework for NoC-QoS support in cluster-based embedded many-cores.
Parallel Comput., 2013

Maximum-throughput mapping of SDFGs on multi-core SoC platforms.
J. Parallel Distributed Comput., 2013

A case for three-dimensional stacking of tightly coupled data memories over multi-core clusters using low-latency interconnects.
IET Comput. Digit. Tech., 2013

Multimodal Video Analysis on Self-Powered Resource-Limited Wireless Smart Camera.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2013

SIMinG-1k: A thousand-core simulator running on general-purpose graphical processing units.
Concurr. Comput. Pract. Exp., 2013

Wearable low power dry surface wireless sensor node for healthcare monitoring application.
Proceedings of the 9th IEEE International Conference on Wireless and Mobile Computing, 2013

A Complete Real-Time Feature Extraction and Matching System Based on Semantic Kernels Binarized.
Proceedings of the VLSI-SoC: At the Crossroads of Emerging Trends, 2013

SWIFTNET: A data acquisition protocol for fast-reactive monitoring applications.
Proceedings of the 8th IEEE International Symposium on Industrial Embedded Systems, 2013

Heterogeneous multi-harvester for wireless sensor networks.
Proceedings of the 1st International Workshop on Energy Neutral Sensing Systems, 2013

Improving the efficiency of air-flow energy harvesters combining active and passive rectifiers.
Proceedings of the 1st International Workshop on Energy Neutral Sensing Systems, 2013

Powering wireless sensor nodes with micro fuel cells.
Proceedings of the 1st International Workshop on Energy Neutral Sensing Systems, 2013

Power saving policies for multipurpose WBAN.
Proceedings of the 2013 23rd International Workshop on Power and Timing Modeling, 2013

A variation tolerant architecture for ultra low power multi-processor cluster.
Proceedings of the 2013 23rd International Workshop on Power and Timing Modeling, 2013

On-line thermal emulation: How to speed-up your thermal controller design.
Proceedings of the 2013 23rd International Workshop on Power and Timing Modeling, 2013

3D logarithmic interconnect: Stacking multiple L1 memory dies over multi-core clusters.
Proceedings of the 2013 Seventh IEEE/ACM International Symposium on Networks-on-Chip (NoCS), 2013

Efficient energy management and data recovery in sensor networks using latent variables based tensor factorization.
Proceedings of the 16th ACM International Conference on Modeling, 2013

Clamp-and-measure forever: A MOSFET-based circuit for energy harvesting and measurement targeted for power meters.
Proceedings of the 5th IEEE International Workshop on Advances in Sensors and Interfaces, 2013

A versatile biomedical wireless sensor node with novel drysurface sensors and energy efficient power management.
Proceedings of the 5th IEEE International Workshop on Advances in Sensors and Interfaces, 2013

Designing next-generation smart sensor hubs for the Internet-of-Things.
Proceedings of the 5th IEEE International Workshop on Advances in Sensors and Interfaces, 2013

Prolonging the lifetime of wireless sensor networks using light-weight forecasting algorithms.
Proceedings of the 2013 IEEE Eighth International Conference on Intelligent Sensors, 2013

Transparent and energy-efficient speculation on NUMA architectures for embedded MPSoCs.
Proceedings of the 1st International Workshop on Many-core Embedded Systems 2013, 2013

Improving the programmability of STHORM-based heterogeneous systems with offload-enabled OpenMP.
Proceedings of the 1st International Workshop on Many-core Embedded Systems 2013, 2013

VirtualSoC: A Full-System Simulation Environment for Massively Parallel Heterogeneous System-on-Chip.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Low-power wireless accelerometer-based system for wear detection of bandsaw blades.
Proceedings of the 11th IEEE International Conference on Industrial Informatics, 2013

A power-aware multi harvester power unit with hydrogen fuel cell for embedded systems in outdoor applications.
Proceedings of the International Green Computing Conference, 2013

Energy and performance exploration of accelerator coherency port using Xilinx ZYNQ.
Proceedings of the 10th FPGAworld Conference, 2013

Errors-in-variables identification of thermal models for many-core computing systems.
Proceedings of the 12th European Control Conference, 2013

An Ambient Temperature Variation Tolerance Scheme for an Ultra Low Power Shared-L1 Processor Cluster.
Proceedings of the 2013 Euromicro Conference on Digital System Design, 2013

An Application-Specific Forecasting Algorithm for Extending WSN Lifetime.