Paolo Ienne

Orcid: 0000-0002-6142-7345

Affiliations:
  • Swiss Federal Institute of Technology in Lausanne, Switzerland


According to our database, Paolo Ienne authored at least 215 papers between 1993 and 2024.

Collaborative distances:
  • Dijkstra number of three.
  • Erdős number of two.

Bibliography

2024
Exploring FPGA Switch-Blocks without Explicitly Listing Connectivity Patterns.
ACM Trans. Reconfigurable Technol. Syst., March, 2024

DynaRapid: From C to FPGA in a Few Seconds.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

Survival of the Fastest: Enabling More Out-of-Order Execution in Dataflow Circuits.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024

2023
Resource Sharing in Dataflow Circuits.
ACM Trans. Reconfigurable Technol. Syst., December, 2023

Introduction to the Special Section on FPGA 2022.
ACM Trans. Reconfigurable Technol. Syst., December, 2023

Fast Parallel Algorithms for Enumeration of Simple, Temporal, and Hop-constrained Cycles.
ACM Trans. Parallel Comput., September, 2023

Regularity Matters: Designing Practical FPGA Switch-Blocks.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

Straight to the Queue: Fast Load-Store Queue Allocation in Dataflow Circuits.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

2022
Detailed Placement for Dedicated LUT-Level FPGA Interconnect.
ACM Trans. Reconfigurable Technol. Syst., 2022

Buffer Placement and Sizing for High-Performance Dataflow Circuits.
ACM Trans. Reconfigurable Technol. Syst., 2022

Request, Coalesce, Serve, and Forget: Miss-Optimized Memory Systems for Bandwidth-Bound Cache-Unfriendly Applications on FPGAs.
ACM Trans. Reconfigurable Technol. Syst., 2022

From C/C++ Code to High-Performance Dataflow Circuits.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

DASS: Combining Dynamic & Static Scheduling in High-Level Synthesis.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Scalable Fine-Grained Parallel Cycle Enumeration Algorithms.
Proceedings of the SPAA '22: 34th ACM Symposium on Parallelism in Algorithms and Architectures, Philadelphia, PA, USA, July 11, 2022

A Comprehensive Timing Model for Accurate Frequency Tuning in Dataflow Circuits.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

Unleashing Parallelism in Elastic Circuits with Faster Token Delivery.
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022

2021
How Many CPU Cores is an FPGA Worth? Lessons Learned from Accelerating String Sorting on a CPU-FPGA System.
J. Signal Process. Syst., 2021

Large-Scale Graph Processing on FPGAs with Caches for Thousands of Simultaneous Misses.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Turning PathFinder Upside-Down: Exploring FPGA Switch-Blocks by Negotiating Switch Presence.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

Global Is the New Local: FPGA Architecture at 5nm and Beyond.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

2020
Many-Core Clique Enumeration with Fast Set Intersections.
Proc. VLDB Endow., 2020

Parallelizing Maximal Clique Enumeration on Modern Manycore Processors.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Timing-Driven Placement for FPGA Architectures with Dedicated Routing Paths.
Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020

Straight to the Point: Intra- and Intercluster LUT Connections to Mitigate the Delay of Programmable Routing.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

Invited Tutorial: Dynamatic: From C/C++ to Dynamically Scheduled Circuits.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

Combining Dynamic & Static Scheduling in High-level Synthesis.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

FPGAs in the Datacenters: the Case of Parallel Hybrid Super Scalar String Sample Sort.
Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020

2019
Snap-On User-Space Manager for Dynamically Reconfigurable System-on-Chips.
IEEE Access, 2019

Shrink It or Shed It! Minimize the Use of LSQs in Dataflow Designs.
Proceedings of the International Conference on Field-Programmable Technology, 2019

In Search of Lost Bandwidth: Extensive Reordering of DRAM Accesses on FPGA.
Proceedings of the International Conference on Field-Programmable Technology, 2019

Finding a Needle in the Haystack of Hardened Interconnect Patterns.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

DynaBurst: Dynamically Assemblying DRAM Bursts over a Multitude of Random Accesses.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

On Feasibility of FPGAs Without Dedicated Programmable Interconnect Structure.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Speculative Dataflow Circuits.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Stop Crying Over Your Cache Miss Rate: Handling Efficiently Thousands of Outstanding Misses in FPGAs.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

2018
Exploiting Compute Caches for Memory Bound Vector Operations.
Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018

Dynamically Scheduled High-level Synthesis.
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

LEOSoC: An Open-Source Cross-Platform Embedded Linux Library for Managing Hardware Accelerators in Heterogeneous System-on-Chips (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

FPGAs in the Datacenters: the Case of Parallel Hybrid Super Scalar String Sample Sort (pHS⁵) (Abstract Only).
Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

A Dynamically Reconfigurable Platform for High-Performance and Low-Power On-Board Processing.
Proceedings of the 2018 NASA/ESA Conference on Adaptive Hardware and Systems, 2018

Progressive Generation of Canonical Irredundant Sums of Products Using a SAT Solver.
Proceedings of the Advanced Logic Synthesis, 2018

2017
An Out-of-Order Load-Store Queue for Spatial Computing.
ACM Trans. Embed. Comput. Syst., 2017

An Accelerator for High Efficient Vision Processing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Virtualized Execution Runtime for FPGA Accelerators in the Cloud.
IEEE Access, 2017

Design Space Exploration of LDPC Decoders Using High-Level Synthesis.
IEEE Access, 2017

Improving Circuit Mapping Performance Through MIG-based Synthesis for Carry Chains.
Proceedings of the Great Lakes Symposium on VLSI 2017, 2017

Evaluating FPGA clusters under wide ranges of design parameters.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

NAND-NOR: A Compact, Fast, and Delay Balanced FPGA Logic Element.
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

Arbitrary Precision and Complexity Tradeoffs for Gate-Level Information Flow Tracking.
Proceedings of the 54th Annual Design Automation Conference, 2017

From C to elastic circuits.
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016
Designing Low Power and Durable Digital Blocks Using Shadow Nanoelectromechanical Relays.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Guest Editorial: Special Issue on Models and Methodologies for System Design.
ACM Trans. Embed. Comput. Syst., 2016

Guest Editors' Introduction: Special Section on Emerging Memory Technologies in Very Large Scale Computing and Storage Systems.
IEEE Trans. Computers, 2016

Introduction: Special Section on Architecture of Future Many Core Systems.
Microprocess. Microsystems, 2016

Heuristic NPN Classification for Large Functions Using AIGs and LEXSAT.
Proceedings of the Theory and Applications of Satisfiability Testing - SAT 2016, 2016

Fast generation of lexicographic satisfiable assignments: enabling canonicity in SAT-based applications.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

Imprecise security: quality and complexity tradeoffs for hardware information flow tracking.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

Automatic wire modeling to explore novel FPGA architectures.
Proceedings of the 2016 International Conference on Field-Programmable Technology, 2016

Enriching C-based High-Level Synthesis with parallel pattern templates.
Proceedings of the 2016 International Conference on Field-Programmable Technology, 2016

Fast hierarchical NPN classification.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Preface.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

Designing a virtual runtime for FPGA accelerators in the cloud.
Proceedings of the 26th International Conference on Field Programmable Logic and Applications, 2016

FPRESSO: Enabling Express Transistor-Level Exploration of FPGA Architectures.
Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2016

Instruction Set Extensions for secure applications.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

2015
Enhancing Design Space Exploration by Extending CPU/GPU Specifications onto FPGAs.
ACM Trans. Embed. Comput. Syst., 2015

Libra: Software-Controlled Cell Bit-Density to Balance Wear in NAND Flash.
ACM Trans. Embed. Comput. Syst., 2015

Automatic Application of Power Analysis Countermeasures.
IEEE Trans. Computers, 2015

Exploring Automatically Generated Platforms in High Performance FPGAs.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

ShiDianNao: shifting vision processing closer to the sensor.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

FudgeFactor: Syntax-Guided Synthesis for Accurate RTL Error Localization and Correction.
Proceedings of the Hardware and Software: Verification and Testing, 2015

Improved carry chain mapping for the VTR flow.
Proceedings of the 2015 International Conference on Field Programmable Technology, 2015

A technology mapper for depth-constrained FPGA logic cells.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

Automatic support for multi-module parallelism from computational patterns.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

From low-architectural expertise up to high-throughput non-binary LDPC decoders: Optimization guidelines using high-level synthesis.
Proceedings of the 25th International Conference on Field Programmable Logic and Applications, 2015

Fast Design Space Exploration Using Vivado HLS: Non-binary LDPC Decoders.
Proceedings of the 23rd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2015

Retraining-based timing error mitigation for hardware neural networks.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014
Way Stealing: A Unified Data Cache and Architecturally Visible Storage for Instruction Set Extensions.
IEEE Trans. Very Large Scale Integr. Syst., 2014

Virtual Ways: Low-Cost Coherence for Instruction Set Extensions with Architecturally Visible Storage.
ACM Trans. Archit. Code Optim., 2014

Reconfigurable Computing.
IEEE Micro, 2014

Constrained interpolation for guided logic synthesis.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

Hardware system synthesis from Domain-Specific Languages.
Proceedings of the 24th International Conference on Field Programmable Logic and Applications, 2014

Revisiting and-inverter cones.
Proceedings of the 2014 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2014

Wear unleveling: improving NAND flash lifetime by balancing page endurance.
Proceedings of the 12th USENIX conference on File and Storage Technologies, 2014

Energy efficient MIMO processing: A case study of opportunistic run-time approximations.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

SKETCHILOG: Sketching combinational circuits.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

A low-cost memory interface for high-throughput accelerators.
Proceedings of the 2014 International Conference on Compilers, 2014

2013
Selective Flexibility: Creating Domain-Specific Reconfigurable Arrays.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Cracking the complexity of fixed-point refinement in complex wireless systems.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2013

Spontaneous Reload Cache: Mimicking a Larger Cache with Minimal Hardware Requirement.
Proceedings of the IEEE Eighth International Conference on Networking, 2013

Making domain-specific hardware synthesis tools cost-efficient.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

Shadow And-Inverter Cones.
Proceedings of the 23rd International Conference on Field Programmable Logic and Applications, 2013

Shadow AICs: reaping the benefits of and-inverter cones with minimal architectural impact (abstract only).
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

Elastic CGRAs.
Proceedings of the 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2013

A Case for Heterogeneous Technology-Mapping: Soft Versus Hard Multiplexers.
Proceedings of the 21st IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2013

Accuracy vs speed tradeoffs in the estimation of fixed-point errors on linear time-invariant systems.
Proceedings of the Design, Automation and Test in Europe, 2013

Phœnix: reviving MLC blocks as SLC to extend NAND flash devices lifetime.
Proceedings of the Design, Automation and Test in Europe, 2013

Fast and accurate BER estimation methodology for I/O links based on extreme value theory.
Proceedings of the Design, Automation and Test in Europe, 2013

An EDA-friendly protection scheme against side-channel attacks.
Proceedings of the Design, Automation and Test in Europe, 2013

Sleuth: Automated Verification of Software Power Analysis Countermeasures.
Proceedings of the Cryptographic Hardware and Embedded Systems - CHES 2013, 2013

Automated circuit elaboration from incomplete architectural descriptions.
Proceedings of the 2013 Asilomar Conference on Signals, 2013

2012
Interaction Between Fault Attack Countermeasures and the Resistance Against Power Analysis Attacks.
Proceedings of the Fault Analysis in Cryptography, 2012

Guest Editors' Introduction: Special Section on Computer Arithmetic.
IEEE Trans. Computers, 2012

Making wide-issue VLIW processors viable on FPGAs.
ACM Trans. Archit. Code Optim., 2012

An architecture-independent instruction shuffler to protect against side-channel attacks.
ACM Trans. Archit. Code Optim., 2012

Counting stream registers: An efficient and effective snoop filter architecture.
Proceedings of the 2012 International Conference on Embedded Computer Systems: Architectures, 2012

Rethinking FPGAs: elude the flexibility excess of LUTs with and-inverter cones.
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Reducing the cost of floating-point mantissa alignment and normalization in FPGAs.
Proceedings of the ACM/SIGDA 20th International Symposium on Field Programmable Gate Arrays, 2012

Shortening Design Time through Multiplatform Simulations with a Portable OpenCL Golden-model: The LDPC Decoder Case.
Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

CRAW/P: A Workload Partition Method for the Efficient Parallel Simulation of Manycores.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Selective flexibility: Breaking the rigidity of datapath merging.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

Software controlled cell bit-density to improve NAND flash lifetime.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

2011
Compressor tree synthesis on commercial high-performance FPGAs.
ACM Trans. Reconfigurable Technol. Syst., 2011

Measuring and Reducing the Performance Gap between Embedded and Soft Multipliers on FPGAs.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2011

Reducing the pressure on routing resources of FPGAs with generic logic chains.
Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, 2011

Power-gated MOS current mode logic (PG-MCML): a power aware DPA-resistant standard cell library.
Proceedings of the 48th Design Automation Conference, 2011

A first step towards automatic application of power analysis countermeasures.
Proceedings of the 48th Design Automation Conference, 2011

2010
Improving FPGA Performance for Carry-Save Arithmetic.
IEEE Trans. Very Large Scale Integr. Syst., 2010

Fast, Nearly Optimal ISE Identification With I/O Serialization Through Maximal Clique Enumeration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2010

An Optimal Linear-Time Algorithm for Interprocedural Register Allocation in High Level Synthesis Using SSA Form.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2010

High performance comparison-based sorting algorithm on many-core GPUs.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Virtual Ways: Efficient Coherence for Architecturally Visible Storage in Automatic Instruction Set Extensions.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Synthesis of Floating-Point Addition Clusters on FPGAs Using Carry-Save Arithmetic.
Proceedings of the International Conference on Field Programmable Logic and Applications, 2010

Highly Versatile DSP Blocks for Improved FPGA Arithmetic Performance.
Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

A high-level synthesis flow for custom instruction set extensions for application-specific processors.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

2009
An FPGA Logic Cell and Carry Chain Configurable as a 6:2 or 7:2 Compressor.
ACM Trans. Reconfigurable Technol. Syst., 2009

Field Programmable Compressor Trees: Acceleration of Multi-Input Addition on FPGAs.
ACM Trans. Reconfigurable Technol. Syst., 2009

Evaluating Resistance of MCML Technology to Power Analysis Attacks Using a Simulation-Based Methodology.
Trans. Comput. Sci., 2009

Optimistic chordal coloring: a coalescing heuristic for SSA form programs.
Des. Autom. Embed. Syst., 2009

Architectural support for the orchestration of fine-grained multiprocessing for portable streaming applications.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2009

Introducing control-flow inclusion to support pipelining in custom instruction set extensions.
Proceedings of the IEEE 7th Symposium on Application Specific Processors, 2009

Arithmetic optimization for custom instruction set synthesis.
Proceedings of the IEEE 7th Symposium on Application Specific Processors, 2009

Iterative layering: Optimizing arithmetic circuits by structuring the information flow.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

Memory organization and data layout for instruction set extensions with architecturally visible storage.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

MPSoC Design Using Application-Specific Architecturally Visible Communication.
Proceedings of the High Performance Embedded Architectures and Compilers, 2009

A flexible DSP block to enhance FPGA arithmetic performance.
Proceedings of the 2009 International Conference on Field-Programmable Technology, 2009

Exploiting fast carry-chains of FPGAs for designing compressor trees.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

Using 3D integration technology to realize multi-context FPGAs.
Proceedings of the 19th International Conference on Field Programmable Logic and Applications, 2009

3D configuration caching for 2D FPGAs.
Proceedings of the ACM/SIGDA 17th International Symposium on Field Programmable Gate Arrays, 2009

FPGA Implementation of a Single-Precision Floating-Point Multiply-Accumulator with Single-Cycle Accumulation.
Proceedings of the FCCM 2009, 2009

From gates to multi-processors learning systems hands-on with FPGA4U in a computer science programme.
Proceedings of the 2009 Workshop on Embedded Systems Education, 2009

Way Stealing: cache-assisted automatic instruction set extensions.
Proceedings of the 46th Design Automation Conference, 2009

A Design Flow and Evaluation Framework for DPA-Resistant Instruction Set Extensions.
Proceedings of the Cryptographic Hardware and Embedded Systems, 2009

Hybrid LZA: a near optimal implementation of the leading zero anticipator.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

Challenges in Automatic Optimization of Arithmetic Circuits.
Proceedings of the 19th IEEE Symposium on Computer Arithmetic, 2009

2008
Guest Editorial Special Section on Application Specific Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2008

Data-Flow Transformations to Maximize the Use of Carry-Save Representation in Arithmetic Circuits.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2008

Error Protected Data Bus Inversion Using Standard DRAM Components.
Proceedings of the 9th International Symposium on Quality of Electronic Design (ISQED 2008), 2008

A novel FPGA logic block for improved arithmetic performance.
Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

Architectural improvements for field programmable counter arrays: enabling efficient synthesis of fast compressor trees on FPGAs.
Proceedings of the ACM/SIGDA 16th International Symposium on Field Programmable Gate Arrays, 2008

Can Knowledge Regarding the Presence of Countermeasures Against Fault Attacks Simplify Power Attacks on Cryptographic Devices?
Proceedings of the 23rd IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2008), 2008

Variable Latency Speculative Addition: A New Paradigm for Arithmetic Circuit Design.
Proceedings of the Design, Automation and Test in Europe, 2008

Improving Synthesis of Compressor Trees on FPGAs via Integer Linear Programming.
Proceedings of the Design, Automation and Test in Europe, 2008

Speculative DMA for architecturally visible storage in instruction set extensions.
Proceedings of the 6th International Conference on Hardware/Software Codesign and System Synthesis, 2008

Design space exploration for field programmable compressor trees.
Proceedings of the 2008 International Conference on Compilers, 2008

Fast, quasi-optimal, and pipelined instruction-set extensions.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

Efficient synthesis of compressor trees on FPGAs.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

2007
Introduction of Architecturally Visible Storage in Instruction Set Extensions.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2007

A Simulation-Based Methodology for Evaluating the DPA-Resistance of Cryptographic Functional Units with Application to CMOS and MCML Technologies.
Proceedings of the 2007 International Conference on Embedded Computer Systems: Architectures, 2007

Optimizing Checking-Logic for Reliability-Agnostic Control of Self-Calibrating Designs.
Proceedings of the 8th International Symposium on Quality of Electronic Design (ISQED 2007), 2007

Optimal polynomial-time interprocedural register allocation for high-level synthesis and ASIP design.
Proceedings of the 2007 International Conference on Computer-Aided Design, 2007

Power Attacks Resistance of Cryptographic S-Boxes with Added Error Detection Circuits.
Proceedings of the 22nd IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2007), 2007

Automatic synthesis of compressor trees: reevaluating large counters.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

Progressive Decomposition: A Heuristic to Structure Arithmetic Circuits.
Proceedings of the 44th Design Automation Conference, 2007

Enhancing FPGA Performance for Arithmetic Circuits.
Proceedings of the 44th Design Automation Conference, 2007

Rethinking custom ISE identification: a new processor-agnostic method.
Proceedings of the 2007 International Conference on Compilers, 2007

An optimistic and conservative register assignment heuristic for chordal graphs.
Proceedings of the 2007 International Conference on Compilers, 2007

Improving XOR-Dominated Circuits by Exploiting Dependencies between Operands.
Proceedings of the 12th Conference on Asia South Pacific Design Automation, 2007

2006
Virtual memory window for application-specific reconfigurable coprocessors.
IEEE Trans. Very Large Scale Integr. Syst., 2006

ISEGEN: an iterative improvement-based ISE generation technique for fast customization of processors.
IEEE Trans. Very Large Scale Integr. Syst., 2006

Exact and approximate algorithms for the extension of embedded processor instruction sets.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2006

Performance and Energy Benefits of Instruction Set Extensions in an FPGA Soft Core.
Proceedings of the 19th International Conference on VLSI Design (VLSI Design 2006), 2006

A Predictable Communication Scheme for Embedded Multiprocessor Systems.
Proceedings of the IFIP VLSI-SoC 2006, 2006

Designing Robust Checkers in the Presence of Massive Timing Errors.
Proceedings of the 12th IEEE International On-Line Testing Symposium (IOLTS 2006), 2006

Multithreaded virtual-memory-enabled reconfigurable hardware accelerators.
Proceedings of the 2006 IEEE International Conference on Field Programmable Technology, 2006

Combining algorithm exploration with instruction set design: a case study in elliptic curve cryptography.
Proceedings of the Conference on Design, Automation and Test in Europe, 2006

Automatic identification of application-specific functional units with architecturally visible storage.
Proceedings of the Conference on Design, Automation and Test in Europe, 2006

Towards the automatic exploration of arithmetic-circuit architectures.
Proceedings of the 43rd Design Automation Conference, 2006

2005
A robust self-calibrating transmission scheme for on-chip networks.
IEEE Trans. Very Large Scale Integr. Syst., 2005

Seamless Hardware-Software Integration in Reconfigurable Computing Systems.
IEEE Des. Test Comput., 2005

Self-calibrating networks-on-chip.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

Quantitative modelling and comparison of communication schemes to guarantee quality-of-service in networks-on-chip.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

Characterizing and Exploiting Task-Load Variability and Correlation for Energy Management in multi-core systems.
Proceedings of the 2005 3rd Workshop on Embedded Systems for Real-Time Multimedia, 2005

ISEGEN: Generation of High-Quality Instruction Set Extensions by Iterative Improvement.
Proceedings of the 2005 Design, 2005

Enabling unrestricted automated synthesis of portable hardware accelerators for virtual machines.
Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2005

Exploiting pipelining to relax register-file port constraints of instruction-set extensions.
Proceedings of the 2005 International Conference on Compilers, 2005

A Unified Coding Framework for Delay-Insensitivity.
Proceedings of the 11th International Symposium on Advanced Research in Asynchronous Circuits and Systems (ASYNC 2005), 2005

2004
On-Chip Self-Calibrating Communication Techniques Robust to Electrical Parameter Variations.
IEEE Des. Test Comput., 2004

Automatically Customising VLIW Architectures with Coarse Grained Application-Specific Functional Units.
Proceedings of the Software and Compilers for Embedded Systems, 8th International Workshop, 2004

Providing QoS to connection-less packet-switched NoC by implementing DiffServ functionalities.
Proceedings of the 2004 International Symposium on System-on-Chip, 2004

Soft self-synchronising codes for self-calibrating communication.
Proceedings of the 2004 International Conference on Computer-Aided Design, 2004

Improved use of the carry-save representation for the synthesis of complex arithmetic circuits.
Proceedings of the 2004 International Conference on Computer-Aided Design, 2004

Dynamic Prefetching in the Virtual Memory Window of Portable Reconfigurable Coprocessors.
Proceedings of the Field Programmable Logic and Application, 2004

Virtual Memory Window for a Portable Reconfigurable Cryptography Coprocessor.
Proceedings of the 12th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2004), 2004

Arithmetic Transformations to Maximise the Use of Compressor Trees.
Proceedings of the 2nd IEEE International Workshop on Electronic Design, 2004

Operating System Support for Interface Virtualisation of Reconfigurable Coprocessors.
Proceedings of the 2004 Design, 2004

Introduction of local memory elements in instruction set extensions.
Proceedings of the 41th Design Automation Conference, 2004

Programming Transparency and Portable Hardware Interfacing: Towards General-Purpose Reconfigurable Computing.
Proceedings of the 15th IEEE International Conference on Application-Specific Systems, 2004

Dynamic Reallocation of Functional Units in Superscalar Processors.
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004

Adding Limited Reconfigurability to Superscalar Processors.
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT 2004), 29 September, 2004

2003
Automatic Application-Specific Instruction-Set Extensions Under Microarchitectural Constraints.
Int. J. Parallel Program., 2003

Automatic Instruction Set Extension and Utilization for Embedded Processors.
Proceedings of the 14th IEEE International Conference on Application-Specific Systems, 2003

2002
An Adaptive Low-Power Transmission Scheme for On-Chip Networks.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

A Trimaran Based Framework for Exploring the Design Space of VLIW ASIPs with Coarse Grain Functional Units.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

Automatic Topology-Based Identification of Instruction-Set Extensions for Embedded Processors.
Proceedings of the 2002 Design, 2002

1998
Practical Experiences with Standard-Cell Based Datapath Design Tools: Do We Really Need Regular Layouts?
Proceedings of the 35th Conference on Design Automation, 1998

1997
Modified self-organizing feature map algorithms for efficient digital hardware implementation.
IEEE Trans. Neural Networks, 1997

Digital Connectionist Hardware: Current Problems and Future Challenges.
Proceedings of the Biological and Artificial Computation: From Neuroscience to Technology, 1997

1996
Special-purpose digital hardware for neural networks: An architectural survey.
J. VLSI Signal Process., 1996

Design, Implementation, and Test of a Multi-Model Systolic Neural-Network Accelerator.
Sci. Program., 1996

On modifications of Kohonen's feature map algorithm for an efficient parallel implementation.
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

1995
GENES IV: A bit-serial processing element for a multi-model neural-network accelerator.
J. VLSI Signal Process., 1995

Horizontal Microcode Compaction for Programmable Systolic Accelerators.
Proceedings of the International Conference on Application Specific Array Processors (ASAP'95), 1995

1994
Bit-Serial Multipliers and Squarers.
IEEE Trans. Computers, 1994

1993
Mobile Robot Miniaturisation: A Tool for Investigation in Control Algorithms.
Proceedings of the Experimental Robotics III, 1993

GENES IV: A bit-serial processing element for a multi-model neural-network accelerator.
Proceedings of the International Conference on Application-Specific Array Processors, 1993

