Kiamal Z. Pekmestzi

According to our database, Kiamal Z. Pekmestzi authored at least 99 papers between 1996 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Approximate Computing Survey, Part II: Application-Specific & Architectural Approximation Techniques and Applications.
CoRR, 2023

Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques.
CoRR, 2023

2022
Systematic Embedded Development and Implementation Techniques on Intel Myriad VPUs.
Proceedings of the 30th IFIP/IEEE International Conference on Very Large Scale Integration, 2022

MAx-DNN: Multi-Level Arithmetic Approximation for Energy-Efficient DNN Hardware Accelerators.
Proceedings of the 13th IEEE Latin America Symposium on Circuits and System, 2022

2021
Improving Power of DSP and CNN Hardware Accelerators Using Approximate Floating-point Multipliers.
ACM Trans. Embed. Comput. Syst., 2021

Exploiting the Potential of Approximate Arithmetic in DSP & AI Hardware Accelerators.
Proceedings of the 31st International Conference on Field-Programmable Logic and Applications, 2021

2020
On the Diminished-1 Modulo 2^n+1 Addition and Subtraction.
J. Circuits Syst. Comput., 2020

Efficient design of magnitude and 2's complement comparators.
Integr., 2020

Combining Arithmetic Approximation Techniques for Improved CNN Circuit Design.
Proceedings of the 27th IEEE International Conference on Electronics, Circuits and Systems, 2020

2019
Multi-Level Approximate Accelerator Synthesis Under Voltage Island Constraints.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

Energy-efficient VLSI implementation of multipliers with double LSB operands.
IET Circuits Devices Syst., 2019

TF2FPGA: A Framework for Projecting and Accelerating Tensorflow CNNs on FPGA Platforms.
Proceedings of the 8th International Conference on Modern Circuits and Systems Technologies, 2019

Cooperative Arithmetic-Aware Approximation Techniques for Energy-Efficient Multipliers.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

2018
VOSsim: A Framework for Enabling Fast Voltage Overscaling Simulation for Approximate Computing Circuits.
IEEE Trans. Very Large Scale Integr. Syst., 2018

Approximate Hybrid High Radix Encoding for Energy-Efficient Inexact Multipliers.
IEEE Trans. Very Large Scale Integr. Syst., 2018

Walking through the Energy-Error Pareto Frontier of Approximate Multipliers.
IEEE Micro, 2018

Efficient support vector machines implementation on Intel/Movidius Myriad 2.
Proceedings of the 7th International Conference on Modern Circuits and Systems Technologies, 2018

2017
DIRT latch: A novel low cost double node upset tolerant latch.
Microelectron. Reliab., 2017

On the design of the FFT Butterfly Units.
Proceedings of the 6th International Conference on Modern Circuits and Systems Technologies, 2017

2016
Design-Efficient Approximate Multiplication Circuits Through Partial Product Perforation.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Flexible DSP Accelerator Architecture Exploiting Carry-Save Arithmetic.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Pre-Encoded Multipliers Based on Non-Redundant Radix-4 Signed-Digit Encoding.
IEEE Trans. Computers, 2016

Low latency radiation tolerant self-repair reconfigurable SRAM architecture.
Microelectron. Reliab., 2016

Fused modulo 2^n + 1 add-multiply unit for weighted operands.
Proceedings of the 2016 International Conference on Design and Technology of Integrated Systems in Nanoscale Era, 2016

Design of Efficient 1's Complement Modified Booth Multiplier.
Proceedings of the 2016 Euromicro Conference on Digital System Design, 2016

2015
Delta DICE: A Double Node Upset resilient latch.
Proceedings of the IEEE 58th International Midwest Symposium on Circuits and Systems, 2015

DONUT: A Double Node Upset Tolerant Latch.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

Modulo 2^n ± 1 Fused Add-Multiply Units.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

Hybrid approximate multiplier architectures for improved power-accuracy trade-offs.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Low leakage radiation tolerant CAM/TCAM cell.
Proceedings of the 21st IEEE International On-Line Testing Symposium, 2015

Approximate Multiplier Architectures Through Partial Product Perforation: Power-Area Tradeoffs Analysis.
Proceedings of the 25th edition on Great Lakes Symposium on VLSI, GLVLSI 2015, Pittsburgh, PA, USA, May 20, 2015

2014
An Optimized Modified Booth Recoder for Efficient Design of the Add-Multiply Operator.
IEEE Trans. Circuits Syst. I Regul. Pap., 2014

Efficient modulo 2^n+1 multiply and multiply-add units based on modified Booth encoding.
Integr., 2014

FF-DICE: An 8T soft-error tolerant cell using Independent Dual Gate SOI FinFETs.
Proceedings of the 2014 IEEE 20th International On-Line Testing Symposium, 2014

A high radix Montgomery multiplier with concurrent error detection.
Proceedings of the 9th International Design and Test Symposium, 2014

Modulo 2^n+1 addition and multiplication for redundant operands.
Proceedings of the 9th International Design and Test Symposium, 2014

An independent dual gate SOI FinFET soft-error resilient memory cell.
Proceedings of the 9th International Design and Test Symposium, 2014

High performance MAC designs.
Proceedings of the 9th International Design and Test Symposium, 2014

Fused modulo 2^n - 1 add-multiply unit.
Proceedings of the 21st IEEE International Conference on Electronics, Circuits and Systems, 2014

On the design of efficient modulo 2^n+1 multiply-add-add units.
Proceedings of the 9th International Conference on Design & Technology of Integrated Systems in Nanoscale Era, 2014

A segmentation-based BISR scheme.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013
A column parity based fault detection mechanism for FIFO buffers.
Integr., 2013

On the design of modulo 2^n + 1 dot product and generalized multiply-add units.
Comput. Electr. Eng., 2013

On the design of modulo 2^n±1 residue generators.
Proceedings of the 21st IEEE/IFIP International Conference on VLSI and System-on-Chip, 2013

A radiation tolerant and self-repair memory cell.
Proceedings of the 2013 IEEE 19th International On-Line Testing Symposium (IOLTS), 2013

Efficient modulo 2^n+1 multiplication for the IDEA block cipher.
Proceedings of the Great Lakes Symposium on VLSI 2013 (part of ECRC), 2013

2012
Efficient Memory Repair Using Cache-Based Redundancy.
IEEE Trans. Very Large Scale Integr. Syst., 2012

Compiler-in-the-loop exploration during datapath synthesis for higher quality delay-area trade-offs.
ACM Trans. Design Autom. Electr. Syst., 2012

Cost Effective Protection Techniques for TCAM Memory Arrays.
IEEE Trans. Computers, 2012

On the Design of Configurable Modulo 2^n±1 Residue Generators.
Proceedings of the 15th Euromicro Conference on Digital System Design, 2012

2011
High Performance and Area Efficient Flexible DSP Datapath Synthesis.
IEEE Trans. Very Large Scale Integr. Syst., 2011

On the Design of Modulo 2^n+1 Multipliers.
Proceedings of the 14th Euromicro Conference on Digital System Design, 2011

2010
Custom multi-threaded Dynamic Memory Management for Multiprocessor System-on-Chip platforms.
Proceedings of the 2010 International Conference on Embedded Computer Systems: Architectures, 2010

A hardware peripheral for Java bytecodes translation acceleration.
Proceedings of the 2010 ACM Symposium on Applied Computing (SAC), 2010

A fast multiplier-less edge detection accelerator for FPGAs.
Proceedings of the 2010 ACM Symposium on Applied Computing (SAC), 2010

A Temperature-Aware Time-Dependent Dielectric Breakdown Analysis Framework.
Proceedings of the Integrated Circuit and System Design. Power and Timing Modeling, Optimization, and Simulation, 2010

Efficient High Level Synthesis Exploration Methodology Combining Exhaustive and Gradient-Based Pruned Searching.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2010

A High Level Synthesis Exploration Framework with Iterative Design Space Partitioning.
Proceedings of the VLSI 2010 Annual Symposium - Selected papers, 2010

High-Level Synthesis Methodologies for Delay-Area Optimized Coarse-Grained Reconfigurable Coprocessor Architectures.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2010

A New Low-Power Soft-Error Tolerant SRAM Cell.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2010

Designing efficient DSP datapaths through compiler-in-the-loop exploration methodology.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

A bit level area aware cache-based architecture for memory repairs.
Proceedings of the 16th IEEE International On-Line Testing Symposium (IOLTS 2010), 2010

2009
Extending an embedded RISC microprocessor for efficient translation based Java execution.
Microprocess. Microsystems, 2009

Designing coarse-grain reconfigurable architectures by inlining flexibility into custom arithmetic data-paths.
Integr., 2009

A design methodology for high-performance and low-leakage fixed-point transpose FIR filters.
Proceedings of the 16th IEEE International Conference on Electronics, 2009

Flexible Datapath Synthesis through Arithmetically Optimized Operation Chaining.
Proceedings of the NASA/ESA Conference on Adaptive Hardware and Systems, 2009

2008
A predecoding technique for ILP exploitation in Java processors.
J. Syst. Archit., 2008

An instruction set extension for Java bytecodes translation acceleration.
Proceedings of the 2008 International Conference on Embedded Computer Systems: Architectures, 2008

A BISR Architecture for Embedded Memories.
Proceedings of the 14th IEEE International On-Line Testing Symposium (IOLTS 2008), 2008

A high-speed radix-4 multiplexer-based array multiplier.
Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008

A flexible architecture for DSP applications combining high performance arithmetic with small scale configurability.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

Efficient serial and parallel implementation of programmable FIR filters based on the merging technique.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

Mapping DSP Applications onto High-Performance Architectural Templates with Inlined Flexibility.
Proceedings of the NASA/ESA Conference on Adaptive Hardware and Systems, 2008

2007
Flexibility Inlining into Arithmetic Data-paths Exploiting A Regular Interconnection Scheme.
Proceedings of the 2007 International Conference on Embedded Computer Systems: Architectures, 2007

An Elliptic Curve Cryptosystem Design Based on FPGA Pipeline Folding.
Proceedings of the 13th IEEE International On-Line Testing Symposium (IOLTS 2007), 2007

Power-Efficient and Low Latency Implementation of Programmable FIR filters Using Carry-Save Arithmetic.
Proceedings of the 14th IEEE International Conference on Electronics, 2007

A regular interconnection scheme for efficient mapping of DSP kernels into reconfigurable hardware.
Proceedings of the 15th European Signal Processing Conference, 2007

Building embedded DSP applications in a Java modeling framework.
Proceedings of the 15th European Signal Processing Conference, 2007

A Reconfigurable Arithmetic Data-path Based On Regular Interconnection.
Proceedings of the Second NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2007), 2007

2006
A cache based stack folding technique for high performance Java processors.
Proceedings of the 4th international workshop on Java technologies for real-time and embedded systems, 2006

Segmentation based design of serial parallel multipliers.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

FPGA-based Design of a Large Moduli Multiplier for Public Key Cryptographic Systems.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

A Large Scale Adaptable Multiplier for Cryptographic Applications.
Proceedings of the First NASA/ESA Conference on Adaptive Hardware and Systems (AHS 2006), 2006

2005
A New Low Latency Parallel FIR Filter Scheme.
J. VLSI Signal Process., 2005

Pipelined array-based FIR filter folding.
IEEE Trans. Circuits Syst. I Regul. Pap., 2005

Novel systolic schemes for serial-parallel multiplication.
Proceedings of the 13th European Signal Processing Conference, 2005

100% operational efficient bit-serial programmable FIR digital filters.
Proceedings of the 13th European Signal Processing Conference, 2005

Long Number Bit-Serial Squarers.
Proceedings of the 17th IEEE Symposium on Computer Arithmetic (ARITH-17 2005), 2005

2004
Low-latency and high-efficiency bit serial-serial multipliers.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

Pipeline array implementation of FIR filters.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

2002
A Systolic, High Speed Architecture for an RSA Cryptosystem.
J. VLSI Signal Process., 2002

2001
A bit-interleaved systolic architecture for a high-speed RSA system.
Integr., 2001

A Novel Systolic Architecture for Efficient RSA Implementation.
Proceedings of the Public Key Cryptography, 2001

On the Hardware Implementation of the 3GPP Confidentiality and Integrity Algorithms.
Proceedings of the Information Security, 4th International Conference, 2001

2000
Constant Number Serial Pipeline Multipliers.
J. VLSI Signal Process., 2000

A systolic serial squarer of continuous operation.
Proceedings of the 10th European Signal Processing Conference, 2000

1998
A scheme for the VLSI implementation of FIR digital filters with reduced latency.
Proceedings of the 9th European Signal Processing Conference, 1998

1997
Hardware compilation using attribute grammars.
Proceedings of the Advances in Hardware Design and Verification, 1997

1996
Systolic digital filters with reduced latency - serial implementation.
Int. J. Circuit Theory Appl., 1996

