Ehsan Atoofian

Orcid: 0000-0002-1662-5334

According to our database, Ehsan Atoofian authored at least 70 papers between 2003 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Design of a Superposition-Based Approximate QRAM for Noise-Tolerant Quantum Machine Learning.

[DOI]

Sohrab Sajadimanesh

,

Proceedings of the Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region, 2026

2025

Fused Tensor Core: A Hardware-Software Co-Design for Efficient Execution of Attentions on GPUs.

[DOI]

,

,

IEEE Embed. Syst. Lett., October, 2025

NR-QNN: Noise-Resilient Quantum Neural Network.

[DOI]

Sohrab Sajadimanesh

,

Hanieh Aghaee Rad

,

Jean Paul Latyr Faye

,

IEEE Access, 2025

Sparse Attention: A Co-Design Approach for Efficient Transformer Execution on Tensor Cores.

[DOI]

,

,

Proceedings of the 38th IEEE International System-on-Chip Conference, 2025

2024

Transient Fault Detection in Tensor Cores for Modern GPUs.

[DOI]

Mohammad Hafezan

,

ACM Trans. Embed. Comput. Syst., September, 2024

Improving Energy-Efficiency of Capsule Networks on Modern GPUs.

[DOI]

Mohammad Hafezan

,

IEEE Comput. Archit. Lett., 2024

Inexact Quantum Square Root Circuit for NISQ Devices.

[DOI]

Sohrab Sajadimanesh

,

Hanieh Aghaee Rad

,

Jean Paul Latyr Faye

,

IEEE Access, 2024

Hardened-TC: A Low-cost Reliability Solution for CNNs Run by Modern GPUs.

[DOI]

Proceedings of the 37th IEEE International System-on-Chip Conference, 2024

Low-Power Register File for Tensor Cores.

[DOI]

,

Proceedings of the 15th IEEE International Green and Sustainable Computing Conference, 2024

PCTC: Hardware and Software Co-design for Pruned Capsule Networks on Tensor Cores.

[DOI]

Mohammad Hafezan

,

,

Proceedings of the Euro-Par 2024: Parallel Processing, 2024

2023

EAM: Ensemble of approximate multipliers for robust DNNs.

[DOI]

Sohrab Sajadimanesh

,

Microprocess. Microsystems, April, 2023

PTTS: Power-aware tensor cores using two-sided sparsity.

[DOI]

J. Parallel Distributed Comput., March, 2023

2022

NISQ-Friendly Non-Linear Activation Functions for Quantum Neural Networks.

[DOI]

Sohrab Sajadimanesh

,

Jean Paul Latyr Faye

,

Proceedings of the IEEE International Conference on Networking, Architecture and Storage, 2022

Increasing Robustness against Adversarial Attacks through Ensemble of Approximate Multipliers.

[DOI]

Proceedings of the IEEE International Conference on Networking, Architecture and Storage, 2022

Practical approximate quantum multipliers for NISQ devices.

[DOI]

Sohrab Sajadimanesh

,

Jean Paul Latyr Faye

,

Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

2021

Adaptive Computation Reuse for Energy-Efficient Training of Deep Neural Networks.

[DOI]

,

ACM Trans. Embed. Comput. Syst., 2021

Reducing Energy in GPGPUs through Approximate Trivial Bypassing.

[DOI]

,

,

ACM Trans. Embed. Comput. Syst., 2021

Trivial Bypassing in GPGPUs.

[DOI]

IEEE Embed. Syst. Lett., 2021

Sparsity-aware Power Gating for Tensor Cores.

[DOI]

Proceedings of the 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, 2021

2020

Approximate Cache in GPGPUs.

[DOI]

ACM Trans. Embed. Comput. Syst., 2020

Energy Efficient On-Demand Dynamic Branch Prediction Models.

[DOI]

Milad Mohammadi

,

,

,

Amirali Baniasadi

,

,

William J. Dally

IEEE Trans. Computers, 2020

Approximate trivial instructions.

[DOI]

,

Proceedings of the 17th ACM International Conference on Computing Frontiers, 2020

2018

Data-type specific cache compression in GPGPUs.

[DOI]

,

J. Supercomput., 2018

TELEPORT: Hardware/software alternative to CUDA shared memory programming.

[DOI]

,

,

Amirali Baniasadi

Microprocess. Microsystems, 2018

Improving performance of transactional memory through machine learning.

[DOI]

,

Thireshan Jeyakumaran

,

,

Concurr. Comput. Pract. Exp., 2018

Loop Perforation in OpenACC.

[DOI]

,

,

Amirali Baniasadi

Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2018

Mitigating Critical Path Decompression Latency in Compressed L1 Data Caches Via Prefetching.

[DOI]

,

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

2017

Reducing Power of Memory Hierarchy in General Purpose Graphics Processing Units.

[DOI]

,

,

J. Low Power Electron., 2017

An efficient racetrack memory for L2 cache in GPGPUs.

,

Comput. Syst. Sci. Eng., 2017

2016

Many-Thread Aware Compression in GPGPUs.

[DOI]

Proceedings of the 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, 2016

Temperature-Aware Register Mapping in GPGPUs.

[DOI]

Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

A low power STT-RAM based register file for GPGPUs.

[DOI]

Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2016

Improving Performance of Transactional Applications through Adaptive Transactional Memory.

[DOI]

Thireshan Jeyakumaran

,

,

,

,

Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Compressed L1 data cache and L2 cache in GPGPUs.

[DOI]

Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

2015

TurboLock: increasing associativity of lock table in transactional memory.

[DOI]

Amir Ghanbari Bavarsad

,

Computing, 2015

Workshop Preview of the 2nd International Workshop on Software for Parallel Systems (SEPS 2015).

[DOI]

,

Siegfried Benkner

,

,

,

Proceedings of the Companion Proceedings of the 2015 ACM SIGPLAN International Conference on Systems, 2015

Shift-aware racetrack memory.

[DOI]

,

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Automatic Optimization of Software Transactional Memory Through Linear Regression and Decision Tree.

[DOI]

,

,

,

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Reducing shift penalty in Domain Wall Memory through register locality.

[DOI]

Proceedings of the 2015 International Conference on Compilers, 2015

2014

Boosting performance of transactional memory through O-GEHL predictors.

[DOI]

Microprocess. Microsystems, 2014

Acceleration of Software Transactional Memory through Hardware Clock.

[DOI]

Proceedings of the 2nd International Workshop on Many-core Embedded Systems, 2014

Reducing Static and Dynamic Power of L1 Data Caches in GPGPUs.

[DOI]

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Power-Aware L1 and L2 Caches for GPGPUs.

[DOI]

,

Proceedings of the Euro-Par 2014 Parallel Processing, 2014

Improving Power of Cache and Register File through Critical Path Instructions.

[DOI]

,

,

Proceedings of the 17th Euromicro Conference on Digital System Design, 2014

2013

Improving performance of software transactional memory through contention locality.

[DOI]

J. Supercomput., 2013

ARV-ALA: Improving performance of software transactional memory through adaptive read and write policies.

[DOI]

,

Amirali Baniasadi

,

Sci. Comput. Program., 2013

TxSnoop: Power-Aware Transactional Snoop.

[DOI]

Proceedings of the 12th IEEE International Conference on Trust, 2013

Consistency Check through O-GEHL Predictors.

[DOI]

Proceedings of the 21st Euromicro International Conference on Parallel, 2013

Read-Write Lock Allocation in Software Transactional Memory.

[DOI]

Amir Ghanbari Bavarsad

,

Proceedings of the 42nd International Conference on Parallel Processing, 2013

VGTS: Variable Granularity Transactional Snoop.

[DOI]

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012

AGC: adaptive global clock in software transactional memory.

[DOI]

,

Amir Ghanbari Bavarsad

Proceedings of the 2012 PPOPP International Workshop on Programming Models and Applications for Multicores and Manycores, 2012

ArTA: Adaptive Granularity in Transactional Applications.

[DOI]

Proceedings of the 20th Euromicro International Conference on Parallel, 2012

TRT: Transactional Read Tracking.

[DOI]

Amir Ghanbari Bavarsad

,

Proceedings of the 13th International Conference on Parallel and Distributed Computing, 2012

Maintaining Consistency in Software Transactional Memory through Dynamic Versioning Tuning.

[DOI]

,

Amir Ghanbari Bavarsad

Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

Speculative Versioning through Perceptron Predictors.

[DOI]

,

Amir Ghanbari Bavarsad

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

2011

Speculative Contention Avoidance in Software Transactional Memory.

[DOI]

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2008

Using supplier locality in power-aware interconnects and caches in chip multiprocessors.

[DOI]

,

Amirali Baniasadi

J. Syst. Archit., 2008

Exploiting program cyclic behavior to reduce memory latency in embedded processors.

[DOI]

,

Amirali Baniasadi

Proceedings of the 2008 ACM Symposium on Applied Computing (SAC), 2008

Adaptive Read Validation in Time-Based Software Transactional Memory.

[DOI]

,

Amirali Baniasadi

,

Proceedings of the Euro-Par 2008 Workshops, 2008

2007

Speculative trivialization point advancing in high-performance processors.

[DOI]

,

Amirali Baniasadi

J. Syst. Archit., 2007

Exploiting Speculation Cost Prediction in Power-Aware Applications.

[DOI]

,

Amirali Baniasadi

,

J. Low Power Electron., 2007

A Power-Aware Prediction-Based Cache Coherence Protocol for Chip Multiprocessors.

[DOI]

,

Amirali Baniasadi

Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Speculative supplier identification for reducing power of interconnects in snoopy cache coherence protocols.

[DOI]

,

Amirali Baniasadi

,

Proceedings of the 4th Conference on Computing Frontiers, 2007

Computational and storage power optimizations for the O-GEHL branch predictor.

[DOI]

,

Amirali Baniasadi

,

Proceedings of the 4th Conference on Computing Frontiers, 2007

2006

A Test Approach for Look-Up Table Based FPGAs.

[DOI]

,

Zainalabedin Navabi

J. Comput. Sci. Technol., 2006

2005

A low-power scan-path architecture.

[DOI]

Mohammad Alisafaee

,

,

,

Zainalabedin Navabi

,

Ali Afzali-Kusha

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

Improving Energy-Efficiency by Bypassing Trivial Computations.

[DOI]

,

Amirali Baniasadi

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Low-power prediction based data transfer architecture.

[DOI]

,

,

Amirali Baniasadi

,

Yehea I. Ismail

Proceedings of the IEEE 2005 Custom Integrated Circuits Conference, 2005

2003

A Low Power BIST Architecture for FPGA Look-Up Table Testing.

,

Zainalabedin Navabi

Proceedings of the IFIP VLSI-SoC 2003, 2003

A BIST Architecture for FPGA Look-Up Table Testing Reduces Reconfigurations.

[DOI]

,

Zainalabedin Navabi

Proceedings of the 12th Asian Test Symposium (ATS 2003), 17-19 November 2003, Xian, China, 2003
