Avi Mendelson

According to our database, Avi Mendelson authored at least 88 papers between 1991 and 2020.



A Metric-Guided Method for Discovering Impactful Features and Architectural Insights for Skylake-Based Processors.
TACO, 2020

Memory-Side Protection With a Capability Enforcement Co-Processor.
TACO, 2019

Energy oriented EDF for real-time systems.
IJES, 2019

Smoothed Inference for Adversarially-Trained Models.
CoRR, 2019

Loss Aware Post-training Quantization.
CoRR, 2019

CAT: Compression-Aware Training for bandwidth reduction.
CoRR, 2019

Feature Map Transform Coding for Energy-Efficient CNN Inference.
CoRR, 2019

Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks.
CoRR, 2019

Security and Privacy in the Age of Big Data and Machine Learning.
IEEE Computer, 2019

Rack-Scale Capabilities: Fine-Grained Protection for Large-Scale Memories.
IEEE Computer, 2019

Tuning Performance via Metrics with Expectations.
Computer Architecture Letters, 2019

A Comprehensive Evaluation of Power Delivery Schemes for Modern Microprocessors.
Proceedings of the 20th International Symposium on Quality Electronic Design, 2019

SoK: An Overview of Algorithmic Methods in IC Reverse Engineering.
Proceedings of the 3rd ACM Workshop on Attacks and Solutions in Hardware Security, 2019

Recruiting Fault Tolerance Techniques for Microprocessor Security.
Proceedings of the 28th IEEE Asian Test Symposium, 2019

MIA: Metric Importance Analysis for Big Data Workload Characterization.
IEEE Trans. Parallel Distrib. Syst., 2018

Minimum-Weight Link-Disjoint Node-"Somewhat Disjoint" Paths.
IEEE/ACM Trans. Netw., 2018

Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware.
CoRR, 2018

NICE: Noise Injection and Clamping Estimation for Neural Network Quantization.
CoRR, 2018

UNIQ: Uniform Noise Injection for the Quantization of Neural Networks.
CoRR, 2018

Rebooting Computers to Avoid Meltdown and Spectre.
IEEE Computer, 2018

Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Using Scan Side Channel to Detect IP Theft.
IEEE Trans. VLSI Syst., 2017

SPACE: Semi-Partitioned CachE for Energy Efficient, Hard Real-Time Systems.
IEEE Trans. Computers, 2017

Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform.
CoRR, 2017

Extending Amdahl's Law for Multicores with Turbo Boost.
Computer Architecture Letters, 2017

ScaleSimulator: A Fast and Cycle-Accurate Parallel Simulator for Architectural Exploration.
Proceedings of the 10th EAI International Conference on Simulation Tools and Techniques, 2017

Revealing On-chip Proprietary Security Functions with Scan Side Channel Based Reverse Engineering.
Proceedings of the Great Lakes Symposium on VLSI 2017, 2017

Fine-Grain Power Breakdown of Modern Out-of-Order Cores and Its Implications on Skylake-Based Systems.
TACO, 2016

Architectural Support for Fault Tolerance in a Teradevice Dataflow System.
International Journal of Parallel Programming, 2016

H-EARtH: Heterogeneous Multicore Platform Energy Management.
IEEE Computer, 2016

GPUpIO: the case for I/O-driven preemption on GPUs.
Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

Using Scan Side Channel for Detecting IP Theft.
Proceedings of the Hardware and Architectural Support for Security and Privacy 2016, 2016

Optimal link-disjoint node-"somewhat disjoint" paths.
Proceedings of the 24th IEEE International Conference on Network Protocols, 2016

Power and thermal constraints of modern system-on-a-chip computer.
Microelectron. J., 2015

Peripheral Memory: A Technique for Fighting Memory Bandwidth Bottleneck.
Computer Architecture Letters, 2015

Hardware Transactions in Nonvolatile Memory.
Proceedings of the Distributed Computing - 29th International Symposium, 2015

Establishing a Base of Trust with Performance Counters for Enterprise Workloads.
Proceedings of the 2015 USENIX Annual Technical Conference, 2015

The Impact of Hypervisor Scheduling on Compromising Virtualized Environments.
Proceedings of the 15th IEEE International Conference on Computer and Information Technology, 2015

TERAFLUX: Harnessing dataflow in next generation teradevices.
Microprocess. Microsystems, 2014

Energy Aware Race to Halt: A Down to EARtH Approach for Platform Energy Management.
Computer Architecture Letters, 2014

Energy management of highly dynamic server workloads in an heterogeneous data center.
Proceedings of the 24th International Workshop on Power and Timing Modeling, 2014

Deep-dive analysis of the data analytics workload in CloudSuite.
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

Batch Method for Efficient Resource Sharing in Real-Time Multi-GPU Systems.
Proceedings of the Distributed Computing and Networking - 15th International Conference, 2014

Scheduling periodic real-time communication in multi-GPU systems.
Proceedings of the 23rd International Conference on Computer Communication and Networks, 2014


Data-Parallel Computing Meets STRIPS.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

Exploring the limits of GPGPU scheduling in control flow bound applications.
TACO, 2012

Scheduling processing of real-time data streams on heterogeneous multi-GPU systems.
Proceedings of the 5th Annual International Systems and Storage Conference, 2012

Topic 4: High-Performance Architecture and Compilers.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

DiDi: Mitigating the Performance Impact of TLB Shootdowns Using a Shared TLB Directory.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

Using Underutilized CPU Resources to Enhance Its Reliability.
IEEE Trans. Dependable Sec. Comput., 2010

Threads vs. caches: Modeling the behavior of parallel workloads.
Proceedings of the 28th International Conference on Computer Design, 2010

Service level agreement for multithreaded processors.
TACO, 2009

Many-Core vs. Many-Thread Machines: Stay Away From the Valley.
Computer Architecture Letters, 2009

Programming model for a heterogeneous x86 platform.
Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009

Multiple clock and voltage domains for chip multi processors.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Dependable Embedded Systems Special Day Panel: Issues and Challenges in Dependable Embedded Systems.
Proceedings of the Design, Automation and Test in Europe, 2008

A Programming Model and Architectural Extensions for Fine-Grain Parallelism.
Handbook of Parallel Computing - Models, Algorithms and Applications, 2007

Trace cache sampling filter.
ACM Trans. Comput. Syst., 2007

Fairness enforcement in switch on event multithreading.
TACO, 2007

Using fine grain multithreading for energy efficient computing.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Current trends in computer architectures: multi-cores, many-cores and special-cores.
Proceedings of the 21st Annual International Conference on Supercomputing, 2007

Code Compilation for an Explicitly Parallel Register-Sharing Architecture.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Inthreads: a low granularity parallelization model.
SIGARCH Computer Architecture News, 2006

A PAB-Based Multi-Prefetcher Mechanism.
International Journal of Parallel Programming, 2006

Fairness and Throughput in Switch on Event Multithreading.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006

Memory management challenges in the power-aware computing era.
Proceedings of the 5th International Symposium on Memory Management, 2006

Speculative synchronization and thread management for fine granularity threads.
Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

Power Awareness through Selective Dynamically Optimized Traces.
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004

Micro-operation cache: a power aware frontend for variable instruction length ISA.
IEEE Trans. VLSI Syst., 2003

On Estimating Optimal Performance of CPU Dynamic Thermal Management.
Computer Architecture Letters, 2003

PARROT: Power Awareness Through Selective Dynamically Optimized Traces.
Proceedings of the Power-Aware Computer Systems, Third International Workshop, 2003

The effect of seance communication on multiprocessing systems.
ACM Trans. Comput. Syst., 2001

Design of a parallel interconnect based on communication pattern considerations.
Parallel Algorithms Appl., 2001

Filtering Techniques to Improve Trace-Cache Efficiency.
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT 2001), 2001

Designing High-Performance & Reliable Superscalar Architectures: The out of Order Reliable Superscalar (O3RS) Approach.
Proceedings of the 2000 International Conference on Dependable Systems and Networks (DSN 2000) (formerly FTCS-30 and DCCA-8), 2000

The "Smart" simulation environment - A tool-set to develop new cache coherency protocols.
Journal of Systems Architecture, 1999

Design Alternatives of Multithreaded Architecture.
International Journal of Parallel Programming, 1999

Using Value Prediction to Increase the Power of Speculative Execution Hardware.
ACM Trans. Comput. Syst., 1998

Improving achievable ILP through value prediction and program profiling.
Microprocess. Microsystems, 1998

The Effect of Instruction Fetch Bandwidth on Value Prediction.
Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998

Can Program Profiling Support Value Prediction?
Proceedings of the Thirtieth Annual IEEE/ACM International Symposium on Microarchitecture, 1997

Smart: An Advanced Shared-Memory Simulator - Towards a System-Level Simulation Environment.
Proceedings of the MASCOTS 1997, 1997

Cache based fault recovery for distributed systems.
Proceedings of the 3rd IEEE International Conference on Engineering of Complex Computer Systems (ICECCS '97), 1997

Performance and hardware complexity tradeoffs in designing multithreaded architectures.
Proceedings of the Fifth International Conference on Parallel Architectures and Compilation Techniques, 1996

A performance analysis of Pentium processor systems.
IEEE Micro, 1995

Toward a General-Purpose Multi-Stream System.
Proceedings of the Parallel Architectures and Compilation Techniques, 1994

BDG-torus union graph-an efficient algorithmically specialized parallel interconnect.
Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, 1991