Michela Becchi

According to our database, Michela Becchi
  • authored at least 58 papers between 2006 and 2017.
  • has a "Dijkstra number" of four.

Bibliography

2017
A Principled Approach to Secure Multi-core Processor Design with ReWire.
ACM Trans. Embedded Comput. Syst., 2017

Fast Integral Histogram Computations on GPU for Real-Time Video Analytics.
CoRR, 2017

Understanding the performance-accuracy tradeoffs of floating-point arithmetic on GPUs.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Demystifying automata processing: GPUs, FPGAs or Micron's AP?
Proceedings of the International Conference on Supercomputing, 2017

A Memory-Efficient GPU Method for Hamming and Levenshtein Distance Similarity.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

2016
Picking Pesky Parameters: Optimizing Regular Expression Matching in Practice.
IEEE Trans. Parallel Distrib. Syst., 2016

Compiler-Assisted Workload Consolidation For Efficient Dynamic Parallelism on GPU.
CoRR, 2016

A programming model for reconfigurable computing based in functional concurrency.
Proceedings of the 11th International Symposium on Reconfigurable Communication-centric Systems-on-Chip, 2016

Compiler-Assisted Workload Consolidation for Efficient Dynamic Parallelism on GPU.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

High Performance Pattern Matching Using the Automata Processor.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Parallel Gene Upstream Comparison via Multi-Level Hash Tables on GPU.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

IVM: a task-based shared memory programming model and runtime system to enable uniform access to CPU-GPU clusters.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Evaluating the Energy Efficiency of Deep Convolutional Neural Networks on CPUs and GPUs.
Proceedings of the 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), 2016

O3FA: A Scalable Finite Automata-based Pattern-Matching Engine for Out-of-Order Deep Packet Inspection.
Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, 2016

2015
Fast support for unstructured data processing: the unified automata processor.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Semantics Driven Hardware Design, Implementation, and Verification with ReWire.
Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, 2015

Accelerating regular expression matching over compressed HTTP.
Proceedings of the 2015 IEEE Conference on Computer Communications, 2015

Nested Parallelism on GPU: Exploring Parallelization Templates for Irregular Loops and Recursive Computations.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Improving Application Concurrency on GPUs by Managing Implicit and Explicit Synchronizations.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Exploiting Dynamic Parallelism to Efficiently Support Irregular Nested Loops on GPUs.
Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

Hardware Synthesis from Functional Embedded Domain-Specific Languages: A Case Study in Regular Expression Compilation.
Proceedings of the Applied Reconfigurable Computing - 11th International Symposium, 2015

2014
Large-Scale Pairwise Alignments on GPU Clusters: Exploring the Implementation Space.
Signal Processing Systems, 2014

Revisiting State Blow-Up: Automatically Building Augmented-FA While Preserving Functional Equivalence.
IEEE Journal on Selected Areas in Communications, 2014

GRapid: A compilation and runtime framework for rapid prototyping of graph applications on many-core processors.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

A flexible scheduling framework for heterogeneous CPU-GPU clusters.
Proceedings of the 21st International Conference on High Performance Computing, 2014

Design of a hybrid MPI-CUDA benchmark suite for CPU-GPU clusters.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
A-DFA: A Time- and Space-Efficient DFA Compression Algorithm for Fast Regular Expression Evaluation.
TACO, 2013

Scheduling concurrent applications on a cluster of CPU-GPU nodes.
Future Generation Comp. Syst., 2013

Exploring different automata representations for efficient regular expression matching on GPUs.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Deploying Graph Algorithms on GPUs: An Adaptive Solution.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

A preemption-based runtime to efficiently schedule multi-process applications on heterogeneous clusters with GPUs.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Semantics-directed machine architecture in ReWire.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

GPU acceleration of regular expression matching for large datasets: exploring the implementation space.
Proceedings of the Computing Frontiers Conference, 2013

A distributed CPU-GPU framework for pairwise alignments on large-scale sequence datasets.
Proceedings of the 24th International Conference on Application-Specific Systems, 2013

Picking pesky parameters: Optimizing regular expression matching in practice.
Proceedings of the Symposium on Architecture for Networking and Communications Systems, 2013

2012
A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification.
TACO, 2012

Formal Semantics of Heterogeneous CUDA-C: A Modular Approach with Applications.
Proceedings of the Seventh Conference on Systems Software Verification, 2012

ValuePack: value-based scheduling framework for CPU-GPU clusters.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Poster: Multiple Pairwise Sequence Alignments with the Needleman-Wunsch Algorithm on GPU.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Multiple Pairwise Sequence Alignments with the Needleman-Wunsch Algorithm on GPU.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

A virtual memory based runtime to support multi-tenancy in clusters with GPUs.
Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, 2012

Scheduling Concurrent Applications on a Cluster of CPU-GPU Nodes.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

Efficient GPU Implementation of the Integral Histogram.
Proceedings of the Computer Vision - ACCV 2012 Workshops, 2012

2011
Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

2010
Data-aware scheduling of legacy kernels on heterogeneous platforms with distributed memory.
SPAA 2010: Proceedings of the 22nd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2010

A programmable parallel accelerator for learning and classification.
Proceedings of the 19th International Conference on Parallel Architecture and Compilation Techniques, 2010

2009
Evaluating regular expression matching engines on network and general purpose processors.
Proceedings of the 2009 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2009

2008
Dynamic Thread Assignment on Heterogeneous Multiprocessor Architectures.
J. Instruction-Level Parallelism, 2008

A workload for evaluating deep packet inspection architectures.
Proceedings of the 4th International Symposium on Workload Characterization (IISWC 2008), 2008

Extending finite automata to efficiently match Perl-compatible regular expressions.
Proceedings of the 2008 ACM Conference on Emerging Network Experiment and Technology, 2008

A remotely accessible network processor-based router for network experimentation.
Proceedings of the 2008 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2008

Efficient regular expression evaluation: theory to practice.
Proceedings of the 2008 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2008

2007
Memory-Efficient Regular Expression Search Using State Merging.
Proceedings of INFOCOM 2007, the 26th IEEE International Conference on Computer Communications, 2007

A hybrid finite automaton for practical deep packet inspection.
Proceedings of the 2007 ACM Conference on Emerging Network Experiment and Technology, 2007

Performance/area efficiency in chip multiprocessors with micro-caches.
Proceedings of the 4th Conference on Computing Frontiers, 2007

An improved algorithm to accelerate regular expression evaluation.
Proceedings of the 2007 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2007

2006
Dynamic thread assignment on heterogeneous multiprocessor architectures.
Proceedings of the Third Conference on Computing Frontiers, 2006

CAMP: fast and efficient IP lookup architecture.
Proceedings of the 2006 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2006

