Davide Rossetti

According to our database, Davide Rossetti
  • authored at least 28 papers between 1997 and 2017.
  • has a "Dijkstra number" of four.





MPI-GDS: High Performance MPI Designs with GPUDirect-aSync for CPU-GPU Control Flow Decoupling.
Proceedings of the 46th International Conference on Parallel Processing, 2017

GPU-Centric Communication on NVIDIA GPU Clusters with InfiniBand: A Case Study with OpenSHMEM.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Offloading communication control logic in GPU accelerated applications.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

Dynamic many-process applications on many-tile embedded systems and HPC clusters: The EURETILE programming environment and execution platforms.
Journal of Systems Architecture - Embedded Systems Design, 2016

ASIP acceleration for virtual-to-physical address translation on RDMA-enabled FPGA-based network interfaces.
Future Generation Comp. Syst., 2015

A hierarchical watchdog mechanism for systemic fault awareness on distributed systems.
Future Generation Comp. Syst., 2015

Exploring OpenSHMEM Model to Program GPU-based Extreme-Scale Systems.
Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, 2015

NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features.
CoRR, 2014

LO-FA-MO: Fault Detection and Systemic Awareness for the QUonG Computing System.
Proceedings of the 33rd IEEE International Symposium on Reliable Distributed Systems, 2014

Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters.
Proceedings of the 21st International Conference on High Performance Computing, 2014

Benchmarking of communication techniques for GPUs.
J. Parallel Distrib. Comput., 2013

NaNet: a flexible and configurable low-latency NIC for real-time trigger systems based on GPUs.
CoRR, 2013

A heterogeneous many-core platform for experiments on scalable custom interconnects and management of fault and critical events, applied to many-process applications: Vol. II, 2012 technical report.
CoRR, 2013

Architectural improvements and 28 nm FPGA implementation of the APEnet+ 3D Torus network for hybrid HPC systems.
CoRR, 2013

'Mutual Watch-dog Networking': Distributed Awareness of Faults and Critical Events in Petascale/Exascale systems.
CoRR, 2013

GPU peer-to-peer techniques applied to a cluster interconnect.
CoRR, 2013

Design and implementation of a modular, low latency, fault-aware, FPGA-based network interface.
Proceedings of the 2012 International Conference on Reconfigurable Computing and FPGAs, 2013

GPU Peer-to-Peer Techniques Applied to a Cluster Interconnect.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Virtual-to-Physical address translation for an FPGA-based interconnect with host and GPU remote DMA capabilities.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

The Distributed Network Processor: a novel off-chip and on-chip interconnection network architecture.
CoRR, 2012

Breadth First Search on APEnet+.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters.
CoRR, 2011

APEnet+: a 3D toroidal network enabling Petaflops scale Lattice QCD simulations on commodity clusters.
CoRR, 2010

Synthesis of Communication Mechanisms for Multi-tile Systems Based on Heterogeneous Multi-processor System-On-Chips.
Proceedings of the Twentieth IEEE/IFIP International Symposium on Rapid System Prototyping, 2009

Computing for LQCD: apeNEXT.
Computing in Science and Engineering, 2006

APENet: a high speed, low latency 3D interconnect network.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004