According to our database, Davide Rossetti
MPI-GDS: High Performance MPI Designs with GPUDirect-aSync for CPU-GPU Control Flow Decoupling.
Proceedings of the 46th International Conference on Parallel Processing, 2017
GPU-Centric Communication on NVIDIA GPU Clusters with InfiniBand: A Case Study with OpenSHMEM.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017
Offloading communication control logic in GPU accelerated applications.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2017
Dynamic many-process applications on many-tile embedded systems and HPC clusters: The EURETILE programming environment and execution platforms.
Journal of Systems Architecture - Embedded Systems Design, 2016
ASIP acceleration for virtual-to-physical address translation on RDMA-enabled FPGA-based network interfaces.
Future Generation Comp. Syst., 2015
A hierarchical watchdog mechanism for systemic fault awareness on distributed systems.
Future Generation Comp. Syst., 2015
Exploring OpenSHMEM Model to Program GPU-based Extreme-Scale Systems.
Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, 2015
UCX: An Open Source Framework for HPC Network APIs and Beyond.
Proceedings of the 23rd IEEE Annual Symposium on High-Performance Interconnects, 2015
NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features.
LO-FA-MO: Fault Detection and Systemic Awareness for the QUonG Computing System.
Proceedings of the 33rd IEEE International Symposium on Reliable Distributed Systems, 2014
Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters.
Proceedings of the 21st International Conference on High Performance Computing, 2014
Benchmarking of communication techniques for GPUs.
J. Parallel Distrib. Comput., 2013
NaNet: a flexible and configurable low-latency NIC for real-time trigger systems based on GPUs.
A heterogeneous many-core platform for experiments on scalable custom interconnects and management of fault and critical events, applied to many-process applications: Vol. II, 2012 technical report.
Architectural improvements and 28 nm FPGA implementation of the APEnet+ 3D Torus network for hybrid HPC systems.
'Mutual Watch-dog Networking': Distributed Awareness of Faults and Critical Events in Petascale/Exascale systems.
GPU peer-to-peer techniques applied to a cluster interconnect.
Design and implementation of a modular, low latency, fault-aware, FPGA-based network interface.
Proceedings of the 2012 International Conference on Reconfigurable Computing and FPGAs, 2013
GPU Peer-to-Peer Techniques Applied to a Cluster Interconnect.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Virtual-to-Physical address translation for an FPGA-based interconnect with host and GPU remote DMA capabilities.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013
The Distributed Network Processor: a novel off-chip and on-chip interconnection network architecture
Breadth First Search on APEnet+.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
APEnet+: high bandwidth 3D torus direct network for petaflops scale commodity clusters
APEnet+: a 3D toroidal network enabling Petaflops scale Lattice QCD simulations on commodity clusters
Synthesis of Communication Mechanisms for Multi-tile Systems Based on Heterogeneous Multi-processor System-On-Chips.
Proceedings of the Twentieth IEEE/IFIP International Symposium on Rapid System Prototyping, 2009
Computing for LQCD: apeNEXT.
Computing in Science and Engineering, 2006
APENet: a high speed, low latency 3D interconnect network.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004
The Teraflop Parallel Computer APEmille.
Proceedings of the High-Performance Computing and Networking, 1997