Gilad Shainer

Orcid: 0009-0000-9318-9620

According to our database, Gilad Shainer authored at least 36 papers between 2006 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

High-speed Networking for Giga-Scale AI Factories.

CoRR, May, 2026

2025

Unified Collective Communication: A Unified Library for CPU, GPU, and DPU Collectives.

Manjunath Gorentla Venkata

IEEE Micro, 2025

Co-Packaged Silicon Photonics Switches for Gigawatt AI Factories.

Gilad Shainer

Proceedings of the IEEE Hot Chips 37 Symposium, 2025

2024

Optimizing Application Performance with BlueField: Accelerating Large-Message Blocking and Nonblocking Collective Operations.

Richard L. Graham

George Bosilca

Yong Qin

Bradley W. Settlemyer

Gerardo Cisneros-Stoianowski

Sebastian T. Ohlmann

Markus Rampp

Proceedings of the ISC High Performance 2024 Research Paper Proceedings (39th International Conference), 2024

Unified Collective Communication (UCC): An Unified Library for CPU, GPU, and DPU Collectives.

Manjunath Gorentla Venkata

Proceedings of the IEEE Symposium on High-Performance Interconnects, 2024

2022

NVIDIA's Quantum InfiniBand Network Congestion Control Technology and Its Impact on Application Performance.

Gerardo Cisneros-Stoianowski

Craig B. Stunkel

Proceedings of the High Performance Computing - 37th International Conference, 2022

2021

NVIDIA's Cloud Native Supercomputing.

Proceedings of the Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation, 2021

2020

The high-speed networks of the Summit and Sierra supercomputers.

IBM J. Res. Dev., 2020

Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ Streaming-Aggregation Hardware Design and Evaluation.

Proceedings of the High Performance Computing - 35th International Conference, 2020

2019

Accelerating OpenSHMEM Collectives Using In-Network Computing Approach.

Manjunath Gorentla Venkata

Gil Bloch

Gilad Shainer

Richard L. Graham

Proceedings of the 31st International Symposium on Computer Architecture and High Performance Computing, 2019

2018

LRUM: Local Reliability Protocol for Unreliable Hardware Multicast.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

2017

Towards A Data Centric System Architecture: SHARP.

Supercomput. Front. Innov., 2017

Enabling One-Sided Communication Semantics on ARM.

Pavel Shamis

M. Graham Lopez

Gilad Shainer

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

2016

Scalable Hierarchical Aggregation Protocol (SHArP): A Hardware Architecture for Efficient Data Reduction.

Proceedings of the First International Workshop on Communication Optimizations in HPC, 2016

Using InfiniBand Hardware Gather-Scatter Capabilities to Optimize MPI All-to-All.

Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

2015

Local and Remote GPUs Perform Similar with EDR 100G InfiniBand.

Proceedings of the Industrial Track of the 16th International Middleware Conference, 2015

UCX: An Open Source Framework for HPC Network APIs and Beyond.

Proceedings of the 23rd IEEE Annual Symposium on High-Performance Interconnects, 2015

2014

Development and Extension of Atomic Memory Operations in OpenSHMEM.

Pavel Shamis

Manjunath Gorentla Venkata

Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

Boosting the performance of remote GPU virtualization using InfiniBand connect-IB and PCIe 3.0.

Enrique S. Quintana-Ortí

José Duato

Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013

The co-design architecture for exascale systems, a novel approach for scalable designs.

Comput. Sci. Res. Dev., 2013

Maximizing Application Performance in a Multi-core, NUMA-Aware Compute Cluster by Multi-level Tuning.

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

2012

Exploring the Scope of the InfiniBand Congestion Control Mechanism.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011

The development of Mellanox/NVIDIA GPUDirect over InfiniBand - a new model for GPU to GPU communications.

Comput. Sci. Res. Dev., 2011

The development of Mellanox/NVIDIA GPUDirect over InfiniBand: a new model for GPU to GPU communications.

Gilad Shainer

Pak Lui

Tong Liu

Proceedings of the 2011 TeraGrid Conference - Extreme Digital Discovery, 2011

ConnectX-2 CORE-Direct Enabled Asynchronous Broadcast Collective Communications.

Manjunath Gorentla Venkata

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

The ParaPhrase Project: Parallel Patterns for Adaptive Heterogeneous Multicore Systems.

Horacio González-Vélez

Proceedings of the Formal Methods for Components and Objects, 10th International Symposium, 2011

On the Relation between Congestion Control, Switch Arbitration and Fairness.

Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

Cheetah: A Framework for Scalable Hierarchical Collective Operations.

Richard L. Graham

Manjunath Gorentla Venkata

Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

2010

Network Offloaded Hierarchical Collectives Using ConnectX-2's CORE-Direct Capabilities.

Proceedings of the Recent Advances in the Message Passing Interface, 2010

First experiences with congestion control in InfiniBand hardware.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Overlapping computation and communication: Barrier algorithms and ConnectX-2 CORE-Direct capabilities.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

ConnectX-2 InfiniBand Management Queues: First Investigation of the New Support for Network Offloaded Collective Operations.

Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009

Optics for Enabling Future HPC Systems.

Proceedings of the 17th IEEE Symposium on High Performance Interconnects, 2009

Scheduling strategies for HPC as a service (HPCaaS).

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

2006

Multi-core usage - Multi-core clusters usage model.

Gilad Shainer

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Architecture and Implementation of Sockets Direct Protocol in Windows.

Dror Goldenberg

Tzachi Dar

Gilad Shainer

Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006
