Sreeram Potluri

According to our database, Sreeram Potluri authored at least 39 papers between 2010 and 2021.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2021
Dynamic Symmetric Heap Allocation in NVSHMEM.
Proceedings of the OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Exascale and Smart Networks, 2021

2020
An Initial Assessment of NVSHMEM for High Performance Computing.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

2018
GPUDirect Async: Exploring GPU synchronous communication techniques for InfiniBand clusters.
J. Parallel Distributed Comput., 2018

Designing High-Performance In-Memory Key-Value Operations with Persistent GPU Kernels and OpenSHMEM.
Proceedings of the OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity, 2018

2017
Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM.
Proceedings of the OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, 2017

MPI-GDS: High Performance MPI Designs with GPUDirect-aSync for CPU-GPU Control Flow Decoupling.
Proceedings of the 46th International Conference on Parallel Processing, 2017

GPU-Centric Communication on NVIDIA GPU Clusters with InfiniBand: A Case Study with OpenSHMEM.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Offloading communication control logic in GPU accelerated applications.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2015
Exploring OpenSHMEM Model to Program GPU-based Extreme-Scale Systems.
Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Technologies, 2015


2014
GPU-Aware MPI on RDMA-Enabled Clusters: Design, Implementation and Evaluation.
IEEE Trans. Parallel Distributed Syst., 2014

Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models.
Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

A Comprehensive Performance Evaluation of OpenSHMEM Libraries on InfiniBand Clusters.
Proceedings of the OpenSHMEM and Related Technologies. Experiences, Implementations, and Tools, 2014

High Performance Alltoall and Allgather Designs for InfiniBand MIC Clusters.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

HAND: A Hybrid Approach to Accelerate Non-contiguous Data Movement Using MPI Datatypes on GPU Clusters.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

MIC-Check: a distributed check pointing framework for the intel many integrated cores architecture.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters.
Proceedings of the 21st International Conference on High Performance Computing, 2014

Scalable Graph500 design with MPI-3 RMA.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

High performance OpenSHMEM for Xeon Phi clusters: Extensions, runtime designs and application co-design.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013
Designing Scalable Graph500 Benchmark with Hybrid MPI+OpenSHMEM Programming Models.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for intel MIC clusters.
Proceedings of the International Conference for High Performance Computing, 2013

Efficient and truly passive MPI-3 RMA using InfiniBand atomics.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Extending OpenSHMEM for GPU Computing.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

MIC-RO: enabling efficient remote offload on heterogeneous many integrated core (MIC) clusters with InfiniBand.
Proceedings of the International Conference on Supercomputing, 2013

Efficient Inter-node MPI Communication Using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters.
Proceedings of the IEEE 21st Annual Symposium on High-Performance Interconnects, 2013

A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Efficient Intra-node Communication on Intel-MIC Clusters.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

OMB-GPU: A Micro-Benchmark Suite for Evaluating MPI Libraries on GPU Clusters.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Optimizing MPI Communication on Multi-GPU Systems Using CUDA Inter-Process Communication.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2011
MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters.
Comput. Sci. Res. Dev., 2011

Codesign for InfiniBand Clusters.
Computer, 2011

Optimizing MPI One Sided Communication on Multi-core InfiniBand Clusters Using Shared Memory Backed Windows.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Design and Implementation of Key Proposed MPI-3 One-Sided Communication Semantics on InfiniBand.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Optimized Non-contiguous MPI Datatype Communication for GPU Clusters: Design, Implementation and Evaluation with MVAPICH2.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

MPI Alltoall Personalized Exchange on GPGPU Clusters: Design Alternatives and Benefit.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010
Quantifying performance benefits of overlap using MPI-2 in a seismic modeling application.
Proceedings of the 24th International Conference on Supercomputing, 2010

High Performance Design and Implementation of Nemesis Communication Layer for Two-Sided and One-Sided MPI Semantics in MVAPICH2.
Proceedings of the 39th International Conference on Parallel Processing, 2010

