Devendar Bureddy

Orcid: 0009-0006-4638-2456

According to our database, Devendar Bureddy authored at least 15 papers between 2011 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Unified Collective Communication: A Unified Library for CPU, GPU, and DPU Collectives.

Manjunath Gorentla Venkata

IEEE Micro, 2025

2024

Unified Collective Communication (UCC): An Unified Library for CPU, GPU, and DPU Collectives.

Manjunath Gorentla Venkata

Proceedings of the IEEE Symposium on High-Performance Interconnects, 2024

2020

Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ Streaming-Aggregation Hardware Design and Evaluation.

Proceedings of the High Performance Computing - 35th International Conference, 2020

2017

Towards A Data Centric System Architecture: SHARP.

Supercomput. Front. Innov., 2017

2016

Scalable Hierarchical Aggregation Protocol (SHArP): A Hardware Architecture for Efficient Data Reduction.

Proceedings of the First International Workshop on Communication Optimizations in HPC, 2016

2014

GPU-Aware MPI on RDMA-Enabled Clusters: Design, Implementation and Evaluation.

IEEE Trans. Parallel Distributed Syst., 2014

2013

MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for intel MIC clusters.

Krishna Chaitanya Kandalla

Hari Subramoni

Dhabaleswar K. Panda

Proceedings of the International Conference for High Performance Computing, 2013

Extending OpenSHMEM for GPU Computing.

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Efficient Inter-node MPI Communication Using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs.

Proceedings of the 42nd International Conference on Parallel Processing, 2013

Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters.

Krishna Chaitanya Kandalla

Proceedings of the IEEE 21st Annual Symposium on High-Performance Interconnects, 2013

Design of network topology aware scheduling services for large InfiniBand clusters.

Hari Subramoni

Devendar Bureddy

Krishna Chaitanya Kandalla

Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Efficient Intra-node Communication on Intel-MIC Clusters.

Sreeram Potluri

Akshay Venkatesh

Devendar Bureddy

Krishna Chaitanya Kandalla

Dhabaleswar K. Panda

Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012

OMB-GPU: A Micro-Benchmark Suite for Evaluating MPI Libraries on GPU Clusters.

Proceedings of the Recent Advances in the Message Passing Interface, 2012

Optimizing MPI Communication on Multi-GPU Systems Using CUDA Inter-Process Communication.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

2011

Design and Implementation of Key Proposed MPI-3 One-Sided Communication Semantics on InfiniBand.

Proceedings of the Recent Advances in the Message Passing Interface, 2011
