Jahanzeb Maqbool Hashmi

According to our database, Jahanzeb Maqbool Hashmi authored at least 27 papers between 2016 and 2021.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2021
Designing a ROCm-Aware MPI Library for AMD GPUs: Early Experiences.
Proceedings of the High Performance Computing - 36th International Conference, 2021

BluesMPI: Efficient MPI Non-blocking Alltoall Offloading Designs on Modern BlueField Smart NICs.
Proceedings of the High Performance Computing - 36th International Conference, 2021

Towards Architecture-aware Hierarchical Communication Trees on Modern HPC Systems.
Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

Efficient MPI-based Communication for GPU-Accelerated Dask Applications.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020
FALCON-X: Zero-copy MPI derived datatype processing on modern CPU and GPU architectures.
J. Parallel Distributed Comput., 2020

Communication-Aware Hardware-Assisted MPI Overlap Engine.
Proceedings of the High Performance Computing - 35th International Conference, 2020

MPI Meets Cloud: Case Study with Amazon EC2 and Microsoft Azure.
Proceedings of the Fourth IEEE/ACM Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware, 2020

Exploring Hybrid MPI+Kokkos Tasks Programming Model.
Proceedings of the 3rd IEEE/ACM Annual Parallel Applications Workshop: Alternatives To MPI+X, 2020

GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training.
Proceedings of the International Conference for High Performance Computing, 2020

Scalable MPI Collectives using SHARP: Large Scale Performance Evaluation on the TACC Frontera System.
Proceedings of the Workshop on Exascale MPI, 2020

Performance Characterization of Network Mechanisms for Non-Contiguous Data Transfers in MPI.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Machine-agnostic and Communication-aware Designs for MPI on Emerging Architectures.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Blink: Towards Efficient RDMA-based Communication Coroutines for Parallel Python Applications.
Proceedings of the 27th IEEE International Conference on High Performance Computing, 2020

2019
A Hybrid Framework for Sentiment Analysis Using Genetic Algorithm Based Feature Reduction.
IEEE Access, 2019

Design and Evaluation of Shared Memory Communication Benchmarks on Emerging Architectures using MVAPICH2.
Proceedings of the IEEE/ACM Third Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware, 2019

Leveraging Network-level parallelism with Multiple Process-Endpoints for MPI Broadcast.
Proceedings of the IEEE/ACM Third Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware, 2019

FALCON: Efficient Designs for Zero-Copy MPI Datatype Processing on Emerging Architectures.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

High-Performance Adaptive MPI Derived Datatype Communication for Modern Multi-GPU Systems.
Proceedings of the 26th IEEE International Conference on High Performance Computing, 2019

Design and Characterization of Shared Address Space MPI Collectives on Modern Architectures.
Proceedings of the 19th IEEE/ACM International Symposium on Cluster, 2019

2018
Cooperative rendezvous protocols for improved performance and overlap.
Proceedings of the International Conference for High Performance Computing, 2018

Designing Efficient Shared Address Space Reduction Collectives for Multi-/Many-cores.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

SALaR: Scalable and Adaptive Designs for Large Message Reduction Collectives.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters.
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Exploiting and Evaluating OpenSHMEM on KNL Architecture.
Proceedings of the OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, 2017

Efficient and Scalable Multi-Source Streaming Broadcast on GPU Clusters for Deep Learning.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Kernel-Assisted Communication Engine for MPI on Emerging Manycore Processors.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

2016
Enabling Performance Efficient Runtime Support for Hybrid MPI+UPC++ Programming Models.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

