Rajeev Thakur

ORCID: 0000-0002-5532-3048

According to our database, Rajeev Thakur authored at least 186 papers between 1992 and 2024.

Bibliography

2024
Designing and Prototyping Extensions to MPI in MPICH.
CoRR, 2024

POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

2023
A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators.
CoRR, 2023

gZCCL: Compression-Accelerated Collective Communication Framework for GPU Clusters.
CoRR, 2023

C-Coll: Introducing Error-bounded Lossy Compression into MPI Collectives.
CoRR, 2023

Frustrated With MPI+Threads? Try MPIxThreads!
Proceedings of the 30th European MPI Users' Group Meeting, 2023

Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Quantifying the Performance Benefits of Partitioned Communication in MPI.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Accelerating MPI Collectives with Process-in-Process-based Multi-object Techniques.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023

Generalized Collective Algorithms for the Exascale Era.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

PiP-MColl: Process-in-Process-based Multi-object MPI Collectives.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

2022
MPIX Stream: An Explicit Solution to Hybrid MPI+X Programming.
Proceedings of the EuroMPI/USA'22: 29th European MPI Users' Group Meeting, Chattanooga, TN, USA, September 26, 2022

ACCLAiM: Advancing the Practicality of MPI Collective Communication Autotuning Using Machine Learning.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
Translational research in the MPICH project.
J. Comput. Sci., 2021

Co-design Center for Exascale Machine Learning Technologies (ExaLearn).
Int. J. High Perform. Comput. Appl., 2021

Performance Portability in the Exascale Computing Project: Exploration Through a Panel Series.
Comput. Sci. Eng., 2021

Colmena: Scalable Machine-Learning-Based Steering of Ensemble Simulations for High Performance Computing.
Proceedings of the IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2021

A FACT-based Approach: Making Machine Learning Collective Autotuning Feasible on Exascale Systems.
Proceedings of the Workshop on Exascale MPI, 2021

2019
Guest editor's introduction: Special issue on best papers from EuroMPI/USA 2017.
Parallel Comput., 2019

2017
Rethinking key-value store for parallel I/O optimization.
Int. J. High Perform. Comput. Appl., 2017

2016
MPI-ACC: Accelerator-Aware MPI for Scientific Applications.
IEEE Trans. Parallel Distributed Syst., 2016

An implementation and evaluation of the MPI 3.0 one-sided communication interface.
Concurr. Comput. Pract. Exp., 2016

Scanning LIDAR in Advanced Driver Assistance Systems and Beyond.
IEEE Consumer Electron. Mag., 2016

Rethinking High Performance Computing System Architecture for Scientific Big Data Applications.
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

2015
Remote Memory Access Programming in MPI-3.
ACM Trans. Parallel Comput., 2015

IOPro: a parallel I/O profiling and visualization framework for high-performance storage systems.
J. Supercomput., 2015

Performance model-directed data sieving for high-performance I/O.
J. Supercomput., 2015

Collective input/output under memory constraints.
Int. J. High Perform. Comput. Appl., 2015

Efficient disk-to-disk sorting: a case study in the decoupled execution paradigm.
Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, 2015

Distributed Monitoring and Management of Exascale Systems in the Argo Project.
Proceedings of the Distributed Applications and Interoperable Systems, 2015

2014
Processing MPI Derived Datatypes on Noncontiguous GPU-Resident Data.
IEEE Trans. Parallel Distributed Syst., 2014

Enabling communication concurrency through flexible MPI endpoints.
Int. J. High Perform. Comput. Appl., 2014

Rethinking key-value store for parallel I/O optimization.
Proceedings of the 2014 International Workshop on Data Intensive Scalable Computing Systems, 2014

Decoupled I/O for Data-Intensive High Performance Computing.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

2013
Software Abstractions and Methodologies for HPC Simulation Codes on Future Architectures.
CoRR, 2013

MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory.
Computing, 2013

Analysis of topology-dependent MPI performance on Gemini networks.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Enabling MPI interoperability through flexible communication endpoints.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Pattern-Direct and Layout-Aware Replication Scheme for Parallel I/O Systems.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Memory-conscious collective I/O for extreme scale HPC systems.
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers, 2013

MPI-Interoperable Generalized Active Messages.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments.
Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

On the efficacy of GPU-integrated MPI for scientific applications.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Optimization Strategies for MPI-Interoperable Active Messages.
Proceedings of the IEEE 11th International Conference on Dependable, 2013

Runtime system design of decoupled execution paradigm for data-intensive high-end computing.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Toward Asynchronous and MPI-Interoperable Active Messages.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Poster: Memory-Conscious Collective I/O for Extreme-Scale HPC Systems.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Memory-Conscious Collective I/O for Extreme-Scale HPC Systems.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

IOPin: Runtime Profiling of Parallel I/O in HPC Systems.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Leveraging MPI's One-Sided Communication Interface for Shared-Memory Programming.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Advanced MPI Including New MPI-3 Features.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Efficient Multithreaded Context ID Allocation in MPI.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

A Server-Level Adaptive Data Layout Strategy for Parallel File Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

CHAIO: Enabling HPC Applications on Data-Intensive File Systems.
Proceedings of the 41st International Conference on Parallel Processing, 2012

DMA-Assisted, Intranode Communication in GPU Accelerated Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-based Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Enabling Fast, Noncontiguous GPU Data Movement in Hybrid MPI+GPU Environments.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

KNOWAC: I/O Prefetch via Accumulated Knowledge.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

A Decoupled Execution Paradigm for Data-Intensive High-End Computing.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

Boosting Application-Specific Parallel I/O Optimization Using IOSIG.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

Transparent Accelerator Migration in a Virtualized GPU Environment.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011
MPI on Millions of Cores.
Parallel Process. Lett., 2011

The International Exascale Software Project roadmap.
Int. J. High Perform. Comput. Appl., 2011

The scalable process topology interface of MPI 2.2.
Concurr. Comput. Pract. Exp., 2011

Formal analysis of MPI-based parallel programs.
Commun. ACM, 2011

Server-side I/O coordination for parallel file systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

Performance Expectations and Guidelines for MPI Derived Datatypes.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Scalable Memory Use in MPI: A Case Study with MPICH2.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

LACIO: A New Collective I/O Strategy for Parallel I/O Systems.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

A Segment-Level Adaptive Data Layout Scheme for Improved Load Balance in Parallel File Systems.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

2010
Self-Consistent MPI Performance Guidelines.
IEEE Trans. Parallel Distributed Syst., 2010

Formal methods applied to high-performance computing software design: a case study of MPI one-sided communication-based locking.
Softw. Pract. Exp., 2010

A study of dynamic meta-learning for failure prediction in large-scale systems.
J. Parallel Distributed Comput., 2010

A Pipelined Algorithm for Large, Irregular All-Gather Problems.
Int. J. High Perform. Comput. Appl., 2010

The Importance of Non-Data-Communication Overheads in MPI.
Int. J. High Perform. Comput. Appl., 2010

Fine-Grained Multithreading Support for Hybrid Threaded MPI Programming.
Int. J. High Perform. Comput. Appl., 2010

Global-scale distributed I/O with ParaMEDIC.
Concurr. Comput. Pract. Exp., 2010

Implementing MPI on Windows: Comparison with Common Approaches on Unix.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Dynamic Verification of Hybrid Programs.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Enabling active storage on parallel I/O software stacks.
Proceedings of the IEEE 26th Symposium on Mass Storage Systems and Technologies, 2010

A layout-aware optimization strategy for collective I/O.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

Minimizing MPI Resource Contention in Multithreaded Multicore Environments.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

Improving Parallel I/O Performance with Data Layout Awareness.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

Hybrid parallel programming with MPI and Unified Parallel C.
Proceedings of the 7th Conference on Computing Frontiers, 2010

2009
Test suite for evaluating performance of multithreaded MPI communication.
Parallel Comput., 2009

ProOnE: a general-purpose protocol onload engine for multi- and many-core architectures.
Comput. Sci. Res. Dev., 2009

Toward message passing for a million processes: characterizing MPI on a massive scale Blue Gene/P.
Comput. Sci. Res. Dev., 2009

A configurable algorithm for parallel image-compositing applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Hierarchical Collectives in MPICH2.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Sound and Efficient Dynamic Verification of MPI Programs with Probe Non-determinism.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Static-Analysis Assisted Dynamic Verification of MPI Waitany Programs (Poster Abstract).
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Conflict Detection Algorithm to Minimize Locking for MPI-IO Atomicity.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Processing MPI Datatypes Outside MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

MPI on a Million Processors.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

How Formal Dynamic Verification Tools Facilitate Novel Concurrency Visualizations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Formal verification of practical MPI programs.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

Investigating High Performance RMA Interfaces for the MPI-3 Standard.
Proceedings of the ICPP 2009, 2009

Natively Supporting True One-Sided Communication in.
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

2008
Hiding I/O latency with pre-execution prefetching for parallel applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Parallel I/O prefetching using MPI file caching and I/O signatures.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Implementing Efficient Dynamic Formal Verification Methods for MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

A Simple, Pipelined Algorithm for Large, Irregular All-gather Problems.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

A Formal Approach to Detect Functionally Irrelevant Barriers in MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Self-consistent MPI-IO Performance Requirements and Expectations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Non-data-communication Overheads in MPI: Analysis on Blue Gene/P.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Toward Efficient Support for Multithreaded MPI Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Semantics-based distributed I/O for mpiBLAST.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

Exploring Parallel I/O Concurrency with Speculative Prefetching.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

Communication Analysis of Parallel 3D FFT for Flat Cartesian Meshes on Large Blue Gene Systems.
Proceedings of the High Performance Computing, 2008

Sockets Direct Protocol for Hybrid Network Stacks: A Case Study with iWARP over 10G Ethernet.
Proceedings of the High Performance Computing, 2008

2007
Thread-safety in an MPI implementation: Requirements and analysis.
Parallel Comput., 2007

Implementing MPI-IO Atomic Mode and Shared File Pointers Using MPI One-Sided Communication.
Int. J. High Perform. Comput. Appl., 2007

Analyzing the impact of supporting out-of-order communication on in-order performance with iWARP.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Self-consistent MPI Performance Requirements.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Test Suite for Evaluating Performance of MPI Implementations That Support MPI_THREAD_MULTIPLE.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Practical Model-Checking Method for Verifying Correctness of MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Extending the MPI-2 Generalized Request Interface.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Revealing the Performance of MPI RMA Implementations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Parallel I/O Performance Characterization of Columbia and NEC SX-8 Superclusters.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

A Meta-Learning Failure Predictor for Blue Gene/L Systems.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Advanced Flow-control Mechanisms for the Sockets Direct Protocol over InfiniBand.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Open Issues in MPI Implementation.
Proceedings of the Advances in Computer Systems Architecture, 2007

2006
Discretionary Caching for I/O on Clusters.
Clust. Comput., 2006

M02 - Parallel I/O in practice.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

S01 - Advanced MPI: I/O and one-sided communication.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Formal Verification of Programs That Use MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Can MPI Be Used for Persistent Parallel Services?
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Issues in Developing a Thread-Safe MPI Implementation.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Automatic Memory Optimizations for Improving MPI Derived Datatype Performance.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Collective communication on architectures that support simultaneous communication over multiple links.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

MPI-IO/L: efficient remote I/O for MPI-IO via logistical networking.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

High performance file I/O for the Blue Gene/L supercomputer.
Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

A New Flexible MPI Collective I/O Implementation.
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

2005
Optimization of Collective Communication Operations in MPICH.
Int. J. High Perform. Comput. Appl., 2005

Optimizing the Synchronization Operations in Message Passing Interface One-Sided Communication.
Int. J. High Perform. Comput. Appl., 2005

Implementing Byte-Range Locks Using MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Implementing MPI-IO Shared File Pointers Without File System Support.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

An Evaluation of Implementation Options for MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Implementing MPI-IO atomic mode without file system support.
Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005

2004
Minimizing Synchronization Overhead in the Implementation of MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

The Impact of File Systems on MPI-IO Scalability.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Efficient Implementation of MPI-2 Passive One-Sided Communication on InfiniBand Clusters.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

On the Performance of the POSIX I/O Interface to PVFS.
Proceedings of the 12th Euromicro Workshop on Parallel, 2004

RFS: efficient and flexible remote file access for MPI-IO.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

Predicting memory-access cost based on data-access patterns.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

High performance MPI-2 one-sided communication over InfiniBand.
Proceedings of the 4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), 2004

2003
High-performance scientific data management system.
J. Parallel Distributed Comput., 2003

Parallel netCDF: A Scientific High-Performance I/O Interface.
CoRR, 2003

Parallel netCDF: A High-Performance Scientific I/O Interface.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Improving the Performance of Collective Operations in MPICH.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

Using MPI-2: Advanced Features of the Message Passing Interface.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

2002
Optimizing noncontiguous accesses in MPI-IO.
Parallel Comput., 2002

2001
Evaluation of Collective I/O Implementations on Parallel Architectures.
J. Parallel Distributed Comput., 2001

High-performance file I/O in Java: Existing approaches and bulk I/O extensions.
Concurr. Comput. Pract. Exp., 2001

A Scientific Data Management System for Irregular Applications.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

2000
Data management for large-scale scientific computations in high performance distributed systems.
Clust. Comput., 2000

Integrating Parallel File I/O and Database Support for High-Performance Scientific Data Management.
Proceedings of Supercomputing 2000, 2000

An evaluation of Java's I/O capabilities for high-performance computing.
Proceedings of the ACM 2000 Java Grande Conference, San Francisco, CA, USA, 2000

Parallel I/O and Storage Technology.
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

PVFS: A Parallel File System for Linux Clusters.
Proceedings of the 4th Annual Linux Showcase & Conference 2000, 2000

1999
Improving Collective I/O Performance Using Threads.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

On Implementing MPI-IO Portably and with High Performance.
Proceedings of the Sixth Workshop on I/O in Parallel and Distributed Systems, 1999

Data Management for Large-Scale Scientific Computations in High Performance Distributed Systems.
Proceedings of the Eighth IEEE International Symposium on High Performance Distributed Computing, 1999

1998
I/O in Parallel Applications: the Weakest Link.
Int. J. High Perform. Comput. Appl., 1998

A Case for Using MPI's Derived Datatypes to Improve I/O Performance.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1998

1996
Efficient Algorithms for Array Redistribution.
IEEE Trans. Parallel Distributed Syst., 1996

An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays.
Sci. Program., 1996

Passion: Optimized I/O for Parallel Applications.
Computer, 1996

An Experimental Evaluation of the Parallel I/O Systems of the IBM SP and Intel Paragon Using a Production Application.
Proceedings of the Parallel Computation, 1996

Runtime Support for Out-of-Core Parallel Programs.
Proceedings of the Input/Output in Parallel and Distributed Computer Systems, 1996

1995
Complete exchange on the CM-5 and Touchstone Delta.
J. Supercomput., 1995

1994
Compilation of out-of-core data parallel programs for distributed memory machines.
SIGARCH Comput. Archit. News, 1994

Connected Component Labeling on Coarse Grain Parallel Computers: An Experimental Study.
J. Parallel Distributed Comput., 1994

Complete Exchange on a Wormhole Routed Mesh.
Proceedings of the MASCOTS '94, Proceedings of the Second International Workshop on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems, January 31, 1994

All-to-All Communication on Meshes with Wormhole Routing.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

Compiler and runtime support for out-of-core HPF programs.
Proceedings of the 8th International Conference on Supercomputing, 1994

1993
Experimental Performance Evaluation of the CM-5.
J. Parallel Distributed Comput., 1993

1992
Scheduling Regular and Irregular Communication Patterns on the CM-5.
Proceedings of Supercomputing '92, 1992

Evaluation of Connected Component Labeling Algorithms on Shared and Distributed Memory Multiprocessors.
Proceedings of the 6th International Parallel Processing Symposium, 1992

