Craig B. Stunkel

Orcid: 0000-0002-8265-933X

  • IBM Research

According to our database, Craig B. Stunkel authored at least 48 papers between 1988 and 2024.



IEEE Fellow

IEEE Fellow 2013, "For contributions to design and implementation of high-performance interconnection networks".








Optimizing Application Performance with BlueField: Accelerating Large-Message Blocking and Nonblocking Collective Operations.
Proceedings of the ISC High Performance 2024 Research Paper Proceedings (39th International Conference), 2024

Data Movement Accelerator Engines on a Prototype Power10 Processor.
IEEE Micro, 2023

NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing for Undergraduates - Version II - Big Data, Energy, and Distributed Computing.
Proceedings of the 54th ACM Technical Symposium on Computer Science Education, Volume 2, 2023

NVIDIA's Quantum InfiniBand Network Congestion Control Technology and Its Impact on Application Performance.
Proceedings of the High Performance Computing - 37th International Conference, 2022

The high-speed networks of the Summit and Sierra supercomputers.
IBM J. Res. Dev., 2020

An Evaluation of Network Architectures for Next Generation Supercomputers.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

Space Performance Tradeoffs in Compressing MPI Group Data Structures.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Performance benefits of optical circuit switches for large-scale dragonfly networks.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2016


Special issue on Communication Architectures for Scalable Systems.
J. Parallel Distributed Comput., 2012

CASS Introduction.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Welcome to CAC/SSPS 2010.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Interconnection Networks for Parallel Computers.
Proceedings of the Wiley Encyclopedia of Computer Science and Engineering, 2008

Harnessing massive parallelism in the era of parallelism for the masses.
Proceedings of the 21st Annual International Conference on Supercomputing, 2007

On the Feasibility of Optical Circuit Switching for High Performance Computing Systems.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Improved Point-to-Point and Collective Communication Performance with Output-Queued High-Radix Routers.
Proceedings of the High Performance Computing, 2005

What are the future trends in high-performance interconnects for parallel computers? [Panel 1].
Proceedings of the 12th Annual IEEE Symposium on High Performance Interconnects, 2004

HIPIQS: A High-Performance Switch Architecture Using Input Queuing.
IEEE Trans. Parallel Distributed Syst., 2002

Workshop Introduction.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Architectural Support for Efficient Multicasting in Irregular Networks.
IEEE Trans. Parallel Distributed Syst., 2001

Adaptive Routing on the New Switch Chip for IBM SP Systems.
J. Parallel Distributed Comput., 2001

Implementing Multidestination Worms in Switch-Based Parallel Systems: Architectural Alternatives and Their Impact.
IEEE Trans. Parallel Distributed Syst., 2000

Adaptive Routing in RS/6000 SP-Like Bidirectional Multistage Interconnection Networks.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

A New Switch Chip for IBM RS/6000 SP Systems.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

Efficient Broadcast and Multicast on Multistage Interconnection Networks Using Multiport Encoding.
IEEE Trans. Parallel Distributed Syst., 1998

Where to Provide Support for Efficient Multicasting in Irregular Networks: Network Interface or Switch?
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998

IBM RS/6000 SP Interconnection Network Topologies for Large Systems.
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998

Convergence Points on Commercial Parallel Systems: Do We Have the Node Architecture? Do We Have the Network? Do We Have the Programming Paradigm?
Proceedings of the 1998 International Conference on Parallel Processing (ICPP '98), 1998

Clock Synchronization on a Multicomputer.
J. Parallel Distributed Comput., 1997

Challenges in the Design of Contemporary Routers.
Proceedings of the Parallel Computer Routing and Communication, 1997

Multicasting in Irregular Networks with Cut-Through Switches Using Tree-Based Multidestination Worms.
Proceedings of the Parallel Computer Routing and Communication, 1997

A Reliable Hardware Barrier Synchronization Scheme.
Proceedings of the 11th International Parallel Processing Symposium (IPPS '97), 1997

Adaptive Source Routing in Multistage Interconnection Networks.
Proceedings of IPPS '96, 1996

Commercially Viable MPP Networks.
Proceedings of the 1996 International Conference on Parallel Processing Workshop, 1996

The SP2 High-Performance Switch.
IBM Syst. J., 1995

Time synchronization on SP1 and SP2 parallel systems.
Proceedings of IPPS '95, 1995

Architecture and Implementation of Vulcan.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

SP2 high-performance switch architecture.
Proceedings of the Hot Interconnects II, 1994

An Analysis of Cache Performance for a Hypercube Multicomputer.
IEEE Trans. Parallel Distributed Syst., 1992

Address tracing of parallel systems via TRAPEDS.
Microprocess. Microsystems, 1992

Address Tracing for Parallel Machines.
Computer, 1991

TRAPEDS address tracing and its application to multicomputer cache performance analysis.
PhD thesis, 1990

Algorithm-Based Fault Tolerance on a Hypercube Multiprocessor.
IEEE Trans. Computers, 1990

TRAPEDS: Producing Traces for Multicomputers Via Execution Driven Simulation.
Proceedings of the 1989 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1989

Analysis of Hypercube Cache Performance Using Address Traces Generated by TRAPEDS.
Proceedings of the International Conference on Parallel Processing, 1989

An evaluation of system-level fault tolerance on the Intel hypercube multiprocessor.
Proceedings of the Eighteenth International Symposium on Fault-Tolerant Computing, 1988

Hypercube implementation of the simplex algorithm.
Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, 1988

A novel approach to system-level fault tolerance in hypercube multiprocessors.
Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications, 1988
