Kevin J. Barker

ORCID: 0000-0003-4947-0559

According to our database, Kevin J. Barker authored at least 84 papers between 1998 and 2023.

Bibliography

2023
Accelerating matrix-centric graph processing on GPUs through bit-level optimizations.
J. Parallel Distributed Comput., July, 2023

Codesign for Extreme Heterogeneity: Integrating Custom Hardware With Commodity Computing Technology to Support Next-Generation HPC Converged Workloads.
IEEE Internet Comput., 2023

MPGemmFI: A Fault Injection Technique for Mixed Precision GEMM in ML Applications.
CoRR, 2023

Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs.
CoRR, 2023

MGG: Accelerating Graph Neural Networks with Fine-Grained Intra-Kernel Communication-Computation Pipelining on Multi-GPU Platforms.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Spy in the GPU-box: Covert and Side Channel Attacks on Multi-GPU Systems.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Assessing Risk in High Performance Computing Attacks.
Proceedings of the 9th International Conference on Information Systems Security and Privacy, 2023

Finding Your Niche: An Evolutionary Approach to HPC Topologies.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

2022
Direction-optimizing Label Propagation Framework for Structure Detection in Graphs: Design, Implementation, and Experimental Analysis.
ACM J. Exp. Algorithmics, 2022

MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems.
CoRR, 2022

Empowering GNNs with Fine-grained Communication-Computation Pipelining on Multi-GPU Platforms.
CoRR, 2022

Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

Towards Precision-Aware Fault Tolerance Approaches for Mixed-Precision Applications.
Proceedings of the 12th IEEE/ACM Workshop on Fault Tolerance for HPC at eXtreme Scale, 2022

2021
ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing.
IEEE Trans. Parallel Distributed Syst., 2021

Denial-of-Service Attack Detection via Differential Analysis of Generalized Entropy Progressions.
CoRR, 2021

Leaky Buddies: Cross-Component Covert Channels on Integrated CPU-GPU Systems.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications.
Proceedings of the 39th IEEE International Conference on Computer Design, 2021

AURORA: Automated Refinement of Coarse-Grained Reconfigurable Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays.
Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020
Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect.
IEEE Trans. Parallel Distributed Syst., 2020

ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing.
CoRR, 2020

A parallel sparse tensor benchmark suite on CPUs and GPUs.
Proceedings of the PPoPP '20: 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2020

A Sparse Tensor Benchmark Suite for CPUs and GPUs.
Proceedings of the IEEE International Symposium on Workload Characterization, 2020

Detecting Anomalous Computation with RNNs on GPU-Accelerated HPC Machines.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

OpenCGRA: An Open-Source Unified Framework for Modeling, Testing, and Evaluating CGRAs.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

On the Feasibility of Using Reduced-Precision Tensor Core Operations for Graph Analytics.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Direction-optimizing label propagation and its application to community detection.
Proceedings of the 17th ACM International Conference on Computing Frontiers, 2020

Indicator-Directed Dynamic Power Management for Iterative Workloads on GPU-Accelerated Systems.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019
PASTA: a parallel sparse tensor algorithm benchmark suite.
CCF Trans. High Perform. Comput., 2019

BSTC: a novel binarized-soft-tensor-core design for accelerating bit-based approximated neural nets.
Proceedings of the International Conference for High Performance Computing, 2019

Fingerprinting Anomalous Computation with RNN for GPU-accelerated HPC Machines.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Efficient and effective sparse tensor reordering.
Proceedings of the ACM International Conference on Supercomputing, 2019

Distributed Direction-Optimizing Label Propagation for Community Detection.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

2018
Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite.
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

Warp-Consolidation: A Novel Execution Model for GPUs.
Proceedings of the 32nd International Conference on Supercomputing, 2018

Optimizing Distributed Data-Intensive Workflows.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
Verification of the Extended Roofline Model for Asynchronous Many Task Runtimes.
Proceedings of the Third International Workshop on Extreme Scale Programming Models and Middleware, 2017

Designing Scalable Distributed Memory Models: A Case Study.
Proceedings of the Computing Frontiers Conference, 2017

2016
Assessing Advanced Technology in CENATE.
Proceedings of the IEEE International Conference on Networking, 2016

Modeling the Impact of Silicon Photonics on Graph Analytics.
Proceedings of the IEEE International Conference on Networking, 2016

Modeling the Performance and Energy Impact of Dynamic Power Steering.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

LSPP Introduction and Committees.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

2015
Towards efficient scheduling of data intensive high energy physics workflows.
Proceedings of the 10th Workshop on Workflows in Support of Large-Scale Science, 2015

2014
A performance comparison of current HPC systems: Blue Gene/Q, Cray XE6 and InfiniBand systems.
Future Gener. Comput. Syst., 2014

On the feasibility of dynamic power steering.
Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, 2014

MIC-SVM: Designing a Highly Efficient Support Vector Machine for Advanced Modern Multi-core and Many-Core Architectures.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

2013
Designing energy efficient communication runtime systems: a view from PGAS models.
J. Supercomput., 2013

A Performance Analysis of Three Generations of Blue Gene.
Parallel Process. Lett., 2013

Tracking the Performance Evolution of Blue Gene Systems.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Unified performance and power modeling of scientific workloads.
Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, 2013

Building Scalable PGAS Communication Subsystem on Blue Gene/Q.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012
Comparing the Performance of Blue Gene/Q with Leading Cray XE6 and InfiniBand Systems.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

2011
Modeling the Performance of Direct Numerical Simulation on Parallel Systems.
Parallel Process. Lett., 2011

Codesign Challenges for Exascale Systems: Performance, Power, and Reliability.
Computer, 2011

An early performance analysis of POWER7-IH HPC systems.
Proceedings of the Conference on High Performance Computing Networking, 2011

A Performance Model of Direct Numerical Simulation for Analyzing Large-Scale Systems.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Energy Templates: Exploiting Application Information to Save Energy.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Analyzing the Performance Bottlenecks of the POWER7-IH Network.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010
Designing Energy Efficient Communication Runtime Systems for Data Centric Programming Models.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

2009
An MPI Performance Monitoring Interface for Cell Based Compute Nodes.
Parallel Process. Lett., 2009

Performance Prediction via Modeling: a Case Study of the ORNL Cray XT4 Upgrade.
Parallel Process. Lett., 2009

Using Performance Modeling to Design Large-Scale Systems.
Computer, 2009

Application profiling on Cell-based clusters.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Performance modeling in action: Performance prediction of a Cray XT4 system during upgrade.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

2008
A Performance Evaluation of the Nehalem Quad-Core Processor for Scientific Computing.
Parallel Process. Lett., 2008

0.374 Pflop/s trillion-particle kinetic modeling of laser plasma interaction on Roadrunner.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Entering the petaflop era: the architecture and performance of Roadrunner.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Experiences in scaling scientific applications on current-generation quad-core processors.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
Analysis of the Weather Research and Forecasting (WRF) Model on Large-Scale Systems.
Proceedings of the Parallel Computing: Architectures, 2007

Performance Analysis of an Optical Circuit Switched Network for Peta-Scale Systems.
Proceedings of the Euro-Par 2007, 2007

Efficient offloading of collective communications in large-scale systems.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

2006
MPI tools and performance studies - Quantifying the potential benefit of overlapping communication and computation in large-scale scientific applications.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

A Performance Model of the Krak Hydrodynamics Application.
Proceedings of the 2006 International Conference on Parallel Processing (ICPP 2006), 2006

2005
On the Feasibility of Optical Circuit Switching for High Performance Computing Systems.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

A Performance Model and Scalability Analysis of the HYCOM Ocean Simulation Application.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2005

Practical Performance Model for Optimizing Dynamic Load Balancing of Adaptive Applications.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Automatic Identification of Application Communication Patterns via Templates.
Proceedings of the ISCA 18th International Conference on Parallel and Distributed Computing Systems, 2005

2004
A Load Balancing Framework for Adaptive and Asynchronous Applications.
IEEE Trans. Parallel Distributed Syst., 2004

A Novel Dynamic Load Balancing Library for Cluster Computing.
Proceedings of the 3rd International Symposium on Parallel and Distributed Computing (ISPDC 2004), 2004

2003
An Evaluation of a Framework for the Dynamic Load Balancing of Highly Adaptive and Irregular Parallel Applications.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

2002
Data movement and control substrate for parallel adaptive applications.
Concurr. Comput. Pract. Exp., 2002

1998
The Mobile Object Layer: A Run-Time Substrate for Mobile Adaptive Computations.
Proceedings of the Computing in Object-Oriented Parallel Environments, 1998

