Karen A. Tomko

According to our database, Karen A. Tomko authored at least 42 papers between 1993 and 2018.

Bibliography

2018
Machine Learning in High Energy Physics Community White Paper.
CoRR, 2018

Code Optimization and Stabilization for a High-Resolution Terrain Generation Application.
Proceedings of the Practice and Experience on Advanced Research Computing, 2018

2017
GooFit 2.0.
CoRR, 2017

2016
OpenSHMEM Non-blocking Data Movement Operations with MVAPICH2-X: Early Experiences.
Proceedings of the 2016 PGAS Applications Workshop, 2016

2015
Designing Non-blocking Personalized Collectives with Near Perfect Overlap for RDMA-Enabled Clusters.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Impact of InfiniBand DC Transport Protocol on Energy Consumption of All-to-All Collective Algorithms.
Proceedings of the 23rd IEEE Annual Symposium on High-Performance Interconnects, 2015

2014
Implementation of a Thread-Parallel, GPU-Friendly Function Evaluation Library.
IEEE Access, 2014

Scalable MiniMD Design with Hybrid MPI and OpenSHMEM.
Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, 2014

Designing Topology-Aware Communication Schedules for Alltoall Operations in Large InfiniBand Clusters.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

Scalable Graph500 design with MPI-3 RMA.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

High performance OpenSHMEM for Xeon Phi clusters: Extensions, runtime designs and application co-design.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013
GooFit: A library for massively parallelising maximum-likelihood fits.
CoRR, 2013

Designing Scalable Graph500 Benchmark with Hybrid MPI+OpenSHMEM Programming Models.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

A Novel Functional Partitioning Approach to Design High-Performance MPI-3 Non-blocking Alltoallv Collective on Multi-core Systems.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

2012
Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Can Network-Offload Based Non-blocking Neighborhood MPI Collectives Improve Communication Overheads of Irregular Graph Algorithms?
Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012

2011
High-performance and scalable non-blocking all-to-all with collective offload on InfiniBand clusters: a study with parallel 3D FFT.
Computer Science - R&D, 2011

Codesign for InfiniBand Clusters.
IEEE Computer, 2011

Designing Non-blocking Broadcast with Collective Offload on InfiniBand Clusters: A Case Study with HPL.
Proceedings of the IEEE 19th Annual Symposium on High Performance Interconnects, 2011

Design and Evaluation of Network Topology-/Speed- Aware Broadcast Algorithms for InfiniBand Clusters.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

2010
Quantifying performance benefits of overlap using MPI-2 in a seismic modeling application.
Proceedings of the 24th International Conference on Supercomputing, 2010

2008
MoCSYS: A Multi-Clock Hybrid Two-Layer Router Architecture and Integrated Topology Synthesis Framework for System-Level Design of FPGA Based On-Chip Networks.
Proceedings of the 21st International Conference on VLSI Design (VLSI Design 2008), 2008

Experiences from Cyberinfrastructure Development for Multi-user Remote Instrumentation.
Proceedings of the Fourth International Conference on e-Science, 2008

2007
MoCReS: an Area-Efficient Multi-Clock On-Chip Network for Reconfigurable Systems.
Proceedings of the 2007 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2007), 2007

2005
Enhanced reliability of finite-state machines in FPGA through efficient fault detection and correction.
IEEE Trans. Reliability, 2005

Synthetic Simulation of Mesh-Based Parallel Applications Driven by Fine-Grained Profiling.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004
Saving Power by Mapping Finite-State Machines into Embedded Memory Blocks in FPGAs.
Proceedings of the 2004 Design, 2004

An Approach for Fine-Grained Profiling of Mesh-Based Parallel Programs.
Proceedings of the ISCA 17th International Conference on Parallel and Distributed Computing Systems, 2004

2003
Scan-chain based watch-points for efficient run-time debugging and verification of FPGA designs.
Proceedings of the 2003 Asia and South Pacific Design Automation Conference, 2003

2001
Data Buffering and Allocation in Mapping Generalized Template Matching on Reconfigurable Systems.
The Journal of Supercomputing, 2001

Dynamic Partitioning of the Divide-and-Conquer Scheme with Migration in PVM Environment.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2001

2000
Automatic Target Recognition with Dynamic Reconfiguration.
VLSI Signal Processing, 2000

Hardware/software co-debugging for reconfigurable computing.
Proceedings of the IEEE International High-Level Design Validation and Test Workshop 2000, 2000

1999
Dynamic Reconfiguration to Support Concurrent Applications.
IEEE Trans. Computers, 1999

Data Buffering and Allocation in Mapping Generalized Template Matching on Reconfigurable Systems.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

Accelerating an IR Automatic Target Recognition Application with FPGAs.
Proceedings of the 7th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '99), 1999

1998
Dynamic Reconfiguration to Support Concurrent Applications.
Proceedings of the 6th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '98), 1998

1996
Profile Driven Weighted Decomposition.
Proceedings of the 10th international conference on Supercomputing, 1996

1994
Data and program restructuring of irregular applications for cache-coherent multiprocessor.
Proceedings of the 8th international conference on Supercomputing, 1994

1993
Iteration Partitioning for Resolving Stride Conflicts on Cache-Coherent Multiprocessors.
Proceedings of the 1993 International Conference on Parallel Processing, 1993

