Wu-chun Feng

Developing Dynamic Profiling and Debugging Support in OpenCL for FPGAs.

Proceedings of the 54th Annual Design Automation Conference, 2017

An Enhanced Image Reconstruction Tool for Computed Tomography on GPUs.

Proceedings of the Computing Frontiers Conference, 2017

GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs.

Proceedings of the Computing Frontiers Conference, 2017

Robotomata: A framework for approximate pattern matching of big data on an automata processor.

Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

2016

OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures.

J. Signal Process. Syst., 2016

MPI-ACC: Accelerator-Aware MPI for Scientific Applications.

John M. Mellor-Crummey

Xiaosong Ma

Rajeev Thakur

IEEE Trans. Parallel Distributed Syst., 2016

Fast Detection of Transformed Data Leaks.

IEEE Trans. Inf. Forensics Secur., 2016

MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL.

Parallel Comput., 2016

muBLASTP: database-indexed protein sequence search on multicore CPUs.

BMC Bioinform., 2016

MetaMorph: a library framework for interoperable kernels on multi- and many-core clusters.

Ahmed E. Helal

Paul Sathre

Proceedings of the International Conference for High Performance Computing, 2016

Characterizing Performance and Power towards Efficient Synchronization of GPU Kernels.

Islam Harb

Proceedings of the 24th IEEE International Symposium on Modeling, 2016

An automated framework for characterizing and subsetting GPGPU workloads.

Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

The Right Metric for Efficient Supercomputing: A Ten-Year Retrospective.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi- and Many-Core Processors.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Parallel Programming with Pictures in a Snap!

Annette C. Feng

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Measuring and modeling on-chip interconnect power on real hardware.

Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Parallel Transposition of Sparse Data Structures.

Proceedings of the 2016 International Conference on Supercomputing, 2016

Telescoping Architectures: Evaluating Next-Generation Heterogeneous Computing.

Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016

Bridging the Performance-Programmability Gap for FPGAs via OpenCL: A Case Study with OpenDwarfs.

Ahmed E. Helal

Anshuman Verma

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Directive-Based Pipelining Extension for OpenMP.

Xuewen Cui

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

cuART: Fine-Grained Algebraic Reconstruction Technique for Computed Tomography Images on GPUs.

Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

Online Power Estimation of Graphics Processing Units.

Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

Multiscale Approximation with Graphical Processing Units for Multiplicative Speedup in Molecular Dynamics.

Proceedings of the 7th ACM International Conference on Bioinformatics, 2016

Bridging the FPGA programmability-portability Gap via automatic OpenCL code generation and tuning.

Ruchira Sasanka

Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

O3FA: A Scalable Finite Automata-based Pattern-Matching Engine for Out-of-Order Deep Packet Inspection.

Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, 2016

2015

CoreTSAR: Core Task-Size Adapting Runtime.

Juan Antonio Gómez Pulido

IEEE Trans. Parallel Distributed Syst., 2015

Accelerating Bioinformatics Applications via Emerging Parallel Computing Systems.

Bertil Schmidt

IEEE ACM Trans. Comput. Biol. Bioinform., 2015

On the Energy Proportionality of Scale-Out Workloads.

CoRR, 2015

Towards Energy-Proportional Computing Using Subsystem-Level Power Management.

CoRR, 2015

Design and Evaluation of Scalable Concurrent Queues for Many-Core Architectures.

Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015

On the Performance, Energy, and Power of Data-Access Methods in Heterogeneous Computing Systems.

Rubasri Kalidas

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

HPPAC Introduction and Committees.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

On the Greenness of In-Situ and Post-Processing Visualization Pipelines.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Rapid and parallel content screening for detecting transformed data exposure.

Proceedings of the 2015 IEEE Conference on Computer Communications Workshops, 2015

ASPaS: A Framework for Automatic SIMDization of Parallel Sorting on x86-based Many-core Processors.

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

GLAF: A Visual Programming and Auto-tuning Framework for Parallel Computing.

Ruchira Sasanka

Proceedings of the 44th International Conference on Parallel Processing, 2015

pDindel: Accelerating indel detection on a multicore CPU architecture with SIMD.

Proceedings of the 5th IEEE International Conference on Computational Advances in Bio and Medical Sciences, 2015

Rapid Screening of Transformed Data Leaks with Efficient Algorithms and Parallel Computing.

Proceedings of the 5th ACM Conference on Data and Application Security and Privacy, 2015

Automatic Command Queue Scheduling for Task-Parallel Workloads in OpenCL.

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014

A power-measurement methodology for large-scale, high-performance computing.

Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

Aeromancer: A Workflow Manager for Large-Scale MapReduce-Based Scientific Workflows.

Mohamed Nabeel

Nabanita Maji

Jing Zhang

Nataliya Timoshevskaya

Proceedings of the 13th IEEE International Conference on Trust, 2014

CoreTSAR: Adaptive Worksharing for Heterogeneous Systems.

Proceedings of the Supercomputing - 29th International Conference, 2014

On the Energy Proportionality of Distributed NoSQL Data Stores.

Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

On the performance and energy efficiency of FPGAs and GPUs for polyphase channelization.

Proceedings of the 2014 International Conference on ReConFigurable Computing and FPGAs, 2014

cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU.

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Petascale Application of a Coupled CPU-GPU Algorithm for Simulation and Analysis of Multiphase Flow Solutions in Porous Medium Systems.

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Delivering Parallel Programmability to the Masses via the Intel MIC Ecosystem: A Case Study.

Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

SAIS-OPT: On the characterization and optimization of the SA-IS algorithm for suffix array construction.

Nataliya Timoshevskaya

Proceedings of the IEEE 4th International Conference on Computational Advances in Bio and Medical Sciences, 2014

SLAM: scalable locality-aware middleware for I/O in scientific analysis and visualization.

Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

Towards a performance-portable FFT library for heterogeneous computing.

Carlo C. del Mundo

Proceedings of the Computing Frontiers Conference, CF'14, 2014

Enabling Efficient Power Provisioning for Enterprise Applications.

Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

Runtime Adaptation for Autonomic Heterogeneous Computing.

Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

On the characterization of OpenCL dwarfs on fixed and reconfigurable platforms.

Muhsen Owaida

Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

Locality-aware memory association for multi-target worksharing in OpenMP.

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013

Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator.

Parallel Comput., 2013

GBench: benchmarking methodology for evaluating the energy efficiency of supercomputers.

Comput. Sci. Res. Dev., 2013

The Green500 list: escapades to exascale.

Tom Scogland

Comput. Sci. Res. Dev., 2013

Performance characterization of data-intensive kernels on AMD Fusion architectures.

Kenneth S. Lee

Comput. Sci. Res. Dev., 2013

Toward More Transparent and Reproducible Omics Studies Through a Common Metadata Checklist and Data Publications.

Courtney MacNealy-Koch

Alexey I. Nesvizhskii

Big Data, 2013

Towards energy-proportional computing for enterprise-class server workloads.

Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2013

SDAFT: a novel scalable data access framework for parallel BLAST.

Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems, 2013

Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Online Performance Projection for Clusters with Heterogeneous GPUs.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

On the Programmability and Performance of Heterogeneous Platforms.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

On the Portability of the OpenCL Dwarfs on Fixed and Reconfigurable Parallel Platforms.

Muhsen Owaida

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Wideband Channelization for Software-Defined Radio via Mobile Graphics Processors.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments.

Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

Seamless Migration of Virtual Machines across Networks.

Proceedings of the 22nd International Conference on Computer Communication and Networks, 2013

Accelerating fast Fourier Transform for wideband channelization.

Carlo C. del Mundo

Proceedings of IEEE International Conference on Communications, 2013

On the efficacy of GPU-integrated MPI for scientific applications.

John M. Mellor-Crummey

Xiaosong Ma

Rajeev Thakur

Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Consolidating Applications for Energy Efficiency in Heterogeneous Computing Systems.

Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Trends in energy-efficient computing: A perspective from the Green500.

Proceedings of the International Green Computing Conference, 2013

Cascaded TCP: Applying pipelining to TCP for efficient communication over wide-area networks.

Proceedings of the 2013 IEEE Global Communications Conference, 2013

Optimizing Burrows-Wheeler Transform-Based Sequence Alignment on Multicore Architectures.

Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012

Parallel Mining of Neuronal Spike Streams on Graphics Processing Units.

Int. J. Parallel Program., 2012

Reliable MapReduce computing on opportunistic resources.

Xiaosong Ma

Clust. Comput., 2012

Multi-dimensional characterization of electrostatic surface potential computation on graphics processors.

BMC Bioinform., 2012

OpenCL and the 13 dwarfs: a work in progress.

Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, 2012

Automatic NUMA characterization using Cbench.

Ryan K. Braithwaite

Patrick S. McCormick

Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, 2012

Poster: Cascaded TCP: BIG Throughput for BIG DATA Applications in Distributed HPC.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Cascaded TCP: BIG Throughput for BIG DATA Applications in Distributed HPC.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Generalizing the Utility of GPUs in Large-Scale Heterogeneous Computing Systems.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Heterogeneous Task Scheduling for Accelerated OpenMP.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Efficient Intranode Communication in GPU-Accelerated Systems.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Characterizing the Performance and Energy Efficiency of Simultaneous Multithreading in Multicore Environments.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Lost in Translation: Challenges in Automating CUDA-to-OpenCL Translation.

Paul Sathre

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

DMA-Assisted, Intranode Communication in GPU Accelerated Systems.

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-based Systems.

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Transparent Accelerator Migration in a Virtualized GPU Environment.

Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011

Homology to Sequence Alignment, From.

Proceedings of the Encyclopedia of Parallel Computing, 2011

Coordinating Computation and I/O in Massively Parallel Sequence Search.

IEEE Trans. Parallel Distributed Syst., 2011

Poster: characterizing the impact of memory-access techniques on AMD fusion.

Kenneth S. Lee

Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Accelerating Protein Sequence Search in a Heterogeneous Computing System.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Emerging Trends on the Evolving Green500: Year Three.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Optimizing Dynamic Programming on Graphics Processing Units via Adaptive Thread-Level Parallelism.

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-Core Architectures.

Gabriel Martinez

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

StreamMR: An Optimized MapReduce Framework for AMD GPUs.

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

Architecture-Aware Mapping and Optimization on a 1600-Core GPU.

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

AVS video decoder on multicore systems: Optimizations and tradeoffs.

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Restoring End-to-End Resilience in the Presence of Middleboxes.

Proceedings of 20th International Conference on Computer Communications and Networks, 2011

Towards accelerating molecular modeling via multi-scale approximation on a GPU.

Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

High-performance biocomputing for simulating the spread of contagion over large contact networks.

Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

Energy-efficient E-puting everywhere.

Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

Performance Characterization and Optimization of Atomic Operations on AMD GPUs.

Marwa K. Elteir

Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Bounding the effect of partition camping in GPU kernels.

Proceedings of the 8th Conference on Computing Frontiers, 2011

2010

Power saving experiments for large-scale global optimisation.

Int. J. Parallel Emergent Distributed Syst., 2010

A first look at integrated GPUs for green high-performance computing.

Comput. Sci. Res. Dev., 2010

Global-scale distributed I/O with ParaMEDIC.

Concurr. Comput. Pract. Exp., 2010

Missing genes in the annotation of prokaryotic genomes.

BMC Bioinform., 2010

Broadening accessibility to computer science for K-12 education.

Proceedings of the 15th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 2010

To GPU synchronize or not GPU synchronize?

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Inter-block GPU communication via fast barrier synchronization.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

The Green500 List: Year two.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Enhancing MapReduce via Asynchronous Data Processing.

Marwa K. Elteir

Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010

On the Goodput of TCP NewReno in Mobile Networks.

Donald W. Gillies

Proceedings of the 19th International Conference on Computer Communications and Networks, 2010

MOON: MapReduce On Opportunistic eNvironments.

Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

Understanding Power Measurement Implications in the Green500 List.

Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Statistical Power and Performance Modeling for Optimizing the Energy Efficiency of Scientific Computing.

Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Power and Performance Characterization of Computational Kernels on the GPU.

Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors.

Liqing Zhang

Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010

Towards chip-on-chip neuroscience: fast mining of neuronal spike streams using graphics hardware.

Proceedings of the 7th Conference on Computing Frontiers, 2010

2009

Towards Chip-on-Chip Neuroscience: Fast Mining of Frequent Episodes Using Graphics Processors.

CoRR, 2009

Tools and Environments for Multicore and Many-Core Architectures.

Computer, 2009

On the energy efficiency of graphics processing units for scientific computing.

Song Huang

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

The Green500 List: Year one.

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Multi-dimensional characterization of temporal data mining on graphics processors.

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

GePSeA: A General-Purpose Software Acceleration Framework for Lightweight Task Offloading.

Ajeet Singh

Proceedings of the ICPP 2009, 2009

On the Robust Mapping of Dynamic Programming onto a Graphics Processing Unit.

Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

2008

Algorithms for Integrated Routing and Scheduling for Aggregating Data from Distributed Resources on a Lambda Grid.

IEEE Trans. Parallel Distributed Syst., 2008

Green Supercomputing Comes of Age.

Xizhou Feng

Rong Ge

IT Prof., 2008

Asymmetric interactions in symmetric multi-core systems: analysis, enhancements and evaluation.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Massively parallel genomic sequence search on the Blue Gene/P architecture.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Semantics-based distributed I/O for mpiBLAST.

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

Impact of Network Sharing in Multi-Core Architectures.

Ganesh Narayanaswamy

Proceedings of the 17th International Conference on Computer Communications and Networks, 2008

Semantic-based distributed I/O with the ParaMEDIC framework.

Proceedings of the 17th International Symposium on High-Performance Distributed Computing (HPDC-17 2008), 2008

Making a Case for Proactive Flow Control in Optical Circuit-Switched Networks.

Proceedings of the High Performance Computing, 2008

Cell-SWat: modeling and scheduling wavefront computations on the cell broadband engine.

Dimitrios S. Nikolopoulos

Filip Blagojevic

Proceedings of the 5th Conference on Computing Frontiers, 2008

Achieving Edge-Based Fairness in a Multi-Hop Environment.

Mustafa Arisoylu

Proceedings of the 5th IEEE Consumer Communications and Networking Conference, 2008

Optimizing performance, cost, and sensitivity in pairwise sequence search on a cluster of PlayStations.

Proceedings of the 8th IEEE International Conference on Bioinformatics and Bioengineering, 2008

2007

High-performance computing using accelerators.

Dinesh Manocha

Parallel Comput., 2007

The Green500 List: Encouraging Sustainable Supercomputing.

Kirk W. Cameron

Computer, 2007

Analyzing the impact of supporting out-of-order communication on in-order performance with iWARP.

Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Green Supercomputing in a Desktop Box.

Avery Ching

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

A Maintainable Software Architecture for Fast and Modular Bioinformatics Sequence Search.

Jeremy S. Archuleta

Eli Tilevich

Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM 2007), 2007

CPU MISER: A Performance-Directed, Run-Time System for Power-Aware Clusters.

Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

An Analysis of 10-Gigabit Ethernet Protocol Stacks in Multicore Environments.

Ganesh Narayanaswamy

Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects, 2007

Parallel genomic sequence-search on a massively parallel system.

Proceedings of the 4th Conference on Computing Frontiers, 2007

2006

Bridging the Ethernet-Ethernot Performance Gap.

Dhabaleswar K. Panda

IEEE Micro, 2006

Grid applications - Parallel genomic sequence-searching on an ad-hoc grid: experiences, lessons learned, and implications.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Grid networks and portals - End-system aware, rate-adaptive protocol for network transport in LambdaGrid environments.

Pallab Datta

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Making a case for a Green500 list.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

RAPID: an end-system aware protocol for intelligent data transfer over lambda grids.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

When Optical Networking Meets Grid Computing?

Malathi Veeraraghavan

Proceedings of the 15th International Conference On Computer Communications and Networks, 2006

Exploring I/O Strategies for Parallel Sequence-Search Tools with S3aSim.

Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, 2006

A Feedback Mechanism for Network Scheduling in LambdaGrids.

Pallab Datta

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

2005

FAST TCP: from theory to experiments.

IEEE Netw., 2005

Anatomy of UDP and M-VIA for cluster communication.

Xiao Zhang

Laxmi N. Bhuyan

J. Parallel Distributed Comput., 2005

Analyzing MPI performance over 10-Gigabit ethernet.

Justin Gus Hurwitz

J. Parallel Distributed Comput., 2005

A Power-Aware Run-Time System for High-Performance Computing.

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Towards Efficient Supercomputing: A Quest for the Right Metric.

Jeremy S. Archuleta

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Q-Composer and CpR: a probabilistic synthesizer and regulator of traffic (a probabilistic control of buffer occupancy).

Sami Ayyorgun

Sarut Vanichpun

Proceedings of the INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies, 2005

Performance Characterization of a 10-Gigabit Ethernet TOE.

Proceedings of the 13th Annual IEEE Symposium on High Performance Interconnects (HOTIC 2005), 2005

A Feasibility Analysis of Power Awareness in Commodity-Based High-Performance Clusters.

Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

Head-to-TOE Evaluation of High-Performance Sockets over Protocol Offload Engines.

Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

2004

End-to-End Performance of 10-Gigabit Ethernet on Commodity Systems.

Justin Gus Hurwitz

IEEE Micro, 2004

User-space auto-tuning for TCP flow control in computational grids.

Comput. Commun., 2004

Effective Dynamic Voltage Scaling Through CPU-Boundedness Detection.

Proceedings of the Power-Aware Computer Systems, 4th International Workshop, 2004

Re-Architecting Flow Control Adaptation for Grid Environments.

Adam Engelhart

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

A Systematic Approach for Providing End-to-End Probabilistic QoS Guarantees.

Sami Ayyorgun

Proceedings of the International Conference On Computer Communications and Networks (ICCCN 2004), 2004

A Multimodal Interface for the Immediate Transcription of Radiology Dictation.

Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems (CBMS 2004), 2004

2003

Making a Case for Efficient Supercomputing.

ACM Queue, 2003

Scheduling and Transport for File Transfers on High-Speed Optical Circuits.

Malathi Veeraraghavan

J. Grid Comput., 2003

Automatic Flow-Control Adaptation for Enhancing Network Performance in Computational Grids.

J. Grid Comput., 2003

Optimizing 10-Gigabit Ethernet for Networks of Workstations, Clusters, and Grids: A Case Study.

Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Enabling Compatibility Between TCP Reno and TCP Vegas.

Sarut Vanichpun

Proceedings of the 2003 Symposium on Applications and the Internet (SAINT 2003), 27-31 January 2003, 2003

Green Destiny + mpiBLAST = Bioinfomagic.

Proceedings of the Parallel Computing: Software Technology, 2003

MUSE: A Software Oscilloscope for Clusters and Grids.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Optimizing GridFTP through Dynamic Right-Sizing.

[BibT_eX]

[DOI]

Proceedings of the 12th International Symposium on High-Performance Distributed Computing (HPDC-12 2003), 2003

Initial end-to-end performance evaluation of 10-Gigabit Ethernet.

[BibT_eX]

[DOI]

Justin Gus Hurwitz

Proceedings of the 11th Annual IEEE Symposium on High Performance Interconnects, 2003

An Integrated Multimedia Environment for Speech Recognition Using Handwriting and Written Gestures.

[BibT_eX]

[DOI]

Proceedings of the 36th Hawaii International Conference on System Sciences (HICSS-36 2003), 2003

MAGNET: A Tool for Debugging, Analyzing and Adapting Computing Systems.

[BibT_eX]

[DOI]

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002

Packet Spacing: An Enabling Mechanism for Delivering Multimedia Content in Computational Grids.

[BibT_eX]

[DOI]

J. Supercomput., 2002

The MAGNeT Toolkit: Design, Implementation and Evaluation.

[BibT_eX]

[DOI]

Jeffrey R. Hay

J. Supercomput., 2002

The Quadrics Network: High-Performance Clustering Technology.

[BibT_eX]

[DOI]

IEEE Micro, 2002

High-density computing: a 240-processor Beowulf in one cubic meter.

[BibT_eX]

[DOI]

Michael S. Warren

Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

Dynamic Right-Sizing: An Automated, Lightweight, and Scalable Technique for Enhancing Grid Performance.

[BibT_eX]

[DOI]

Proceedings of the Protocols for High Speed Networks, 2002

Honey, I Shrunk the Beowulf!

[BibT_eX]

[DOI]

Michael S. Warren

Proceedings of the 31st International Conference on Parallel Processing (ICPP 2002), 2002

On the transient behavior of TCP Vegas.

[BibT_eX]

[DOI]

Sarut Vanichpun

Proceedings of the 11th International Conference on Computer Communications and Networks, 2002

Using Steady-State TCP Behavior for Proactive Queue Management.

[BibT_eX]

Proceedings of the International Conference on Internet Computing, 2002

A Comparison of TCP Automatic Tuning Techniques for Distributed Computing.

[BibT_eX]

[DOI]

Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

Dynamic Right-Sizing in FTP (drsFTP): Enhancing Grid Performance in User-Space.

[BibT_eX]

[DOI]

Mike Fisk

Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

GREEN: proactive queue management over a best-effort network.

[BibT_eX]

[DOI]

Apu Kapadia

Proceedings of the Global Telecommunications Conference, 2002

The Bladed Beowulf: A Cost-Effective Alternative to Traditional Beowulf.

[BibT_eX]

[DOI]

Michael S. Warren

Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

2001

Improved resource utilization with buffered coscheduling.

[BibT_eX]

[DOI]

Parallel Algorithms Appl., 2001

Performance Evaluation of the Quadrics Interconnection Network.

[BibT_eX]

[DOI]

Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Gang Scheduling with Lightweight User-Level Communication.

[BibT_eX]

[DOI]

Proceedings of the 30th International Workshops on Parallel Processing (ICPP 2001 Workshops), 2001

The Effects of Inter-packet Spacing on the Delivery of Multimedia Content.

[BibT_eX]

[DOI]

Apu Kapadia

Annette C. Feng

Proceedings of the 21st International Conference on Distributed Computing Systems (ICDCS 2001), 2001

Dynamic right-sizing: a simulation study.

[BibT_eX]

[DOI]

Proceedings of the 10th International Conference on Computer Communications and Networks, 2001

MAGNeT: monitor for application-generated network traffic.

[BibT_eX]

[DOI]

Jeffrey R. Hay

Proceedings of the 10th International Conference on Computer Communications and Networks, 2001

A Case for TCP Vegas in High-Performance Computational Grids.

[BibT_eX]

[DOI]

Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10 2001), 2001

The Quadrics network (QsNet): high-performance clustering technology.

[BibT_eX]

[DOI]

Proceedings of the Ninth Symposium on High Performance Interconnects, 2001

Capturing Network Traffic with a MAGNeT.

[BibT_eX]

[DOI]

Jeffrey R. Hay

Proceedings of the 5th Annual Linux Showcase & Conference 2001, 2001

2000

The Failure of TCP in High-Performance Computational Grids.

[BibT_eX]

[DOI]

Peerapol Tinnakornsrisuphap

Proceedings of Supercomputing 2000, 2000

Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements.

[BibT_eX]

[DOI]

Proceedings of the Job Scheduling Strategies for Parallel Processing, IPDPS 2000 Workshop, 2000

Buffered Coscheduling: A New Methodology for Multitasking Parallel Jobs on Distributed Systems.

[BibT_eX]

[DOI]

Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

The Adverse Impact of the TCP Congestion-Control Mechanism in Heterogeneous Computing Systems.

[BibT_eX]

[DOI]

Peerapol Tinnakornsrisuphap

Proceedings of the 2000 International Conference on Parallel Processing, 2000

On the Burstiness of the TCP Congestion-Control Mechanism in a Distributed Computing System.

[BibT_eX]

[DOI]

Peerapol Tinnakornsrisuphap

Ian R. Philp

Proceedings of the 20th International Conference on Distributed Computing Systems, 2000

Scheduling with Global Information in Distributed Systems.

[BibT_eX]

[DOI]

Proceedings of the 20th International Conference on Distributed Computing Systems, 2000

1999

The Design of an Open Real-Time System Using CORBA.

[BibT_eX]

[DOI]

Proceedings of the 1999 International Conference on Parallel Processing Workshops, 1999

Dynamic Client-Side Scheduling in a Real-Time CORBA System.

[BibT_eX]

[DOI]

Proceedings of the 23rd International Computer Software and Applications Conference (COMPSAC '99), 1999

1997

Algorithms for Scheduling Real-Time Tasks with Input Error and End-to-End Deadlines.

[BibT_eX]

[DOI]