Wu-chun Feng

According to our database, Wu-chun Feng
  • authored at least 221 papers between 1989 and 2017.
  • has a "Dijkstra number" of three.

Bibliography

2017
cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on CPU+GPU.
IEEE/ACM Trans. Comput. Biology Bioinform., 2017

Parallel programming with pictures is a Snap!
J. Parallel Distrib. Comput., 2017

A runtime estimation framework for ALICE.
Future Generation Comp. Syst., 2017

Eliminating Irregularities of Protein Sequence Search on Multicore Architectures.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

PaPar: A Parallel Data Partitioning Framework for Big Data Applications.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Directive-Based Partitioning and Pipelining for Graphics Processing Units.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Characterizing and Modeling Power and Energy for Extreme-Scale In-Situ Visualization.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

A framework for fast and fair evaluation of automata processing hardware.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

AutoMatch: An automated framework for relative performance estimation and workload distribution on heterogeneous HPC systems.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017

Demystifying automata processing: GPUs, FPGAs or Micron's AP?
Proceedings of the International Conference on Supercomputing, 2017

Fast segmented sort on GPUs.
Proceedings of the International Conference on Supercomputing, 2017

Developing Dynamic Profiling and Debugging Support in OpenCL for FPGAs.
Proceedings of the 54th Annual Design Automation Conference, 2017

An Enhanced Image Reconstruction Tool for Computed Tomography on GPUs.
Proceedings of the Computing Frontiers Conference, 2017

GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs.
Proceedings of the Computing Frontiers Conference, 2017

2016
OpenDwarfs: Characterization of Dwarf-Based Benchmarks on Fixed and Reconfigurable Architectures.
Signal Processing Systems, 2016

MPI-ACC: Accelerator-Aware MPI for Scientific Applications.
IEEE Trans. Parallel Distrib. Syst., 2016

Fast Detection of Transformed Data Leaks.
IEEE Trans. Information Forensics and Security, 2016

MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL.
Parallel Computing, 2016

muBLASTP: database-indexed protein sequence search on multicore CPUs.
BMC Bioinformatics, 2016

MetaMorph: a library framework for interoperable kernels on multi- and many-core clusters.
Proceedings of the International Conference for High Performance Computing, 2016

Characterizing Performance and Power towards Efficient Synchronization of GPU Kernels.
Proceedings of the 24th IEEE International Symposium on Modeling, 2016

An automated framework for characterizing and subsetting GPGPU workloads.
Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

The Right Metric for Efficient Supercomputing: A Ten-Year Retrospective.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-Based Multi-and Many-Core Processors.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Parallel Programming with Pictures in a Snap!
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Measuring and modeling on-chip interconnect power on real hardware.
Proceedings of the 2016 IEEE International Symposium on Workload Characterization, 2016

Parallel Transposition of Sparse Data Structures.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Telescoping Architectures: Evaluating Next-Generation Heterogeneous Computing.
Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016

Bridging the Performance-Programmability Gap for FPGAs via OpenCL: A Case Study with OpenDwarfs.
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

Directive-Based Pipelining Extension for OpenMP.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

cuART: Fine-Grained Algebraic Reconstruction Technique for Computed Tomography Images on GPUs.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

Online Power Estimation of Graphics Processing Units.
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016

Multiscale Approximation with Graphical Processing Units for Multiplicative Speedup in Molecular Dynamics.
Proceedings of the 7th ACM International Conference on Bioinformatics, 2016

Bridging the FPGA programmability-portability Gap via automatic OpenCL code generation and tuning.
Proceedings of the 27th IEEE International Conference on Application-specific Systems, 2016

O3FA: A Scalable Finite Automata-based Pattern-Matching Engine for Out-of-Order Deep Packet Inspection.
Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems, 2016

2015
CoreTSAR: Core Task-Size Adapting Runtime.
IEEE Trans. Parallel Distrib. Syst., 2015

Accelerating Bioinformatics Applications via Emerging Parallel Computing Systems.
IEEE/ACM Trans. Comput. Biology Bioinform., 2015

On the Energy Proportionality of Scale-Out Workloads.
CoRR, 2015

Towards Energy-Proportional Computing Using Subsystem-Level Power Management.
CoRR, 2015

Design and Evaluation of Scalable Concurrent Queues for Many-Core Architectures.
Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015

On the Performance, Energy, and Power of Data-Access Methods in Heterogeneous Computing Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

HPPAC Introduction and Committees.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

On the Greenness of In-Situ and Post-Processing Visualization Pipelines.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Rapid and parallel content screening for detecting transformed data exposure.
Proceedings of the 2015 IEEE Conference on Computer Communications Workshops, 2015

ASPaS: A Framework for Automatic SIMDization of Parallel Sorting on x86-based Many-core Processors.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

GLAF: A Visual Programming and Auto-tuning Framework for Parallel Computing.
Proceedings of the 44th International Conference on Parallel Processing, 2015

pDindel: Accelerating indel detection on a multicore CPU architecture with SIMD.
Proceedings of the 5th IEEE International Conference on Computational Advances in Bio and Medical Sciences, 2015

Rapid Screening of Transformed Data Leaks with Efficient Algorithms and Parallel Computing.
Proceedings of the 5th ACM Conference on Data and Application Security and Privacy, 2015

Automatic Command Queue Scheduling for Task-Parallel Workloads in OpenCL.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014
SDAFT: A novel scalable data access framework for parallel BLAST.
Parallel Computing, 2014

A power-measurement methodology for large-scale, high-performance computing.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

Aeromancer: A Workflow Manager for Large-Scale MapReduce-Based Scientific Workflows.
Proceedings of the 13th IEEE International Conference on Trust, 2014

CoreTSAR: Adaptive Worksharing for Heterogeneous Systems.
Proceedings of the Supercomputing - 29th International Conference, 2014

On the Energy Proportionality of Distributed NoSQL Data Stores.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

On the performance and energy efficiency of FPGAs and GPUs for polyphase channelization.
Proceedings of the 2014 International Conference on ReConFigurable Computing and FPGAs, 2014

cuBLASTP: Fine-Grained Parallelization of Protein Sequence Search on a GPU.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Petascale Application of a Coupled CPU-GPU Algorithm for Simulation and Analysis of Multiphase Flow Solutions in Porous Medium Systems.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Delivering Parallel Programmability to the Masses via the Intel MIC Ecosystem: A Case Study.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

SAIS-OPT: On the characterization and optimization of the SA-IS algorithm for suffix array construction.
Proceedings of the IEEE 4th International Conference on Computational Advances in Bio and Medical Sciences, 2014

SLAM: scalable locality-aware middleware for I/O in scientific analysis and visualization.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

Towards a performance-portable FFT library for heterogeneous computing.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

Enabling Efficient Power Provisioning for Enterprise Applications.
Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

Runtime Adaptation for Autonomic Heterogeneous Computing.
Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

On the characterization of OpenCL dwarfs on fixed and reconfigurable platforms.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

Locality-aware memory association for multi-target worksharing in OpenMP.
Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013
Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator.
Parallel Computing, 2013

GBench: benchmarking methodology for evaluating the energy efficiency of supercomputers.
Computer Science - R&D, 2013

The Green500 list: escapades to exascale.
Computer Science - R&D, 2013

Performance characterization of data-intensive kernels on AMD Fusion architectures.
Computer Science - R&D, 2013

Towards energy-proportional computing for enterprise-class server workloads.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2013

SDAFT: a novel scalable data access framework for parallel BLAST.
Proceedings of the 2013 International Workshop on Data-Intensive Scalable Computing Systems, 2013

Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Online Performance Projection for Clusters with Heterogeneous GPUs.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

On the Programmability and Performance of Heterogeneous Platforms.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

On the Portability of the OpenCL Dwarfs on Fixed and Reconfigurable Parallel Platforms.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Wideband Channelization for Software-Defined Radio via Mobile Graphics Processors.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments.
Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

Seamless Migration of Virtual Machines across Networks.
Proceedings of the 22nd International Conference on Computer Communication and Networks, 2013

Accelerating fast Fourier Transform for wideband channelization.
Proceedings of IEEE International Conference on Communications, 2013

On the efficacy of GPU-integrated MPI for scientific applications.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Consolidating Applications for Energy Efficiency in Heterogeneous Computing Systems.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

Trends in energy-efficient computing: A perspective from the Green500.
Proceedings of the International Green Computing Conference, 2013

Cascaded TCP: Applying pipelining to TCP for efficient communication over wide-area networks.
Proceedings of the 2013 IEEE Global Communications Conference, 2013

Optimizing Burrows-Wheeler Transform-Based Sequence Alignment on Multicore Architectures.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Parallel Mining of Neuronal Spike Streams on Graphics Processing Units.
International Journal of Parallel Programming, 2012

Reliable MapReduce computing on opportunistic resources.
Cluster Computing, 2012

Multi-dimensional characterization of electrostatic surface potential computation on graphics processors.
BMC Bioinformatics, 2012

OpenCL and the 13 dwarfs: a work in progress.
Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, 2012

Automatic NUMA characterization using Cbench.
Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, 2012

Poster: Cascaded TCP: BIG Throughput for BIG DATA Applications in Distributed HPC.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Cascaded TCP: BIG Throughput for BIG DATA Applications in Distributed HPC.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Generalizing the Utility of GPUs in Large-Scale Heterogeneous Computing Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Heterogeneous Task Scheduling for Accelerated OpenMP.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Efficient Intranode Communication in GPU-Accelerated Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Characterizing the Performance and Energy Efficiency of Simultaneous Multithreading in Multicore Environments.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Lost in Translation: Challenges in Automating CUDA-to-OpenCL Translation.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

DMA-Assisted, Intranode Communication in GPU Accelerated Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-based Systems.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Transparent Accelerator Migration in a Virtualized GPU Environment.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011
Homology to Sequence Alignment, From.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Coordinating Computation and I/O in Massively Parallel Sequence Search.
IEEE Trans. Parallel Distrib. Syst., 2011

Poster: characterizing the impact of memory-access techniques on AMD fusion.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Accelerating Protein Sequence Search in a Heterogeneous Computing System.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Emerging Trends on the Evolving Green500: Year Three.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Optimizing Dynamic Programming on Graphics Processing Units via Adaptive Thread-Level Parallelism.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-Core Architectures.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

StreamMR: An Optimized MapReduce Framework for AMD GPUs.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

Architecture-Aware Mapping and Optimization on a 1600-Core GPU.
Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

AVS video decoder on multicore systems: Optimizations and tradeoffs.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Restoring End-to-End Resilience in the Presence of Middleboxes.
Proceedings of 20th International Conference on Computer Communications and Networks, 2011

Towards accelerating molecular modeling via multi-scale approximation on a GPU.
Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

High-performance biocomputing for simulating the spread of contagion over large contact networks.
Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

Energy-efficient E-puting everywhere.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

Performance Characterization and Optimization of Atomic Operations on AMD GPUs.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Bounding the effect of partition camping in GPU kernels.
Proceedings of the 8th Conference on Computing Frontiers, 2011

2010
Power saving experiments for large-scale global optimisation.
IJPEDS, 2010

A first look at integrated GPUs for green high-performance computing.
Computer Science - R&D, 2010

Global-scale distributed I/O with ParaMEDIC.
Concurrency and Computation: Practice and Experience, 2010

Missing genes in the annotation of prokaryotic genomes.
BMC Bioinformatics, 2010

Broadening accessibility to computer science for K-12 education.
Proceedings of the 15th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 2010

To GPU synchronize or not GPU synchronize?
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Inter-block GPU communication via fast barrier synchronization.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

The Green500 List: Year two.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Enhancing MapReduce via Asynchronous Data Processing.
Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems, 2010

On the Goodput of TCP NewReno in Mobile Networks.
Proceedings of the 19th International Conference on Computer Communications and Networks, 2010

MOON: MapReduce On Opportunistic eNvironments.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

Understanding Power Measurement Implications in the Green500 List.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Statistical Power and Performance Modeling for Optimizing the Energy Efficiency of Scientific Computing.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

Power and Performance Characterization of Computational Kernels on the GPU.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors.
Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010

Towards chip-on-chip neuroscience: fast mining of neuronal spike streams using graphics hardware.
Proceedings of the 7th Conference on Computing Frontiers, 2010

2009
Towards Chip-on-Chip Neuroscience: Fast Mining of Frequent Episodes Using Graphics Processors.
CoRR, 2009

Tools and Environments for Multicore and Many-Core Architectures.
IEEE Computer, 2009

On the energy efficiency of graphics processing units for scientific computing.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

The Green500 List: Year one.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Multi-dimensional characterization of temporal data mining on graphics processors.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

GePSeA: A General-Purpose Software Acceleration Framework for Lightweight Task Offloading.
Proceedings of the ICPP 2009, 2009

On the Robust Mapping of Dynamic Programming onto a Graphics Processing Unit.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

2008
Algorithms for Integrated Routing and Scheduling for Aggregating Data from Distributed Resources on a Lambda Grid.
IEEE Trans. Parallel Distrib. Syst., 2008

Green Supercomputing Comes of Age.
IT Professional, 2008

Asymmetric interactions in symmetric multi-core systems: analysis, enhancements and evaluation.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Massively parallel genomic sequence search on the Blue Gene/P architecture.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Semantics-based distributed I/O for mpiBLAST.
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2008

Impact of Network Sharing in Multi-Core Architectures.
Proceedings of the 17th International Conference on Computer Communications and Networks, 2008

Semantic-based distributed i/o with the paramedic framework.
Proceedings of the 17th International Symposium on High-Performance Distributed Computing (HPDC-17 2008), 2008

Making a Case for Proactive Flow Control in Optical Circuit-Switched Networks.
Proceedings of the High Performance Computing, 2008

Cell-SWat: modeling and scheduling wavefront computations on the cell broadband engine.
Proceedings of the 5th Conference on Computing Frontiers, 2008

Optimizing performance, cost, and sensitivity in pairwise sequence search on a cluster of PlayStations.
Proceedings of the 8th IEEE International Conference on Bioinformatics and Bioengineering, 2008

2007
High-performance computing using accelerators.
Parallel Computing, 2007

The Green500 List: Encouraging Sustainable Supercomputing.
IEEE Computer, 2007

Analyzing the impact of supporting out-of-order communication on in-order performance with iWARP.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Green Supercomputing in a Desktop Box.
Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

A Maintainable Software Architecture for Fast and Modular Bioinformatics Sequence Search.
Proceedings of the 23rd IEEE International Conference on Software Maintenance (ICSM 2007), 2007

CPU MISER: A Performance-Directed, Run-Time System for Power-Aware Clusters.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

An Analysis of 10-Gigabit Ethernet Protocol Stacks in Multicore Environments.
Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects, 2007

Parallel genomic sequence-search on a massively parallel system.
Proceedings of the 4th Conference on Computing Frontiers, 2007

2006
Bridging the Ethernet-Ethernot Performance Gap.
IEEE Micro, 2006

Grid applications - Parallel genomic sequence-searching on an ad-hoc grid: experiences, lessons learned, and implications.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Grid networks and portals - End-system aware, rate-adaptive protocol for network transport in LambdaGrid environments.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Making a case for a Green500 list.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

RAPID: an end-system aware protocol for intelligent data transfer over lambda grids.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

When Optical Networking Meets Grid Computing?
Proceedings of the 15th International Conference On Computer Communications and Networks, 2006

Exploring I/O Strategies for Parallel Sequence-Search Tools with S3aSim.
Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, 2006

A Feedback Mechanism for Network Scheduling in LambdaGrids.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

2005
FAST TCP: from theory to experiments.
IEEE Network, 2005

Anatomy of UDP and M-VIA for cluster communication.
J. Parallel Distrib. Comput., 2005

Analyzing MPI performance over 10-Gigabit ethernet.
J. Parallel Distrib. Comput., 2005

A Power-Aware Run-Time System for High-Performance Computing.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Towards Efficient Supercomputing: A Quest for the Right Metric.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Q-Composer and CpR: a probabilistic synthesizer and regulator of traffic (a probabilistic control of buffer occupancy).
Proceedings of the INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies, 2005

Performance Characterization of a 10-Gigabit Ethernet TOE.
Proceedings of the 13th Annual IEEE Symposium on High Performance Interconnects (HOTIC 2005), 2005

A Feasibility Analysis of Power Awareness in Commodity-Based High-Performance Clusters.
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

Head-to-TOE Evaluation of High-Performance Sockets over Protocol Offload Engines.
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005

2004
End-to-End Performance of 10-Gigabit Ethernet on Commodity Systems.
IEEE Micro, 2004

User-space auto-tuning for TCP flow control in computational grids.
Computer Communications, 2004

Effective Dynamic Voltage Scaling Through CPU-Boundedness Detection.
Proceedings of the Power-Aware Computer Systems, 4th International Workshop, 2004

Re-Architecting Flow Control Adaptation for Grid Environments.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

A Systematic Approach for Providing End-to-End Probabilistic QoS Guarantees.
Proceedings of the International Conference On Computer Communications and Networks (ICCCN 2004), 2004

A Multimodal Interface for the Immediate Transcription of Radiology Dictation.
Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems (CBMS 2004), 2004

2003
Making a Case for Efficient Supercomputing.
ACM Queue, 2003

Scheduling and Transport for File Transfers on High-Speed Optical Circuits.
J. Grid Comput., 2003

Automatic Flow-Control Adaptation for Enhancing Network Performance in Computational Grids.
J. Grid Comput., 2003

Optimizing 10-Gigabit Ethernet for Networks of Workstations, Clusters, and Grids: A Case Study.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Enabling Compatibility Between TCP Reno and TCP Vegas.
Proceedings of the 2003 Symposium on Applications and the Internet (SAINT 2003), 27-31 January 2003, 2003

MUSE: A Software Oscilloscope for Clusters and Grids.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Optimizing GridFTP through Dynamic Right-Sizing.
Proceedings of the 12th International Symposium on High-Performance Distributed Computing (HPDC-12 2003), 2003

Initial end-to-end performance evaluation of 10-Gigabit Ethernet.
Proceedings of the 11th Annual IEEE Symposium on High Performance Interconnects, 2003

An Integrated Multimedia Environment for Speech Recognition Using Handwriting and Written Gestures.
Proceedings of the 36th Hawaii International Conference on System Sciences (HICSS-36 2003), 2003

MAGNET: A Tool for Debugging, Analyzing and Adapting Computing Systems.
Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002
Packet Spacing: An Enabling Mechanism for Delivering Multimedia Content in Computational Grids.
The Journal of Supercomputing, 2002

The MAGNeT Toolkit: Design, Implementation and Evaluation.
The Journal of Supercomputing, 2002

The Quadrics Network: High-Performance Clustering Technology.
IEEE Micro, 2002

High-density computing: a 240-processor Beowulf in one cubic meter.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

Dynamic Right-Sizing: An Automated, Lightweight, and Scalable Technique for Enhancing Grid Performance.
Proceedings of the Protocols for High Speed Networks, 2002

Honey, I Shrunk the Beowulf!
Proceedings of the 31st International Conference on Parallel Processing (ICPP 2002), 2002

On the transient behavior of TCP Vegas.
Proceedings of the 11th International Conference on Computer Communications and Networks, 2002

Using Steady-State TCP Behavior for Proactive Queue Management.
Proceedings of the International Conference on Internet Computing, 2002

A Comparison of TCP Automatic Tuning Techniques for Distributed Computing.
Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

Dynamic Right-Sizing in FTP (drsFTP): Enhancing Grid Performance in User-Space.
Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

GREEN: proactive queue management over a best-effort network.
Proceedings of the Global Telecommunications Conference, 2002

The Bladed Beowulf: A Cost-Effective Alternative to Traditional Beowulf.
Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

2001
Improved resource utilization with buffered coscheduling.
Parallel Algorithms Appl., 2001

Performance Evaluation of the Quadrics Interconnection Network.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Gang Scheduling with Lightweight User-Level Communication.
Proceedings of the 30th International Workshops on Parallel Processing (ICPP 2001 Workshops), 2001

The Effects of Inter-packet Spacing on the Delivery of Multimedia Content.
Proceedings of the 21st International Conference on Distributed Computing Systems (ICDCS 2001), 2001

Dynamic right-sizing: a simulation study.
Proceedings of the 10th International Conference on Computer Communications and Networks, 2001

MAGNeT: monitor for application-generated network traffic.
Proceedings of the 10th International Conference on Computer Communications and Networks, 2001

A Case for TCP Vegas in High-Performance Computational Grids.
Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10 2001), 2001

The Quadrics network (QsNet): high-performance clustering technology.
Proceedings of the Ninth Symposium on High Performance Interconnects, 2001

Capturing Network Traffic with a MAGNeT.
Proceedings of the 5th Annual Linux Showcase & Conference 2001, 2001

2000
The Failure of TCP in High-Performance Computational Grids.
Proceedings of Supercomputing 2000, 2000

Time-Sharing Parallel Jobs in the Presence of Multiple Resource Requirements.
Proceedings of the Job Scheduling Strategies for Parallel Processing, IPDPS 2000 Workshop, 2000

Buffered Coscheduling: A New Methodology for Multitasking Parallel Jobs on Distributed Systems.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

The Adverse Impact of the TCP Congestion-Control Mechanism in Heterogeneous Computing Systems.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

On the Burstiness of the TCP Congestion-Control Mechanism in a Distributed Computing System.
Proceedings of the 20th International Conference on Distributed Computing Systems, 2000

Scheduling with Global Information in Distributed Systems.
Proceedings of the 20th International Conference on Distributed Computing Systems, 2000

1999
The Design of an Open Real-Time System Using CORBA.
Proceedings of the 1999 International Conference on Parallel Processing Workshops, 1999

1997
Algorithms for Scheduling Real-Time Tasks with Input Error and End-to-End Deadlines.
IEEE Trans. Software Eng., 1997

1989
Map Data Processing in Geographic Information Systems.
IEEE Computer, 1989
