Laxmi N. Bhuyan

Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

Swan: a two-step power management for distributed search engines.

Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

SAOU: safe adaptive overclocking and undervolting for energy-efficient GPU computing.

Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

Slumber: static-power management for GPGPU register files.

Proceedings of the ISLPED '20: ACM/IEEE International Symposium on Low Power Electronics and Design, 2020

2019

P4NFV: P4 Enabled NFV Systems with SmartNICs.

Proceedings of the IEEE Conference on Network Function Virtualization and Software Defined Networks, 2019

GreenMM: energy efficient GPU matrix multiplication through undervolting.

Proceedings of the ACM International Conference on Supercomputing, 2019

Goldilocks: Adaptive Resource Provisioning in Containerized Data Centers.

Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

μDPM: Dynamic Power Management for the Microsecond Era.

Daniel Wong

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

DREAM: DistRibuted Energy-Aware traffic Management for Data Center Networks.

Proceedings of the Tenth ACM International Conference on Future Energy Systems, 2019

2018

Juggler: a dependence-aware task-based execution framework for GPUs.

Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Joint Server and Network Energy Saving in Data Centers for Latency-Sensitive Applications.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

CAMO: A novel cache management organization for GPGPUs.

Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017

Enabling Work-Efficiency for High Performance Vertex-Centric Graph Analytics on GPUs.

Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms, 2017

Wireframe: supporting data-dependent parallelism through dependency graph execution in GPUs.

AmirAli Abdolrashidi

Devashree Tripathy

Mehmet Esat Belviranli

Laxmi Narayan Bhuyan

Daniel Wong

Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

TailCut: Power Reduction under Quality and Latency Constraints in Distributed Search Systems.

Shaolei Ren

Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

2016

Tumbler: An Effective Load-Balancing Technique for Multi-CPU Multicore Systems.

ACM Trans. Archit. Code Optim., 2016

GreenLA: green linear algebra software for GPU-accelerated heterogeneous computing.

Proceedings of the International Conference for High Performance Computing, 2016

DynSleep: Fine-grained Power Management for a Latency-Critical Data Center Application.

Daniel Wong

Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016

Eliminating Intra-Warp Load Imbalance in Irregular Nested Patterns via Collaborative Task Engagement.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs.

Proceedings of the 2016 International Conference on Supercomputing, 2016

2015

Design and analysis of collaborative EPC and RAN caching for LTE mobile networks.

Comput. Networks, 2015

Efficient warp execution in presence of divergence with collaborative context collection.

Farzad Khorasani

Proceedings of the 48th International Symposium on Microarchitecture, 2015

PeerWave: Exploiting Wavefront Parallelism on GPUs with Peer-SM Synchronization.

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

A multicore vacation scheme for thermal-aware packet processing.

Proceedings of the 33rd IEEE International Conference on Computer Design, 2015

Scalable SIMD-Efficient Graph Processing on GPUs.

Farzad Khorasani

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

Stadium Hashing: Scalable and Flexible Hashing on GPUs.

Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014

Lock contention aware thread migrations.

Laxmi Narayan Bhuyan

Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014

LightPlay: Efficient Replay with GPUs.

Proceedings of the Languages and Compilers for Parallel Computing, 2014

Optimistic Parallelism on GPUs.

Proceedings of the Languages and Compilers for Parallel Computing, 2014

fAHRW+: Fairness-aware and locality-enhanced scheduling for multi-server systems.

Qin Liu

Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

A scalable hash scheduler for decoding of multiple H.264/AVC streams on multi-core architecture.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

An efficient dynamic scheduling scheme for H.264/AVC encoding on multi-core architecture.

Jeremy Castillo

Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

CuSha: vertex-centric graph processing on GPUs.

Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

A paradigm shift in GP-GPU computing: task based execution of applications with dynamic data dependencies.

Proceedings of the DIDC'14, 2014

Thermal-aware vacation and rate adaptation for network packet processing.

Proceedings of the Tenth ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2014

Shuffling: a framework for lock contention aware thread scheduling for multicore multiprocessor systems.

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013

ADAPT: A framework for coscheduling multithreaded programs.

ACM Trans. Archit. Code Optim., 2013

A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures.

Mehmet E. Belviranli

ACM Trans. Archit. Code Optim., 2013

A hybrid shared memory heterogeneous execution platform for PCIe-based GPGPUs.

Sambit Kumar Shukla

Proceedings of the 20th Annual International Conference on High Performance Computing, 2013

Shared memory heterogeneous computation on PCIe-supported platforms.

Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013

Thermal prediction and scheduling of network applications on multicore processors.

Mehmet E. Belviranli

Proceedings of the Symposium on Architecture for Networking and Communications Systems, 2013

2012

Maintaining Data Consistency in Structured P2P Systems.

IEEE Trans. Parallel Distributed Syst., 2012

An Efficient Parallelized L7-Filter Design for Multicore Servers.

Bin Liu

IEEE/ACM Trans. Netw., 2012

Load-Balancing Multipath Switching System with Flow Slice.

IEEE Trans. Computers, 2012

Thread Tranquilizer: Dynamically reducing performance variation.

ACM Trans. Archit. Code Optim., 2012

Analyzing performance and power efficiency of network processing over 10 GbE.

J. Parallel Distributed Comput., 2012

Peer-to-peer indirect reciprocity via personal currency.

J. Parallel Distributed Comput., 2012

P2P consistency support for large-scale interactive applications.

Comput. Networks, 2012

Speculative parallelization on GPGPUs.

Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

Improving the throughput and delay performance of network processors by applying push model.

Proceedings of the 20th IEEE International Workshop on Quality of Service, 2012

An efficient dynamic multiple-candidate motion vector approach for GPU-based hierarchical motion estimation.

Yang Yang

Proceedings of the 31st IEEE International Performance Computing and Communications Conference, 2012

An Adaptive Dynamic Scheduling Scheme for H.264/AVC Decoding on Multicore Architecture.

Proceedings of the 2012 IEEE International Conference on Multimedia and Expo, 2012

Traffic-aware power optimization for network applications on multicore servers.

Raymond Klefstad

Proceedings of the 49th Annual Design Automation Conference 2012, 2012

2011

A QoS aware multicore hash scheduler for network applications.

Proceedings of the INFOCOM 2011. 30th IEEE International Conference on Computer Communications, 2011

Thread reinforcer: Dynamically determining number of threads via OS level monitoring.

Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

A new server I/O architecture for high speed networks.

Xia Zhu

Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

E-AHRW: An Energy-Efficient Adaptive Hash Scheduler for Stream Processing on Multi-core Servers.

Proceedings of the 2011 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2011

Predictive Model-Based Thermal Management for Network Applications.

Proceedings of the 2011 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2011

No More Backstabbing... A Faithful Scheduling Policy for Multithreaded Programs.

Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010

Performance characterization of multi-thread and multi-core processors based XML application oriented networking systems.

J. Parallel Distributed Comput., 2010

Optimizing Throughput and Latency under Given Power Budget for Network Packet Processing.

Proceedings of the INFOCOM 2010. 29th IEEE International Conference on Computer Communications, 2010

A Balanced Consistency Maintenance Protocol for Structured P2P Systems.

Proceedings of the INFOCOM 2010. 29th IEEE International Conference on Computer Communications, 2010

Understanding Power Efficiency of TCP/IP Packet Processing over 10GbE.

Proceedings of the IEEE 18th Annual Symposium on High Performance Interconnects, 2010

Experience on Applying Push Model to Packet Processors in High Performance Routers.

Proceedings of the Global Communications Conference, 2010

A new IP lookup cache for high performance IP routers.

Heeyeol Yu

Proceedings of the 47th Design Automation Conference, 2010

LATA: a latency and throughput-aware packet processing system.

Proceedings of the 47th Design Automation Conference, 2010

A new TCB cache to efficiently manage TCP sessions for web servers.

Proceedings of the 2010 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2010

Power optimization for multimedia transcoding on multicore servers.

Proceedings of the 2010 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2010

2009

Editorial: EIC Farewell and New EIC Introduction.

IEEE Trans. Parallel Distributed Syst., 2009

Editor's Note.

IEEE Trans. Parallel Distributed Syst., 2009

Budget-Based Self-Optimized Incentive Search in Unstructured P2P Networks.

Proceedings of the INFOCOM 2009. 28th IEEE International Conference on Computer Communications, 2009

Performance characterization and cache-aware core scheduling in a virtualized multi-core server under 10GbE.

Proceedings of the 2009 IEEE International Symposium on Workload Characterization, 2009

A Hash-based Scalable IP lookup using Bloom and Fingerprint Filters.

Heeyeol Yu

Rabi N. Mahapatra

Proceedings of the 17th Annual IEEE International Conference on Network Protocols, 2009

Performance Measurement of an Integrated NIC Architecture with 10GbE.

Proceedings of the 17th IEEE Symposium on High Performance Interconnects, 2009

EINIC: an architecture for high bandwidth network I/O on multi-core processors.

Proceedings of the 2009 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2009

An adaptive hash-based multilayer scheduler for L7-filter on a highly threaded hierarchical multi-core server.

Proceedings of the 2009 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2009

2008

Ordered Round-Robin: An Efficient Sequence Preserving Packet Scheduler.

IEEE Trans. Computers, 2008

Fair link striping with FIFO delivery on heterogeneous channels.

Comput. Commun., 2008

POND: The Power of Zone Overlapping in DHT Networks.

Jizhong Han

Proceedings of the 2008 IEEE International Conference on Networking, 2008

Performance Characterization of a Dual Quad-Core Based Application Oriented Networking System.

Proceedings of the 2008 IEEE International Conference on Networking, 2008

An effective pointer replication algorithm in P2P networks.

Jian Zhou

Anirban Banerjee

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

PROD: Relayed file retrieving in overlay networks.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Cyber-Fraud is One Typo Away.

Proceedings of the INFOCOM 2008. 27th IEEE International Conference on Computer Communications, 2008

Quantum-Adaptive Scheduling for Multi-Core Network Processors.

Proceedings of the 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), 2008

A Novel Service-Aware Message Scheduler for Cisco Application Oriented Networking Systems.

Proceedings of the 17th International Conference on Computer Communications and Networks, 2008

Intelligent Message Scheduling in Application Oriented Networking Systems.

Proceedings of IEEE International Conference on Communications, 2008

Revisiting the Cache Effect on Multicore Multithreaded Network Processors.

Proceedings of the 11th Euromicro Conference on Digital System Design: Architectures, 2008

Software techniques to improve virtualized I/O performance on multi-core systems.

Proceedings of the 2008 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2008

A scalable multithreaded L7-filter design for multi-core servers.

Proceedings of the 2008 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2008

2007

Hardware Support for Accelerating Data Movement in Server Platform.

IEEE Trans. Computers, 2007

Conserving network processor power consumption by exploiting traffic variability.

ACM Trans. Archit. Code Optim., 2007

The P2P War: Someone Is Monitoring Your Activities!

Anirban Banerjee

Michalis Faloutsos

Proceedings of the NETWORKING 2007. Ad Hoc and Sensor Networks, 2007

Scalable and Decentralized Content-Aware Dispatching in Web Clusters.

Jizhong Han

Proceedings of the 26th IEEE International Performance Computing and Communications Conference, 2007

Adaptive Max-Min Fair Scheduling in Buffered Crossbar Switches Without Speedup.

Proceedings of the INFOCOM 2007. 26th IEEE International Conference on Computer Communications, 2007

Lexicographic Fairness in WDM Optical Cross-Connects.

Proceedings of the INFOCOM 2007. 26th IEEE International Conference on Computer Communications, 2007

Clustered K-Center: Effective Replica Placement in Peer-to-Peer Systems.

Proceedings of the Global Communications Conference, 2007

Program Mapping onto Network Processors by Recursive Bipartitioning and Refining.

Proceedings of the 44th Design Automation Conference, 2007

Flow-slice: a novel load-balancing scheme for multi-path switching systems.

Proceedings of the 2007 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2007

Compiling PCRE to FPGA for accelerating SNORT IDS.

Abhishek Mitra

Walid A. Najjar

Proceedings of the 2007 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2007

2006

Load Balancing in a Cluster-Based Web Server for Multimedia Applications.

IEEE Trans. Parallel Distributed Syst., 2006

Editorial: A Message from the New Editor-in-Chief.

IEEE Trans. Parallel Distributed Syst., 2006

Tulip: A New Hash Based Cooperative Web Caching Architecture.

Yiming Hu

J. Supercomput., 2006

A Network Processor-Based, Content-Aware Switch.

IEEE Micro, 2006

Application Oriented Networking (AON): Adding Intelligence to Next-Generation Internet Routers.

Proceedings of the Wireless Algorithms, 2006

Computing Real Time Jobs in P2P Networks.

Jian Zhou

Proceedings of the LCN 2006, 2006

Fair Scheduling over multiple servers with flow-dependent server rate.

Proceedings of the LCN 2006, 2006

Efficient server cooperation mechanism in content delivery network.

Yiming Hu

Proceedings of the 25th IEEE International Performance Computing and Communications Conference, 2006

Effective Load Balancing in P2P Systems.

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

2005

EaseCAM: An Energy and Storage Efficient TCAM-Based Router Architecture for IP Lookup.

V. C. Ravikumar

Rabi N. Mahapatra

IEEE Trans. Computers, 2005

Anatomy of UDP and M-VIA for cluster communication.

Wu-chun Feng

J. Parallel Distributed Comput., 2005

An Experimental Evaluation of the HP V-Class and SGI Origin 2000 Multiprocessors using Microbenchmarks and Scientific Applications.

Int. J. Parallel Program., 2005

Anatomy and Performance of SSL Processing.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005

QoS Aware Job Scheduling in a Cluster-Based Web Server for Multimedia Applications.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Efficient file sharing strategy in DHT based P2P systems.

Zhiyong Xu

Xubin He

Proceedings of the 24th IEEE International Performance Computing and Communications Conference, 2005

An efficient packet scheduling algorithm in network processors.

Proceedings of the INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies, 2005

Hardware Support for Bulk Data Movement in Server Platforms.

Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

On fair scheduling in heterogeneous link aggregated services.

Proceedings of the 14th International Conference On Computer Communications and Networks, 2005

Design and Implementation of a Content-Aware Switch Using a Network Processor.

Proceedings of the 13th Annual IEEE Symposium on High Performance Interconnects (HOTIC 2005), 2005

Performance Characterization of a 10-Gigabit Ethernet TOE.

Proceedings of the 13th Annual IEEE Symposium on High Performance Interconnects (HOTIC 2005), 2005

Enhancing Network Processor Simulation Speed with Statistical Input Sampling.

Proceedings of the High Performance Embedded Architectures and Compilers, 2005

Achieving fairness and throughput for best-effort traffic in input-queued crossbar switches.

Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005

Optimal network processor topologies for efficient packet processing.

Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005

Distributed packet processing in P2P networks.

Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005

QoS-aware object replica placement in CDNs.

Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005

Guaranteed smooth switch scheduling with low complexity.

Proceedings of the Global Telecommunications Conference, 2005. GLOBECOM '05, St. Louis, Missouri, USA, 28 November, 2005

Low power network processor design using clock gating.

Proceedings of the 42nd Design Automation Conference, 2005

SpliceNP: a TCP splicer using a network processor.

Proceedings of the 2005 ACM/IEEE Symposium on Architecture for Networking and Communications Systems, 2005

2004

NePSim: A Network Processor Simulator with a Power Evaluation Framework.

IEEE Micro, 2004

Assertion Based Verification and Analysis of Network Processor Architectures.

Des. Autom. Embed. Syst., 2004

An Efficient and Robust Web Caching System.

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Exploiting Client Cache: A Scalable and Efficient Approach to Build Large Web Cache.

Yiming Hu

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Load Balancing of DNS-Based Distributed Web Server Systems with Page Caching.

Zhong Xu

Rong Huang

Proceedings of the 10th International Conference on Parallel and Distributed Systems, 2004

An efficient scheduling algorithm for combined input-crosspoint-queued (CICQ) switches.

Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004

Scheduling real-time multimedia tasks in network processors.

Proceedings of the Global Telecommunications Conference, 2004. GLOBECOM '04, Dallas, Texas, USA, 29 November, 2004

Utilizing Formal Assertions for System Design of Network Processors.

Proceedings of the 2004 Design, 2004

2003

Shared memory multiprocessor architectures for software IP routers.

Yan Luo

Laxmi Narayan Bhuyan

Xi Chen

IEEE Trans. Parallel Distributed Syst., 2003

Switch MSHR: A Technique to Reduce Remote Read Memory Access Time in CC-NUMA Multiprocessors.

Hu-Jun Wang

IEEE Trans. Computers, 2003

Deficit round-robin scheduling for input-queued switches.

IEEE J. Sel. Areas Commun., 2003

Load Sharing in a Transcoding Cluster.

Proceedings of the Distributed Computing, 2003

A Cluster-Based Active Router Architecture Supporting Video/Audio Stream Transcoding Service.

Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Architectural analysis and instruction-set optimization for design of network protocol processors.

Haiyong Xie

Li Zhao

Proceedings of the 1st IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, 2003

Power efficient encoding techniques for off-chip data buses.

Proceedings of the International Conference on Compilers, 2003

2002

Fair Scheduling in Internet Routers.

IEEE Trans. Computers, 2002

Design and analysis of static memory management policies for CC-NUMA multiprocessors.

Ravishankar R. Iyer

Hu-Jun Wang

J. Syst. Archit., 2002

Comparing the Memory System Performance of DSS Workloads on the HP V-Class and SGI Origin 2000.

Rong Yu

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Fair Scheduling and Buffer Management in Internet Routers.

Proceedings of IEEE INFOCOM 2002, 2002

2001

Execution-Driven Simulation of IP Router Architectures.

Hu-Jun Wang

Proceedings of the IEEE International Symposium on Network Computing and Applications (NCA 2001), 2001

Fair Scheduling for Input Buffered Switches.

Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

2000

Impact of CC-NUMA Memory Management Policies on the Application Performance of Multistage Switching Networks.

IEEE Trans. Parallel Distributed Syst., 2000

Design and Evaluation of a Switch Cache Architecture for CC-NUMA Multiprocessors.

IEEE Trans. Computers, 2000

Exploring the Switch Design Space in a CC-NUMA Multiprocessor Environment.

Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

Using Switch Directories to Speed Up Cache-to-Cache Transfers in CC-NUMA Multiprocessors.

Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

Hardware spatial forwarding for widely shared data.

Proceedings of the 14th International Conference on Supercomputing, 2000

Hierarchical Simulation of a Multiprocessor Architecture.

Rabi N. Mahapatra

Proceedings of the IEEE International Conference On Computer Design: VLSI In Computers & Processors, 2000

A wave-pipelined router architecture using ternary associative memory.

José G. Delgado-Frias

Jabulani Nyathi

Proceedings of the 10th ACM Great Lakes Symposium on VLSI 2000, 2000

1999

An Efficient Tree Cache Coherence Protocol for Distributed Shared Memory Multiprocessors.

IEEE Trans. Computers, 1999

A Flexible Clustering and Scheduling Scheme for Efficient Parallel Computation.

S. Chingchit

Mohan Kumar

Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

Comparing the memory system performance of the HP V-class and SGI Origin 2000 multiprocessors using microbenchmarks and scientific applications.

Proceedings of the 13th International Conference on Supercomputing, 1999

The Impact of Link Arbitration on Switch Performance.

Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

Switch Cache: A Framework for Improving the Remote Memory Access Latency of CC-NUMA Multiprocessors.

Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

1998

Impact of Switch Design on the Application Performance of Cache-Coherent Multiprocessors.

Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

Circular buffered switch design with wormhole routing and virtual channels.

Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

1997

Performance of Multistage Bus Networks for a Distributed Shared Memory Multiprocessor.

IEEE Trans. Parallel Distributed Syst., 1997

Evaluation of multi-queue buffered multistage interconnection networks under uniform and non-uniform traffic patterns.

Int. J. Syst. Sci., 1997

1996

Adaptive System-Level Diagnosis for Hypercube Multiprocessors.

IEEE Trans. Computers, 1996

Equalization of Digital Communication Channel Using Hartley-Neural Technique.

Jitendriya K. Satapathy

Ganapati Panda

Proceedings of the Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, 1996

Evaluating Virtual Channels for Cache-Coherent Shared-Memory Multiprocessors.

Proceedings of the 10th International Conference on Supercomputing, 1996

An Efficient Hybrid Cache Coherence Protocol for Shared Memory Multiprocessors.

Proceedings of the 1996 International Conference on Parallel Processing, 1996

1995

Subcube Fault Tolerance in Hypercube Multiprocessors.

IEEE Trans. Computers, 1995

A Combinatorial Analysis of Subcube Reliability in Hypercubes.

IEEE Trans. Computers, 1995

Mapping Molecular Dynamics Computations on to Hypercubes.

Vamsee Lakamsani

D. Scott Linthicum

Parallel Comput., 1995

High-performance computer architecture.

Future Gener. Comput. Syst., 1995

Accurate communication models for task scheduling in multicomputers.

Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

Fault-tolerant sorting in SIMD hypercubes.

Proceedings of IPPS '95, 1995

A Submesh Allocation Scheme for Mesh-Connected Multiprocessor Systems.

Proceedings of the 1995 International Conference on Parallel Processing, 1995

Partitioning an Arbitrary Multicomputer Architecture.

Sumon Shahed

Proceedings of the 1995 International Conference on Parallel Processing, 1995

A dynamic cache sub-block design to reduce false sharing.

Murali Kadiyala

Proceedings of the 1995 International Conference on Computer Design (ICCD '95), 1995

Evaluation of multi-queue buffered multistage interconnection networks under uniform and nonuniform traffic patterns.

Proceedings of the 4th International Conference on Computer Communications and Networks (ICCCN '95), 1995

1994

Finite Buffer Analysis of Multistage Interconnection Networks.

IEEE Trans. Computers, 1994

A divide-and-conquer methodology for system-level diagnosis of processor arrays.

Proceedings of the Sixth IEEE Symposium on Parallel and Distributed Processing, 1994

Efficient and scalable cache coherence schemes for shared memory hypercube multiprocessors.

Phanindra K. Mannava

Proceedings of Supercomputing '94, 1994

A Distributed Cache Coherence Protocol for Hypercube Multiprocessors.

Proceedings of the 1994 International Conference on Parallel Processing, 1994

Performance and Reliability of the Multistage Bus Network.

Tahsin Askar

Proceedings of the 1994 International Conference on Parallel Processing, 1994

1993

An Availability Model for MIN-Based Multiprocessors.

[BibT_eX]

[DOI]

IEEE Trans. Parallel Distributed Syst., 1993

Design and Analysis of Cache Coherent Multistage Interconnection Networks.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1993

Efficient Mapping of Applications on Cache Based Multiprocessors.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1993

An Adaptive System-Level Diagnosis Approach for Hypercube Multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the Fifth IEEE Symposium on Parallel and Distributed Processing, 1993

Parallel Algorithms for Hypercube Allocation.

[BibT_eX]

[DOI]

Proceedings of the Seventh International Parallel Processing Symposium, 1993

Parallel FFT Algorithms for Cache Based Shared Memory Multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the 1993 International Conference on Parallel Processing, 1993

An Adaptive System-Level Diagnosis Approach for Mesh Connected Multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the 1993 International Conference on Parallel Processing, 1993

An Adaptive Submesh Allocation Strategy For Two-Dimensional Mesh Connected Systems.

[BibT_eX]

[DOI]

Proceedings of the 1993 International Conference on Parallel Processing, 1993

Fault Tolerant Subcube Allocation in Hypercubes.

[BibT_eX]

[DOI]

Proceedings of the 1993 International Conference on Parallel Processing, 1993

1992

Design of an Adaptive Cache Coherence Protocol for Large Scale Multiprocessors.

[BibT_eX]

[DOI]

George Thangadurai

IEEE Trans. Parallel Distributed Syst., 1992

Cache Coherent Shared Memory Hypercube Multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing, 1992

Mapping Applications onto a Cache Coherent Multiprocessor.

[BibT_eX]

[DOI]

Proceedings of Supercomputing '92, 1992

A Formal Specification and Verification Technique for Cache Coherence Protocols.

[BibT_eX]

Proceedings of the 1992 International Conference on Parallel Processing, 1992

Extending Multistage Interconnection Networks for Multitasking.

[BibT_eX]

Proceedings of the 1992 International Conference on Parallel Processing, 1992

1991

Analysis of Packet-Switched Multiple-Bus Multiprocessor Systems.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1991

MVAMIN: Mean Value Analysis Algorithms for Multistage Interconnection Networks.

[BibT_eX]

[DOI]

Jogesh K. Muppala

J. Parallel Distributed Comput., 1991

Multistage bus network (MBN): an interconnection network for cache coherent multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the Third IEEE Symposium on Parallel and Distributed Processing, 1991

Performance Analysis of Layered Task Graphs.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1991

Performance Evaluation of Multistage Interconnection Networks with Finite Buffers.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1991

Load balancing with network cooperation.

[BibT_eX]

[DOI]

Proceedings of the 10th International Conference on Distributed Computing Systems (ICDCS 1991), 1991

1990

Performance Evaluation of a Dataflow Architecture.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1990

Performance of Multiple-Bus Interconnections for Multiprocessors.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1990

Dependability Modeling for Multiprocessors.

[BibT_eX]

[DOI]

Jeffrey T. Kreulen

Matthew Thazhuthaveetil

Computer, 1990

An adaptive cache coherence scheme for hierarchical shared-memory multiprocessors.

[BibT_eX]

[DOI]

George Thangadurai

Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, 1990

Approximate Analysis of Multiprocessing Task Graphs.

[BibT_eX]

Proceedings of the 1990 International Conference on Parallel Processing, 1990

Availability evaluation of MIN-connected multiprocessors using decomposition technique.

[BibT_eX]

[DOI]

Lei Tien

Proceedings of the 20th International Symposium on Fault-Tolerant Computing, 1990

1989

Analysis and Comparison of Cache Coherence Protocols for a Packet-Switched Multiprocessor.

[BibT_eX]

[DOI]

Bao-Chyn Liu

IEEE Trans. Computers, 1989

Approximate Analysis of Single and Multiple Ring Networks.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1989

Arbiter designs for multiprocessor interconnection networks.

[BibT_eX]

[DOI]

Jogesh K. Muppala

Microprocessing and Microprogramming, 1989

Performance of Multiprocessor Interconnection Networks.

[BibT_eX]

[DOI]

Computer, 1989

Analysis of Computation-Communication Issues in Dynamic Dataflow Architectures.

[BibT_eX]

[DOI]

Proceedings of the 16th Annual International Symposium on Computer Architecture. Jerusalem, 1989

Analysis of MIN Based Multiprocessors with Private Cache Memories.

[BibT_eX]

Bao-Chyn Liu

Irshad Ahmed

Proceedings of the International Conference on Parallel Processing, 1989

From Interconnection Network To Task Level Analysis.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1989

A systolic approach to multistage interconnection network design.

[BibT_eX]

[DOI]

Chung-Han Chen

Proceedings of the Computer Design: VLSI in Computers and Processors, 1989

1988

VLSI layout of binary tree structures.

[BibT_eX]

[DOI]

P. Chuavalee

Integr., 1988

Approximate Analysis of Task Graphs for Parallel Processing Systems.

[BibT_eX]

Uday Choudhury

Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems, 1988

Design and analysis of multiple token ring networks.

[BibT_eX]

[DOI]

C. H. Chen

Proceedings of the Seventh Annual Joint Conference of the IEEE Computer and Communications Societies. Networks: Evolution or Revolution?, 1988

A Queueing Network Model for a Cache Coherence Protocol on Multiple-bus Multiprocessors.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1988

1987

Analysis of Interconnection Networks with Different Arbiter Designs.

[BibT_eX]

[DOI]

J. Parallel Distributed Comput., 1987

Dependability evaluation of interconnection networks.

[BibT_eX]

[DOI]

Inf. Sci., 1987

Guest Editor's Introduction: Interconnection Networks for Parallel and Distributed Processing.

[BibT_eX]

[DOI]

Computer, 1987

Performance Analysis of Packet-Switched Multiple-Bus Multiprocessor Systems.

[BibT_eX]

R. Pavaskar

Proceedings of the 8th IEEE Real-Time Systems Symposium (RTSS '87), 1987

Analytical Modeling and Architectural Modifications of a Dataflow Computer.

[BibT_eX]

[DOI]

Proceedings of the 14th Annual International Symposium on Computer Architecture. Pittsburgh, 1987

Design and Analysis of a Decentralized Multiple-Bus Multiprocessor.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1987

Performance Analysis of the MIT Tagged Token Dataflow Architecture.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1987

1986

Dependability Evaluation of Multicomputer Networks.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1986

Effect of Arbitration Policies on the Performance of Interconnection Networks.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1986

1985

Bandwidth Availability of Multiple-Bus Multiprocessors.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1985

An Analysis of Processor-Memory Interconnection Networks.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1985

Computation Availability of Multiple-Bus Multiprocessors.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1985

Reliability Simulation of Multiprocessor Systems.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1985

Introduction to session R2 (session overview): advanced computer architectures.

[BibT_eX]

[DOI]

Proceedings of the 13th ACM Annual Conference on Computer Science, 1985

1984

Generalized Hypercube and Hyperbus Structures for a Computer Network.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1984

On the Performance of Loosely Coupled Multiprocessors.

[BibT_eX]

[DOI]

Proceedings of the 11th Annual Symposium on Computer Architecture, 1984

1983

Performance Analysis of FFT Algorithms on Multiprocessor Systems.

[BibT_eX]

[DOI]

IEEE Trans. Software Eng., 1983

Design and Performance of Generalized Interconnection Networks.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1983

An Interference Analysis of Interconnection Networks.

[BibT_eX]

C. W. Lee

Proceedings of the International Conference on Parallel Processing, 1983

1982

On the Generalized Binary System.

[BibT_eX]

[DOI]

IEEE Trans. Computers, 1982

A general class of processor interconnection strategies.

[BibT_eX]

[DOI]

Proceedings of the 9th International Symposium on Computer Architecture (ISCA 1982), 1982

Design and performance of a general class of interconnection networks.

[BibT_eX]

Proceedings of the International Conference on Parallel Processing, 1982

VLSI Performance of Multistage Interconnection Network Using 4*4 Switches.

[BibT_eX]

Proceedings of the 3rd International Conference on Distributed Computing Systems, 1982

Applications of SIMD computers in signal processing.

[BibT_eX]

[DOI]