Vijay S. Pai

According to our database, Vijay S. Pai authored at least 55 papers between 1996 and 2016.

Bibliography

2016
Symbiote Coprocessor Unit - A Streaming Coprocessor for Data Stream Acceleration.
IEEE Trans. Very Large Scale Integr. Syst., 2016

2015
Runtime-driven shared last-level cache management for task-parallel programs.
Proceedings of the International Conference for High Performance Computing, 2015

Automatic sharing classification and timely push for cache-coherent systems.
Proceedings of the International Conference for High Performance Computing, 2015

Exploiting Process Imbalance to Improve MPI Collective Operations in Hierarchical Systems.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

DyReCTape: a dynamically reconfigurable cache using domain wall memory tapes.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014
MorphStore: A local file system for Big Data with utility-driven replication and load-adaptive access scheduling.
Proceedings of the IEEE 30th Symposium on Mass Storage Systems and Technologies, 2014

Accelerating MPI Collective Communications through Hierarchical Algorithms Without Sacrificing Inter-Node Communication Flexibility.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Variation Aware Cache Partitioning for Multithreaded Programs.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Bridging the Virtualization Performance Gap for HPC Using SR-IOV for InfiniBand.
Proceedings of the 2014 IEEE 7th International Conference on Cloud Computing, Anchorage, AK, USA, June 27, 2014

2013
Imbalanced cache partitioning for balanced data-parallel programs.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

A mathematical hard disk timing model for full system simulation.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Exploiting domain knowledge to optimize parallel computational mechanics codes.
Proceedings of the International Conference on Supercomputing, 2013

2012
Managing cellular congestion using incentives.
IEEE Commun. Mag., 2012

Integrating High Performance File Systems in a Cloud Computing Environment.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

2010
Towards architecture independent metrics for multicore performance analysis.
SIGMETRICS Perform. Evaluation Rev., 2010

Using data structure knowledge for efficient lock generation and strong atomicity.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Modeling advanced collective communication algorithms on cell-based systems.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Automatic atomic region identification in shared memory SPMD programs.
Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, 2010

Storage optimization for a peer-to-peer video-on-demand network.
Proceedings of the First Annual ACM SIGMM Conference on Multimedia Systems, 2010

Multicore-aware reuse distance analysis.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Accelerating multicore reuse distance analysis with sampling and parallelization.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
Efficient high performance collective communication for the cell blade.
Proceedings of the 23rd international conference on Supercomputing, 2009

Peer-to-peer video on demand: Challenges and solutions.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

2008
Conservative vs. Optimistic Parallelization of Stateful Network Intrusion Detection.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2008

Advanced collective communication in aspen.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

2007
Parallel Programmable Ethernet Controllers: Performance and Security.
IEEE Netw., 2007

Expressing and exploiting concurrency in networked applications with aspen.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

Improving VoD server efficiency with bittorrent.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Design Alternatives for a High-Performance Self-Securing Ethernet Network Interface.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Achieving Reliable Parallel Performance in a VoD Storage Server Using Randomization and Replication.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

A Model and Prototype of a Resource-Efficient Storage Server for High-Bitrate Video-on-Demand.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
Seekable sockets: a mechanism to reduce copy overheads in TCP-based messaging.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005
Network Interface Data Caching.
IEEE Trans. Computers, 2005

Achieving Structural and Composable Modeling of Complex Systems.
Int. J. Parallel Program., 2005

An Efficient Programmable 10 Gigabit Ethernet Network Interface Card.
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005

2004
Isolating the performance impacts of network interface cards through microbenchmarks.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 2004

Spinach: a liberty-based simulator for programmable network interface architectures.
Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, 2004

Achieving Structural and Composable Modeling of Complex Systems.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

2003
Challenges in Computer Architecture Evaluation.
Computer, 2003

A Flexible and Efficient Application Programming Interface (API) for a Customizable Proxy Cache.
Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, 2003

Exploiting task-level concurrency in a programmable network interface.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2003

2002
RSIM: Simulating Shared-Memory Multiprocessors with ILP Processors.
Computer, 2002

Increasing web server throughput with network interface data caching.
Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X), 2002

2001
Comparing and Combining Read Miss Clustering and Software Prefetching.
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT 2001), 2001

2000
Code Transformations to Improve Memory Parallelism.
J. Instr. Level Parallelism, 2000

1999
The Impact of Exploiting Instruction-Level Parallelism on Shared-Memory Multiprocessors.
IEEE Trans. Computers, 1999

Recent advances in memory consistency models for hardware shared memory systems.
Proc. IEEE, 1999

Improving the Accuracy vs. Speed Tradeoff for Simulating Shared-Memory Multiprocessors with ILP Processors.
Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

1998
Analytic Evaluation of Shared-memory Systems with ILP Processors.
Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998

1997
RSIM: Rice simulator for ILP multiprocessors.
SIGARCH Comput. Archit. News, 1997

RSIM: a simulator for shared-memory multiprocessor and uniprocessor systems that exploit ILP.
Proceedings of the 1997 workshop on Computer architecture education, 1997

Using Speculative Retirement and Larger Instruction Windows to Narrow the Performance Gap Between Memory Consistency Models.
Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures, 1997

The Interaction of Software Prefetching with ILP Processors in Shared-Memory Systems.
Proceedings of the 24th International Symposium on Computer Architecture, 1997

The Impact of Instruction-Level Parallelism on Multiprocessor Performance and Simulation Methodology.
Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture (HPCA '97), 1997

1996
An Evaluation of Memory Consistency Models for Shared-Memory Systems with ILP Processors.
Proceedings of the ASPLOS-VII Proceedings, 1996

