Marc Snir

According to our database, Marc Snir
  • authored at least 144 papers between 1977 and 2017.
  • has a "Dijkstra number" of three.

Awards

ACM Fellow

ACM Fellow 1999, "For contributions to the theory of parallel computation and the development of scaleable parallel systems architectures."

IEEE Fellow

IEEE Fellow 1996, "For technical leadership in the development of parallel computation and scalable parallel systems architectures."

Bibliography

2017
Eliminating contention bottlenecks in multithreaded MPI.
Parallel Computing, 2017

Predicting HPC parallel program performance based on LLVM compiler.
Cluster Computing, 2017

The informal guide to ACM fellow nominations.
Commun. ACM, 2017

Towards a More Complete Understanding of SDC Propagation.
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017

LogAider: A tool for mining potential correlations of HPC log events.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
Damaris: Addressing Performance Variability in Data Management for Post-Petascale Simulations.
TOPC, 2016

Doing Moore with Less - Leapfrogging Moore's Law with Inexactness for Supercomputing.
CoRR, 2016

Overcoming the power wall by exploiting inexactness and emerging COTS architectural features: Trading precision for improving application quality.
Proceedings of the 29th IEEE International System-on-Chip Conference, 2016

Towards millions of communicating threads.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Reducing Waste in Extreme Scale Systems through Introspective Analysis.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

2015
Design of a Multithreaded Barnes-Hut Algorithm for Multicore Clusters.
IEEE Trans. Parallel Distrib. Syst., 2015

Towards a more fault resilient multigrid solver.
Proceedings of the Symposium on High Performance Computing, 2015

PPL: an abstract runtime system for hybrid parallel programming.
Proceedings of the First International Workshop on Extreme Scale Programming Models and Middleware, 2015

Pattern-driven parallel I/O tuning.
Proceedings of the 10th Parallel Data Storage Workshop, 2015

Scheduling the I/O of HPC Applications Under Congestion.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

A General Space-filling Curve Algorithm for Partitioning 2D Meshes.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

Distributed Monitoring and Management of Exascale Systems in the Argo Project.
Proceedings of the Distributed Applications and Interoperable Systems, 2015

Understanding the Propagation of Error Due to a Silent Data Corruption in a Sparse Matrix Vector Multiply.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Dynamic Model-Driven Parallel I/O Performance Tuning.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014
Addressing failures in exascale computing.
IJHPCA, 2014

Enabling communication concurrency through flexible MPI endpoints.
IJHPCA, 2014

Improved MPI collectives for MPI processes in shared address spaces.
Cluster Computing, 2014

Automatic generation of I/O kernels for HPC applications.
Proceedings of the 9th Parallel Data Storage Workshop, 2014

The future of supercomputing.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Improving parallel I/O autotuning with performance modeling.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

FlipIt: An LLVM Based Fault Injector for HPC.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

2013
Failure prediction for HPC systems and applications: Current situation and open issues.
IJHPCA, 2013

Programming for Exascale Computers.
Computing in Science and Engineering, 2013

Software Abstractions and Methodologies for HPC Simulation Codes on Future Architectures.
CoRR, 2013

Taming parallel I/O complexity with auto-tuning.
Proceedings of the International Conference for High Performance Computing, 2013

Enabling MPI interoperability through flexible communication endpoints.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Programming models for extreme-scale computing.
Proceedings of the ACM Symposium on Principles of Distributed Computing, 2013

NUMA-aware shared-memory collective communication for MPI.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Programming Models for High-Performance Computing.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Fault prediction under the microscope: a closer look into HPC systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Automatic datatype generation and optimization.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

HydEE: Failure Containment without Event Logging for Large Scale Send-Deterministic MPI Applications.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
Reduce and Scan.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Distributed-Memory Multiprocessor.
Proceedings of the Encyclopedia of Parallel Computing, 2011

The International Exascale Software Project roadmap.
IJHPCA, 2011

Computer and information science and engineering: one discipline, many specialties.
Commun. ACM, 2011

Optimizing the Barnes-Hut algorithm in UPC.
Proceedings of the Conference on High Performance Computing Networking, 2011

Performance modeling for systematic performance tuning.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Writing Parallel Libraries with MPI - Common Practice, Issues, and Extensions.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Uncoordinated Checkpointing Without Domino Effect for Send-Deterministic MPI Applications.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Transformation for class immutability.
Proceedings of the 33rd International Conference on Software Engineering, 2011

Generic topology mapping strategies for large-scale parallel architectures.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Comparing archival policies for Blue Waters.
Proceedings of the 18th International Conference on High Performance Computing, 2011

2010
Ubiquitous Parallel Computing from Berkeley, Illinois, and Stanford.
IEEE Micro, 2010

Advice to members seeking ACM distinction.
Commun. ACM, 2010

On Communication Determinism in Parallel HPC Applications.
Proceedings of the 19th International Conference on Computer Communications and Networks, 2010

2009
On the Need for a Consortium of Capability Centers.
IJHPCA, 2009

Toward Exascale Resilience.
IJHPCA, 2009

ESoftCheck: Removal of Non-vital Checks for Fault Tolerance.
Proceedings of the CGO 2009, 2009

2008
Efficient software checking for fault tolerance.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
Techniques for Efficient Software Checking.
Proceedings of the Languages and Compilers for Parallel Computing, 2007

Programming Patterns for Architecture-Level Software Optimizations on Frequent Pattern Mining.
Proceedings of the 23rd International Conference on Data Engineering, 2007

2005
Automatic Tuning Matrix Multiplication Performance on Graphics Hardware.
Proceedings of the 14th International Conference on Parallel Architecture and Compilation Techniques (PACT 2005), 2005

2004
A Note on N-Body Computations with Cutoffs.
Theory Comput. Syst., 2004

A Framework for Measuring Supercomputer Productivity.
IJHPCA, 2004

2003
Best Papers from the 2002 International Parallel and Distributed Processing Symposium.
J. Parallel Distrib. Comput., 2003

2002
Demonstrating the Scalability of a Molecular Dynamics Application on a Petaflops Computer.
International Journal of Parallel Programming, 2002

2001
Generalized Communicators in the Message Passing Interface.
IEEE Trans. Parallel Distrib. Syst., 2001

What Are the Top Ten Most Influential Parallel and Distributed Processing Concepts of the Past Millenium?
J. Parallel Distrib. Comput., 2001

Blue Gene: A vision for protein science using a petaflop supercomputer.
IBM Systems Journal, 2001

Demonstrating the scalability of a molecular dynamics application on a Petaflop computer.
Proceedings of the 15th international conference on Supercomputing, 2001

2000
Java programming for high-performance numerical computing.
IBM Systems Journal, 2000

From Trace Generation to Visualization: A Performance Framework for Distributed Parallel Systems.
Proceedings of Supercomputing 2000, 2000

1999
SP2 System Architecture.
IBM Systems Journal, 1999

1998
Optimizing Array Reference Checking in Java Programs.
IBM Systems Journal, 1998

The NYU Ultracomputer - Designing a MIMD, Shared-Memory Parallel Machine.
Proceedings of the 25 Years of the International Symposia on Computer Architecture (Selected Papers)., 1998

PRISM: An Integrated Architecture for Scalable Shared Memory.
Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

1997
Message Proxies for Efficient, Protected Communication on SMP Clusters.
Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture (HPCA '97), 1997

1996
Randomized Routing with Shorter Paths.
IEEE Trans. Parallel Distrib. Syst., 1996

A Message Passing Standard for MPP and Workstations.
Commun. ACM, 1996

For a Massive Number of Massively Parallel Machines: What are the Target Applications, Who are the Target Users, and What New R&D is Needed to Hit the Target?
Proceedings of IPPS '96, 1996

MPI-2: Extending the Message-Passing Interface.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995
CCL: A Portable and Tunable Collective Communication Library for Scalable Parallel Computers.
IEEE Trans. Parallel Distrib. Syst., 1995

The Communication Software and Parallel Environment of the IBM SP2.
IBM Systems Journal, 1995

Parallel File Systems for the IBM SP Computers.
IBM Systems Journal, 1995

SP2 System Architecture.
IBM Systems Journal, 1995

MPI Programming Environment for IBM SP1/SP2.
Proceedings of the 15th International Conference on Distributed Computing Systems, Vancouver, British Columbia, Canada, May 30, 1995

1994
Calling Names on Nameless Networks
Inf. Comput., August, 1994

The IBM External User Interface for Scalable Parallel Systems.
Parallel Computing, 1994

Memory versus randomization in on-line algorithms.
IBM Journal of Research and Development, 1994

CCL: A Portable and Tunable Collective Communication Library for Scalable Parallel Computers.
Proceedings of the 8th International Symposium on Parallel Processing, 1994

MPI-F: An Efficient Implementation of MPI on IBM-SP1.
Proceedings of the 1994 International Conference on Parallel Processing, 1994

1993
Random Walks on Weighted Graphs and Applications to On-line Algorithms.
J. ACM, 1993

Randomized routing with shorter paths.
SPAA, 1993

Scalable Parallel Computing: The IBM 9076 Scalable POWERparallel 1.
SPAA, 1993

Designing Efficient, Scalable, and Portable Collective Communication Libraries.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Computer Architectures and Programming Models for Scalable Parallel Computing.
Proceedings of the Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1993

Issues and Directions in Scalable Parallel Computing.
Proceedings of the Twelfth Annual ACM Symposium on Principles of Distributed Computing, 1993

1992
Using Visualization Tools to Understand Concurrency.
IEEE Software, 1992

Cost-Performance Tradeoffs for Interconnection Networks.
Discrete Applied Mathematics, 1992

Scalable Parallel Computers and Scalable Parallel Codes: From Theory to Practice.
Proceedings of the Parallel Architectures and Their Efficient Use, 1992

1991
Size-depth Trade-Offs for Monotone Arithmetic Circuits.
Theor. Comput. Sci., 1991

Better Computing on the Anonymous Ring.
J. Algorithms, 1991

1990
A Complexity Theory of Efficient Parallel Algorithms.
Theor. Comput. Sci., 1990

Communication Complexity of PRAMs.
Theor. Comput. Sci., 1990

Efficient Parallel Algorithms for Graph Problems.
Algorithmica, 1990

Random Walks on Weighted Graphs, and Applications to On-line Algorithms (Preliminary Version)
Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, 1990

1989
Techniques for Parallel Manipulation of Sparse Matrices.
Theor. Comput. Sci., 1989

Cost-Bandwidth Tradeoffs for Communication Networks.
SPAA, 1989

On Communication Latency in PRAM Computations.
SPAA, 1989

Memory Versus Randomization in On-line Algorithms (Extended Abstract).
Proceedings of the Automata, Languages and Programming, 16th International Colloquium, 1989

1988
Efficient and Correct Execution of Parallel Programs that Share Memory.
ACM Trans. Program. Lang. Syst., 1988

Efficient Synchronization on Multiprocessors with Shared Memory.
ACM Trans. Program. Lang. Syst., 1988

The Distribution of Waiting Times in Clocked Multistage Interconnection Networks.
IEEE Trans. Computers, 1988

Computing on an anonymous ring.
J. ACM, 1988

A Complexity Theory of Efficient Parallel Algorithms (Extended Abstract).
Proceedings of the Automata, Languages and Programming, 15th International Colloquium, 1988

Better Computing on the Anonymous Ring.
Proceedings of the VLSI Algorithms and Architectures, 3rd Aegean Workshop on Computing, 1988

1987
A Model for Hierarchical Memory
Proceedings of the 19th Annual ACM Symposium on Theory of Computing, 1987

Hierarchical Memory with Block Transfer
Proceedings of the 28th Annual Symposium on Foundations of Computer Science, 1987

1986
A Unified Theory of Interconnection Network Structure.
Theor. Comput. Sci., 1986

Depth-Size Trade-Offs for Parallel Prefix Computation.
J. Algorithms, 1986

Exact Balancing is Not Always Good.
Inf. Process. Lett., 1986

Efficient Synchronization on Multiprocessors with Shared Memory.
Proceedings of the Fifth Annual ACM Symposium on Principles of Distributed Computing, 1986

The Distribution of Waiting Times in Clocked Multistage Interconnection Networks.
Proceedings of the International Conference on Parallel Processing, 1986

Efficient Parallel Algorithms for Graph Models.
Proceedings of the International Conference on Parallel Processing, 1986

1985
Applications of Ramsey's Theorem to Decision Tree Complexity
J. ACM, October, 1985

Lower Bounds on Probabilistic Linear Decision Trees.
Theor. Comput. Sci., 1985

The Power of Parallel Prefix.
IEEE Trans. Computers, 1985

On Parallel Searching.
SIAM J. Comput., 1985

Computing on an Anonymous Ring.
Proceedings of the Fourth Annual ACM Symposium on Principles of Distributed Computing, 1985

Issues Related to MIMD Shared-memory Computers: The NYU Ultracomputer Approach.
Proceedings of the 12th Annual Symposium on Computer Architecture, 1985

The Power of Parallel Prefix.
Proceedings of the International Conference on Parallel Processing, 1985

1984
The Importance of Being Square.
Proceedings of the 11th Annual Symposium on Computer Architecture, 1984

Applications of Ramsey's Theorem to Decision Trees Complexity (Preliminary Version)
Proceedings of the 25th Annual Symposium on Foundations of Computer Science, 1984

1983
The Performance of Multistage Interconnection Networks for Multiprocessors.
IEEE Trans. Computers, 1983

The NYU Ultracomputer - Designing an MIMD Shared Memory Parallel Computer.
IEEE Trans. Computers, 1983

Circuit partitioning with size and connection constraints.
Networks, 1983

1982
Comparisons between Linear Functions can Help.
Theor. Comput. Sci., 1982

Probabilities Over Rich Languages, Testing and Randomness.
J. Symb. Log., 1982

Some Exact Complexity Results for Straight-Line Computations over Semirings.
J. ACM, 1982

On Parallel Searching (Extended Abstract).
Proceedings of the ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, 1982

The NYU Ultracomputer-designing a MIMD, shared-memory parallel machine (Extended Abstract).
Proceedings of the 9th International Symposium on Computer Architecture (ISCA 1982), 1982

1981
On the Complexity of Simplifying Quadratic Forms.
Inf. Process. Lett., 1981

Proving Lower Bounds for Linear Decision Trees.
Proceedings of the Automata, 1981

1980
On the Depth Complexity of Formulas.
Mathematical Systems Theory, 1980

On the Size Complexity of Monotone Formulas.
Proceedings of the Automata, 1980

1979
The covering problem of complete uniform hypergraphs.
Discrete Mathematics, 1979

1977
A Direct Approach to the Parallel Evaluation of Rational Expressions with a Small Number of Processors.
IEEE Trans. Computers, 1977

