Max Grossman

According to our database, Max Grossman authored at least 31 papers between 2009 and 2020.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2020
Integrating Inter-Node Communication with a Resilient Asynchronous Many-Task Runtime System.
Proceedings of the 2020 Workshop on Exascale MPI, 2020

HOOVER: Leveraging OpenSHMEM for High Performance, Flexible Streaming Graph Applications.
Proceedings of the 3rd IEEE/ACM Annual Parallel Applications Workshop: Alternatives To MPI+X, 2020

2018
Data-parallel distributed training of very large models beyond GPU capacity.
CoRR, 2018

A One Year Retrospective on a MOOC in Parallel, Concurrent, and Distributed Programming in Java.
Proceedings of the 2018 IEEE/ACM Workshop on Education for High-Performance Computing, 2018

A Unified Runtime for PGAS and Event-Driven Programming.
Proceedings of the 4th International Workshop on Extreme Scale Programming Models and Middleware, 2018

HOOVER: Distributed, Flexible, and Scalable Streaming Graph Processing on OpenSHMEM.
Proceedings of the OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity, 2018

S2FA: an accelerator automation framework for heterogeneous computing in datacenters.
Proceedings of the 55th Annual Design Automation Conference, 2018

2017
Deadlock avoidance in parallel programs with futures: why parallel tasks should not wait for strangers.
Proc. ACM Program. Lang., 2017

Pedagogy and tools for teaching parallel computing at the sophomore undergraduate level.
J. Parallel Distributed Comput., 2017

Chapel-on-X: Exploring Tasking Runtimes for PGAS Languages.
Proceedings of the Third International Workshop on Extreme Scale Programming Models and Middleware, 2017

Graph500 on OpenSHMEM: Using A Practical Survey of Past Work to Motivate Novel Algorithmic Developments.
Proceedings of PAW@SC 2017: Second Annual PGAS Applications Workshop, 2017

Implementation and Evaluation of OpenSHMEM Contexts Using OFI Libfabric.
Proceedings of the OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence, 2017

Preparing an Online Java Parallel Computing Course.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

A Pluggable Framework for Composable HPC Scheduling Libraries.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

2016
HadoopCL2: Motivating the Design of a Distributed, Heterogeneous Programming System With Machine-Learning Applications.
IEEE Trans. Parallel Distributed Syst., 2016

A survey of sparse matrix-vector multiplication performance on large matrices.
CoRR, 2016

Static Cost Estimation for Data Layout Selection on GPUs.
Proceedings of the 7th International Workshop on Performance Modeling, 2016

Integrating Asynchronous Task Parallelism with OpenSHMEM.
Proceedings of the OpenSHMEM and Related Technologies. Enhancing OpenSHMEM for Hybrid Environments, 2016

OpenMP as a High-Level Specification Language for Parallelism - And its use in Evaluating Parallel Programming Systems.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Efficient Checkpointing of Multi-threaded Applications as a Tool for Debugging, Performance Tuning, and Resiliency.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

SWAT: A Programmable, In-Memory, Distributed, High-Performance Computing Platform.
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

2015
Auto-grading for parallel programs.
Proceedings of the Workshop on Education for High-Performance Computing, 2015

HJ-OpenCL: Reducing the Gap Between the JVM and Accelerators.
Proceedings of the Principles and Practices of Programming on The Java Platform, 2015

2013
Accelerating Habanero-Java programs with OpenCL generation.
Proceedings of the 2013 International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, 2013

Speculative Execution of Parallel Programs with Precise Exception Semantics on GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2013

HadoopCL: MapReduce on Distributed Heterogeneous Platforms through Seamless Integration of Hadoop and OpenCL.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Integrating Asynchronous Task Parallelism with MPI.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Compiler-Driven Data Layout Transformation for Heterogeneous Platforms.
Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013

2011
Dynamic Task Parallelism with a GPU Work-Stealing Runtime System.
Proceedings of the Languages and Compilers for Parallel Computing, 2011

2010
CnC-CUDA: Declarative Programming for GPUs.
Proceedings of the Languages and Compilers for Parallel Computing, 2010

2009
JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

