Naoya Maruyama

Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016

Extreme scale breadth-first search on supercomputers.

Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015

Data-centric GPU-based adaptive mesh refinement.

Proceedings of the 5th Workshop on Irregular Applications - Architectures and Algorithms, 2015

PDSEC Keynote.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Automated GPU Kernel Transformations in Large-Scale Production Stencil Applications.

Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015

2014

Scalable Kernel Fusion for Memory-Bound GPU Applications.

Proceedings of the International Conference for High Performance Computing, 2014

An OpenACC extension for data layout transformation.

Tetsuya Hoshino

Proceedings of the First Workshop on Accelerator Programming using Directives, 2014

Evaluation of Asynchronous MPI Communication in Map-Reduce System on the K Computer.

Motohiko Matsuda

Shin'ichiro Takizawa

Proceedings of the 21st European MPI Users' Group Meeting, 2014

FMI: Fault Tolerant Messaging Interface for Fast and Transparent Recovery.

Bronis R. de Supinski

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

A User-Level InfiniBand-Based File System and Checkpoint Strategy for Burst Buffers.

Bronis R. de Supinski

Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

2013

Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM.

Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Improving the Computing Efficiency of HPC Systems Using a Combination of Proactive and Preventive Checkpointing.

Mohamed-Slim Bouguerra

Ana Gainaru

Franck Cappello

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Integrating Multi-GPU Execution in an OpenACC Compiler.

Proceedings of the 42nd International Conference on Parallel Processing, 2013

Topic 15: GPU and Accelerator Computing - (Introduction).

Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Highly optimized full GPU-acceleration of non-hydrostatic weather model SCALE-LES.

Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

K MapReduce: A scalable tool for data-processing and search/ensemble applications on large-scale supercomputers.

Motohiko Matsuda

Shin'ichiro Takizawa

Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

CUDA vs OpenACC: Performance Case Studies with Kernel Benchmarks and a Memory-Bound CFD Application.

Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012

A Multi GPU Read Alignment Algorithm with Model-Based Performance Optimization.

Aleksandr Drozd

Proceedings of the High Performance Computing for Computational Science, 2012

A Task Parallel Implementation of Fast Multipole Methods.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Design and modeling of a non-blocking checkpointing system.

Bronis R. de Supinski

Proceedings of the SC Conference on High Performance Computing Networking, 2012

Sequence Alignment on Massively Parallel Heterogeneous Systems.

Aleksandr Drozd

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Multi-GPU Implementation of the NICAM Atmospheric Model.

Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

Scalable Reed-Solomon-Based Reliable Local Storage for HPC Applications on IaaS Clouds.

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Hierarchical Clustering Strategies for Fault Tolerance in Large Scale HPC Systems.

Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

Design and Implementation of Portable and Efficient Non-blocking Collective Communication.

Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011

An exact algorithm for energy-efficient acceleration of task trees on CPU/GPU architectures.

Mark Silberstein

Proceedings of SYSTOR 2011: The 4th Annual Haifa Experimental Systems Conference, Haifa, Israel, May 30, 2011

Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer.

Proceedings of the Conference on High Performance Computing Networking, 2011

Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers.

Proceedings of the Conference on High Performance Computing Networking, 2011

Poster: fast GPU read alignment with burrows wheeler transform based index.

Aleksandr Drozd

Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

FTI: high performance fault tolerance interface for hybrid systems.

Proceedings of the Conference on High Performance Computing Networking, 2011

2010

Model-based Fault Localization: Finding Behavioral Outliers in Large-scale Computing Systems.

New Gener. Comput., 2010

An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code.

Proceedings of the Conference on High Performance Computing Networking, 2010

A high-performance fault-tolerant software framework for memory on commodity GPUs.

Akira Nukada

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Linpack evaluation on a supercomputer with heterogeneous accelerators.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Low-overhead diskless checkpoint for hybrid computing systems.

Proceedings of the 2010 International Conference on High Performance Computing, 2010

Statistical power modeling of GPU kernels using performance counters.

Proceedings of the International Green Computing Conference 2010, 2010

Distributed Diskless Checkpoint for Large Scale Systems.

Franck Cappello

Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009

Adaptive Resource Indexing Technique for Unstructured Peer-to-Peer Networks.

Sumeth Lerthirunwong

Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

2008

An efficient, model-based CPU-GPU heterogeneous FFT library.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Model-based fault localization in large-scale computing systems.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Access-pattern and bandwidth aware file replication algorithm in a grid environment.

Proceedings of the 9th IEEE/ACM International Conference on Grid Computing (Grid 2008), Tsukuba, Japan, September 29, 2008

2007

Model-based resource selection for efficient virtual cluster deployment.

Shohei Yamasaki

Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing, 2007

Virtual Clusters on the Fly - Fast, Scalable, and Flexible Installation.

Hideo Nishimura

Alexander V. Mirgorodskiy

Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

2006

Scalable systems software - Problem diagnosis in large-scale computing environments.

Barton P. Miller

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Making Wide-Area, Multi-site MPI Feasible Using Xen VM.

Masaki Tatezono