# Reiji Suda

According to our database, Reiji Suda authored at least 60 papers between 1994 and 2018.

## Bibliography

2018

Fast Generation of Poisson-Disk Samples on Mesh Surfaces by Progressive Sample Projection.

PACMCGIT, 2018

Introduction to iWAPT 2018.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Automatic Hyperparameter Tuning of Machine Learning Models under Time Constraints.

Proceedings of the IEEE International Conference on Big Data, 2018

2017

Second order accuracy finite difference methods for space-fractional partial differential equations.

J. Computational Applied Mathematics, 2017

Introduction to iWAPT Workshop.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Embedded-DSL-Like Code Generation and Optimization of Bayesian Estimation Routines with User-Defined Source-to-Source Code Transformation Framework Xevolver.

Proceedings of the Fifth International Symposium on Computing and Networking, 2017

Fast maximal Poisson-disk sampling by randomized tiling.

Proceedings of High Performance Graphics, 2017

2016

Xevtgen: Fortran code transformer generator for high performance scientific codes.

IJNC, 2016

Efficient Parallel Algorithm for Optimal DAG Structure Search on Parallel Computer with Torus Network.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

Xevdriver: A Software System Supporting XML-based Source-to-Source Code Transformations on Fortran Programs.

Proceedings of the Fourth International Symposium on Computing and Networking, 2016

2015

Performance Analysis of the Chebyshev Basis Conjugate Gradient Method on the K Computer.

Proceedings of the Parallel Processing and Applied Mathematics, 2015

Xevtgen: Fortran Code Transformer Generator for High Performance Scientific Codes.

Proceedings of the Third International Symposium on Computing and Networking, 2015

2013

Analysis Of The Girth For Regular Bi-partite Graphs With Degree 3.

CoRR, 2013

Enumeration Based Search Algorithm For Finding A Regular Bi-partite Graph Of Maximum Attainable Girth For Specified Degree And Number Of Vertices.

CoRR, 2013

An Efficient Task Partitioning and Scheduling Method for Symmetric Multiple GPU Architecture.

Proceedings of the 12th IEEE International Conference on Trust, 2013

Register level sort algorithm on multi-core SIMD processors.

Proceedings of the 3rd Workshop on Irregular Applications - Architectures and Algorithms, 2013

High Performance GPU Accelerated Local Optimization in TSP.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

A Mathematical Method for Online Autotuning of Power and Energy Consumption with Corrected Temperature Effects.

Proceedings of the International Conference on Computational Science, 2013

2012

Energy-Aware SIMD Algorithm Design on GPU and Multicore Architectures.

Proceedings of the Handbook of Energy-Aware and Green Computing - Two Volume Set., 2012

Global optimization model on power efficiency of GPU and multicore processing element for SIMD computing with CUDA.

Computer Science - R&D, 2012

Partition Parameters for Girth Maximum (m, r) BTUs.

CoRR, 2012

Balanced Tanner Units And Their Properties.

CoRR, 2012

Automatic Parameter Optimization for Edit Distance Algorithm on GPU.

Proceedings of the High Performance Computing for Computational Science, 2012

Brief announcement: a GPU accelerated iterated local search TSP solver.

Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures, 2012

Poster: High Performance GPU Accelerated TSP Solver.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: High Performance GPU Accelerated TSP Solver.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Accelerating 2-opt and 3-opt local search using GPU in the travelling salesman problem.

Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012

MSSM: An Efficient Scheduling Mechanism for CUDA Basing on Task Partition.

Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

An efficient GPU implementation of a multi-start TSP solver for large problem instances.

Proceedings of the Genetic and Evolutionary Computation Conference, 2012

2011

APTCC: Auto Parallelizing Translator From C To CUDA.

Proceedings of the International Conference on Computational Science, 2011

Parallel Monte Carlo Tree Search on GPU.

Proceedings of the Eleventh Scandinavian Conference on Artificial Intelligence, 2011

Parallelizing a Coarse Grain Graph Search Problem Based upon LDPC Codes on a Supercomputer.

Proceedings of the Sixth International Symposium on Parallel Computing in Electrical Engineering (PARELEC 2011), 2011

Large-Scale Parallel Monte Carlo Tree Search on GPU.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

A Performance and Energy Consumption Analytical Model for GPU.

Proceedings of the IEEE Ninth International Conference on Dependable, 2011

Parallel Monte Carlo Tree Search Scalability Discussion.

Proceedings of the AI 2011: Advances in Artificial Intelligence, 2011

Experimental Estimation and Analysis of the Power Efficiency of CUDA Processing Element on SIMD Computing.

Proceedings of the 10th IEEE/ACIS International Conference on Computer and Information Science, 2011

2010

Investigation on the power efficiency of multi-core and GPU Processing Element in large scale SIMD computation with CUDA.

Proceedings of the International Green Computing Conference 2010, 2010

Software Automatic Tuning: Concepts and State-of-the-Art Results.

Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

A Bayesian Method of Online Automatic Tuning.

Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

Autotuning Method for Deciding Block Size Parameters in Dynamically Load-Balanced BLAS.

Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

Toward Automatic Performance Tuning for Numerical Simulations in the SILC Matrix Computation Framework.

Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

2009

Parallel Minimax Tree Searching on GPU.

Proceedings of the Parallel Processing and Applied Mathematics, 2009

Modeling and Optimizing the Power Performance of Large Matrices Multiplication on Multi-core and GPU Platform with CUDA.

Proceedings of the Parallel Processing and Applied Mathematics, 2009

Accurate Measurements and Precise Modeling of Power Dissipation of CUDA Kernels toward Power Optimized High Performance CPU-GPU Computing.

Proceedings of the 2009 International Conference on Parallel and Distributed Computing, 2009

Modeling and Estimation for the Power Consumption of Matrix Computation on Multi-core Platform.

Proceedings of the Second International Joint Conference on Computational Sciences and Optimization, 2009

Power Efficient Large Matrices Multiplication by Load Scheduling on Multi-core and GPU Platform with CUDA.

Proceedings of the 12th IEEE International Conference on Computational Science and Engineering, 2009

Aspects of GPU for general purpose high performance computing.

Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

2008

Divisible load scheduling with improved asymptotic optimality.

Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

An optimized Dynamic Load Balancing method for parallel 3-D mesh refinement for finite element electromagnetics with Tetrahedra.

Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007

Cloth Simulation in the SILC Matrix Computation Framework: A Case Study.

Proceedings of the Parallel Processing and Applied Mathematics, 2007

High Performance FFT on SGI Altix 3700.

Proceedings of the High Performance Computing and Communications, 2007

2006

Distributed SILC: An Easy-to-Use Interface for MPI-Based Parallel Matrix Computation Libraries.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

2005

SILC: A Flexible and Environment-Independent Interface for Matrix Computation Libraries.

Proceedings of the Parallel Processing and Applied Mathematics, 2005

Performance Evaluation of Parallel Sparse Matrix-Vector Products on SGI Altix3700.

Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005

2002

A fast spherical harmonics transform algorithm.

Math. Comput., 2002

1999

A high performance parallelization scheme for the Hessenberg double shift QR algorithm.

Parallel Computing, 1999

1998

The Ensparsed LU Decomposition Method for Large Scale Circuit Transient Analysis.

Proceedings of the ASP-DAC '98, 1998

1995

Implementation of Sparta, a Highly Parallel Circuit Simulator by the Preconditioned Jacobi Method, on a Distributed Memory Machine.

Proceedings of the 9th international conference on Supercomputing, 1995

1994

QFP wiring problem-introduction and analytical considerations.

IEEE Trans. on CAD of Integrated Circuits and Systems, 1994