Takeshi Fukaya

Senxi Li

CCF Trans. High Perform. Comput., 2020

An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement.

Kengo Suzuki

Proceedings of the 11th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2020

Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

2019

Enhancement of Algebraic Block Multi-Color Ordering for ILU Preconditioning and Its Performance Evaluation in Preconditioned GMRES Solver.

Senxi Li

J. Inf. Process., 2019

An investigation into the impact of the structured QR kernel on the overall performance of the TSQR algorithm.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2019

2018

A Case Study on Modeling the Performance of Dense Matrix Computation: Tridiagonalization in the EigenExa Eigensolver on the K Computer.

Toshiyuki Imamura

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Time-space tiling with tile-level parallelism for the 3D FDTD method.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

2016

Roundoff error analysis of the CholeskyQR2 algorithm in an oblique inner product.

JSIAM Lett., 2016

On Constructing Cost Models for Online Automatic Tuning Using ATMathCoreLib: Case Studies through the SVD Computation on a Multicore Processor.

Seiji Nagashima

Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016

2015

Performance Analysis of the Chebyshev Basis Conjugate Gradient Method on the K Computer.

Proceedings of the Parallel Processing and Applied Mathematics, 2015

CAHTR: Communication-Avoiding Householder TRidiagonalization.

Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Performance Evaluation of the Eigen Exa Eigensolver on Oakleaf-FX: Tridiagonalization Versus Pentadiagonalization.

Toshiyuki Imamura

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

2014

Performance Analysis of the Householder-Type Parallel Tall-Skinny QR Factorizations Toward Automatic Algorithm Selection.

Toshiyuki Imamura

Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

CholeskyQR2: a simple and communication-avoiding algorithm for computing a tall-skinny QR factorization on a large-scale parallel system.

Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014

2011

Acceleration of Hessenberg Reduction for Nonsymmetric Eigenvalue Problems in a Hybrid CPU-GPU Computing Environment.

Int. J. Netw. Comput., 2011

2010

Differential qd algorithm for totally nonnegative Hessenberg matrices: introduction of origin shifts and relationship with the discrete hungry Lotka-Volterra system.

JSIAM Lett., 2010

Dynamic Programming Approaches to Optimizing the Blocking Strategy for Basic Matrix Decompositions.
