Takeshi Fukaya

Orcid: 0000-0003-1217-6444

According to our database, Takeshi Fukaya authored at least 32 papers between 2007 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Convergence acceleration of preconditioned conjugate gradient solver based on error vector sampling for a sequence of linear systems.
Numer. Linear Algebra Appl., December, 2023

Numerical Behavior of Mixed Precision Iterative Refinement Using the BiCGSTAB Method.
J. Inf. Process., 2023

Subspace Correction Preconditioning for Solving a Sequence of Asymmetric Linear Systems Using the Bi-CGSTAB Method.
J. Inf. Process., 2023

A novel ILU preconditioning method with a block structure suitable for SIMD vectorization.
J. Comput. Appl. Math., 2023

2022
Performance prediction of massively parallel computation by Bayesian inference.
JSIAM Lett., 2022

Numerical Investigation into the Mixed Precision GMRES(m) Method Using FP64 and FP32.
J. Inf. Process., 2022

A New AINV Preconditioner for the CG Method in Hybrid CPU-GPU Computing Environment.
J. Inf. Process., 2022

Convergence Acceleration of Preconditioned CG Solver Based on Error Vector Sampling for a Sequence of Linear Systems.
CoRR, 2022

Distributed Parallel Tall-Skinny QR Factorization: Performance Evaluation of Various Algorithms on Various Systems.
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2022

2021
Accelerating the SpMV kernel on standard CPUs by exploiting the partially diagonal structures.
CoRR, 2021

2020
Shifted Cholesky QR for Computing the QR Factorization of Ill-Conditioned Matrices.
SIAM J. Sci. Comput., 2020

White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing.
CoRR, 2020

Hierarchical block multi-color ordering: a new parallel ordering method for vectorization and parallelization of the sparse triangular solver in the ICCG method.
CCF Trans. High Perform. Comput., 2020

An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement.
Proceedings of the 11th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2020

Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

2019
Enhancement of Algebraic Block Multi-Color Ordering for ILU Preconditioning and Its Performance Evaluation in Preconditioned GMRES Solver.
J. Inf. Process., 2019

An investigation into the impact of the structured QR kernel on the overall performance of the TSQR algorithm.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2019

2018
A Case Study on Modeling the Performance of Dense Matrix Computation: Tridiagonalization in the EigenExa Eigensolver on the K Computer.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Time-space tiling with tile-level parallelism for the 3D FDTD method.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

2016
Roundoff error analysis of the CholeskyQR2 algorithm in an oblique inner product.
JSIAM Lett., 2016

On Constructing Cost Models for Online Automatic Tuning Using ATMathCoreLib: Case Studies through the SVD Computation on a Multicore Processor.
Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016

2015
Performance Analysis of the Chebyshev Basis Conjugate Gradient Method on the K Computer.
Proceedings of the Parallel Processing and Applied Mathematics, 2015

CAHTR: Communication-Avoiding Householder TRidiagonalization.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Performance Evaluation of the Eigen Exa Eigensolver on Oakleaf-FX: Tridiagonalization Versus Pentadiagonalization.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

2014
Performance Analysis of the Householder-Type Parallel Tall-Skinny QR Factorizations Toward Automatic Algorithm Selection.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

CholeskyQR2: a simple and communication-avoiding algorithm for computing a tall-skinny QR factorization on a large-scale parallel system.
Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014

2011
Acceleration of Hessenberg Reduction for Nonsymmetric Eigenvalue Problems in a Hybrid CPU-GPU Computing Environment.
Int. J. Netw. Comput., 2011

2010
Differential qd algorithm for totally nonnegative Hessenberg matrices: introduction of origin shifts and relationship with the discrete hungry Lotka-Volterra system.
JSIAM Lett., 2010

Dynamic Programming Approaches to Optimizing the Blocking Strategy for Basic Matrix Decompositions.
Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

2009
Differential qd algorithm for totally nonnegative band matrices: convergence properties and error analysis.
JSIAM Lett., 2009

2008
A dynamic programming approach to optimizing the blocking strategy for the Householder QR decomposition.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007
Accelerating the Singular Value Decomposition of Rectangular Matrices with the CSX600 and the Integrable SVD.
Proceedings of the Parallel Computing Technologies, 2007

