José R. Herrero

Orcid: 0000-0002-4060-367X

Affiliations:
  • Polytechnic University of Catalonia (UPC), Department of Computer Architecture, Spain


According to our database, José R. Herrero authored at least 52 papers between 1996 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures.
J. Parallel Distributed Comput., May, 2023

Co-Design of the Dense Linear Algebra Software Stack for Multicore Processors.
CoRR, 2023

Fine-grain task-parallel algorithms for matrix factorizations and inversion on many-threaded CPUs.
Concurr. Comput. Pract. Exp., 2023

2022
Acceleration strategies for large-scale sequential simulations using parallel neighbour search: Non-LVA and LVA scenarios.
Comput. Geosci., 2022

A distributed Monte Carlo based linear algebra solver applied to the analysis of large complex networks.
Future Gener. Comput. Syst., 2022

NUMA-Aware Dense Matrix Factorizations and Inversion with Look-Ahead on Multicore Processors.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022

2021
Efficient update of determinants for many-electron wave function overlaps.
Comput. Phys. Commun., 2021

A New Generation of Task-Parallel Algorithms for Matrix Inversion in Many-Threaded CPUs.
Proceedings of the PMAM@PPoPP 2021: Proceedings of the Twelfth International Workshop on Programming Models and Applications for Multicores and Manycores, 2021

2020
A highly parallel algorithm for computing the action of a matrix exponential on a vector based on a multilevel Monte Carlo method.
Comput. Math. Appl., 2020

2019
Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD.
Numer. Algorithms, 2019

Resource-aware Elastic Swap Random Forest for Evolving Data Streams.
CoRR, 2019

A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization With Partial Pivoting.
IEEE Access, 2019

2018
Static scheduling of the LU factorization with look-ahead on asymmetric multicore processors.
Parallel Comput., 2018

Energy balance between voltage-frequency scaling and resilience for linear algebra routines on low-power multicore architectures.
Parallel Comput., 2018

Two-sided orthogonal reductions to condensed forms on asymmetric multicore processors.
Parallel Comput., 2018

Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors.
J. Comput. Sci., 2018

A path-level exact parallelization strategy for sequential simulation.
Comput. Geosci., 2018

2017
Two-Sided Reduction to Compact Band Forms with Look-Ahead.
CoRR, 2017

Reduction to Tridiagonal Form for Symmetric Eigenproblems on Asymmetric Multicore Processors.
Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, 2017

Static Versus Dynamic Task Scheduling of the LU Factorization on ARM big.LITTLE Architectures.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Low-latency multi-threaded ensemble learning for dynamic big data streams.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

2016
Echo State Hoeffding Tree Learning.
Proceedings of The 8th Asian Conference on Machine Learning, 2016

2015
Acceleration of the Geostatistical Software Library (GSLIB) by code optimization and hybrid parallel programming.
Comput. Geosci., 2015

Multi-Threaded Dense Linear Algebra Libraries for Low-Power Asymmetric Multicore Processors.
CoRR, 2015

Parallel computing on graphics processing units and heterogeneous platforms.
Concurr. Comput. Pract. Exp., 2015

Tareador: a tool to unveil parallelization strategies at undergraduate level.
Proceedings of the Workshop on Computer Architecture Education, 2015

2014
Tuning and hybrid parallelization of a genetic-based multi-point statistics simulation code.
Parallel Comput., 2014

Evaluation and assessment of professional skills in the Final Year Project.
Proceedings of the IEEE Frontiers in Education Conference, 2014

2013
Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms.
ACM Trans. Math. Softw., 2013

Graphics processing unit computing and exploitation of hardware accelerators.
Concurr. Comput. Pract. Exp., 2013

A Square Block Format for Symmetric Band Matrices.
Proceedings of the Parallel Processing and Applied Mathematics, 2013

2012
On new computational local orders of convergence.
Appl. Math. Lett., 2012

2011
Special Issue: GPU computing.
Concurr. Comput. Pract. Exp., 2011

New Level-3 BLAS Kernels for Cholesky Factorization.
Proceedings of the Parallel Processing and Applied Mathematics, 2011

2009
Parallelizing dense and banded linear algebra libraries using SMPSs.
Concurr. Comput. Pract. Exp., 2009

2008
Hypermatrix oriented supernode amalgamation.
J. Supercomput., 2008

2007
Exploiting computer resources for fast nearest neighbor classification.
Pattern Anal. Appl., 2007

Analysis of a sparse hypermatrix Cholesky with fixed-sized blocking.
Appl. Algebra Eng. Commun. Comput., 2007

New Data Structures for Matrices and Specialized Inner Kernels: Low Overhead for High Performance.
Proceedings of the Parallel Processing and Applied Mathematics, 2007

2006
A framework for efficient execution of matrix computations.
PhD thesis, 2006

Using Non-canonical Array Layouts in Dense Matrix Operations.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Sparse Hypermatrix Cholesky: Customization for High Performance.
Proceedings of the International MultiConference of Engineers and Computer Scientists 2006, 2006

Compiler-Optimized Kernels: An Efficient Alternative to Hand-Coded Inner Kernels.
Proceedings of the Computational Science and Its Applications, 2006

2005
Adapting Linear Algebra Codes to the Memory Hierarchy Using a Hypermatrix Scheme.
Proceedings of the Parallel Processing and Applied Mathematics, 2005

A Study on Load Imbalance in Parallel Hypermatrix Multiplication Using OpenMP.
Proceedings of the Parallel Processing and Applied Mathematics, 2005

Efficient Implementation of Nearest Neighbor Classification.
Proceedings of the Computer Recognition Systems, 2005

2004
Optimization of a Statically Partitioned Hypermatrix Sparse Cholesky Factorization.
Proceedings of the Applied Parallel Computing, 2004

2003
Building Software Via Shared Knowledge.
Proceedings of the International Conference on Software Engineering Research and Practice, 2003

Automatic Benchmarking and Optimization of Codes: An Experience with Numerical Kernels.
Proceedings of the International Conference on Software Engineering Research and Practice, 2003

Improving Performance of Hypermatrix Cholesky Factorization.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003

Operating System Support for Process Confinement.
Proceedings of the International Conference on Security and Management, 2003

1996
Data Prefetching and Multilevel Blocking for Linear Algebra Operations.
Proceedings of the 10th international conference on Supercomputing, 1996

