Juan J. Navarro

Proceedings of the 27th International Conference on Computer Design, 2009

2008

Hypermatrix oriented supernode amalgamation.

J. Supercomput., 2008

2007

Exploiting computer resources for fast nearest neighbor classification.

Pattern Anal. Appl., 2007

Analysis of a sparse hypermatrix Cholesky with fixed-sized blocking.

Appl. Algebra Eng. Commun. Comput., 2007

2006

Using Non-canonical Array Layouts in Dense Matrix Operations.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Sparse Hypermatrix Cholesky: Customization for High Performance.

Proceedings of the International MultiConference of Engineers and Computer Scientists 2006, 2006

Compiler-Optimized Kernels: An Efficient Alternative to Hand-Coded Inner Kernels.

Proceedings of the Computational Science and Its Applications, 2006

2005

Adapting Linear Algebra Codes to the Memory Hierarchy Using a Hypermatrix Scheme.

Proceedings of the Parallel Processing and Applied Mathematics, 2005

A Study on Load Imbalance in Parallel Hypermatrix Multiplication Using OpenMP.

Proceedings of the Parallel Processing and Applied Mathematics, 2005

Efficient Implementation of Nearest Neighbor Classification.

Proceedings of the Computer Recognition Systems, 2005

2004

Optimization of a Statically Partitioned Hypermatrix Sparse Cholesky Factorization.

Proceedings of the Applied Parallel Computing, 2004

2003

Building Software Via Shared Knowledge.

Proceedings of the International Conference on Software Engineering Research and Practice, 2003

Automatic Benchmarking and Optimization of Codes: An Experience with Numerical Kernels.

Proceedings of the International Conference on Software Engineering Research and Practice, 2003

CC-Radix: a Cache Conscious Sorting Based on Radix sort.

Proceedings of the 11th Euromicro Workshop on Parallel, 2003

Improving Performance of Hypermatrix Cholesky Factorization.

Proceedings of the Euro-Par 2003. Parallel Processing, 2003

2002

The Effect of Local Sort on Parallel Sorting Algorithms.

Proceedings of the 10th Euromicro Workshop on Parallel, 2002

Case Study: Memory Conscious Parallel Sorting.

Proceedings of the Algorithms for Memory Hierarchies, 2002

2001

Fast parallel in-memory 64-bit sorting.

Proceedings of the 15th international conference on Supercomputing, 2001

1999

Sorting on the SGI Origin 2000: Comparing MPI and Shared Memory Implementations.

E. Guinovart

Proceedings of the 19th International Conference of the Chilean Computer Science Society (SCCC '99), 1999

Communication conscious radix sort.

Proceedings of the 13th international conference on Supercomputing, 1999

1998

Dynamic History-length Fitting: A Third Level of Adaptivity for Branch Prediction.

Sanji Sanjeevan

Proceedings of the 25th Annual International Symposium on Computer Architecture, 1998

1997

An Analysis of Superscalar Sorting Algorithms on an R8000 Processor.

Proceedings of the 17th International Conference of the Chilean Computer Science Society (SCCC '97), 1997

Reducing TLB power requirements.

Tomás Lang

Proceedings of the 1997 International Symposium on Low Power Electronics and Design, 1997

Data Caches for Superscalar Processors.

Olivier Temam

Proceedings of the 11th international conference on Supercomputing, 1997

1996

Review of General and Toeplitz Vector Bidiagonal Solvers.

Oriol Roig

Parallel Comput., 1996

The Difference-bit Cache.

Tomás Lang

Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

Block Algorithms for Sparse Matrix Computations on High Performance Workstations.

Elena García-Diego

Proceedings of the 10th international conference on Supercomputing, 1996

Data Prefetching and Multilevel Blocking for Linear Algebra Operations.

Elena García-Diego

Proceedings of the 10th international conference on Supercomputing, 1996

1995

An Analysis of the Parallel Computation of Arbitrarily Branched Cable Neuron Models.

Michael Mascagni

Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

A Generalized Criterion for the Early Termination of R-Cyclic Reduction and Divide and Conquer for Recurrences.

Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

Performance on Distributed Memory Multicomputers of Domain Decomposition Solvers.

Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

1994

MOB forms: a class of multilevel block algorithms for dense linear algebra operations.

Tomás Lang

Proceedings of the 8th international conference on Supercomputing, 1994

A generalized vision of some parallel bidiagonal systems solvers.

Oriol Roig

Proceedings of the 8th international conference on Supercomputing, 1994

1993

Spike Algorithm with savings for strictly diagonal dominant tridiagonal systems.

Microprocess. Microprogramming, 1993

A Parallel Tridiagonal Solver for Vector Uniprocessors.

Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

1992

A method for implementation of one-dimensional systolic algorithms with data contraflow using pipelined functional units.

J. VLSI Signal Process., 1992

Increasing the Number of Strides for Conflict-Free Vector Access.

Proceedings of the 19th Annual International Symposium on Computer Architecture. Gold Coast, 1992

1991

Conflict-Free Strides for Vectors in Matched Memories.

Parallel Process. Lett., 1991

Performance evaluation of transputer systems with linear algebra problems.

Microprocess. Microprogramming, 1991

Interleaving Partitions of Systolic Algorithms for Programming Distributed Memory Multiprocessors.

Proceedings of the Distributed Memory Computing, 2nd European Conference, 1991

Mapping QR decomposition of a banded matrix on a 1D systolic array with data contraflow and pipelined functional units.

Proceedings of the Algorithms and Parallel VLSI Architectures II, 1991

Transformation of systolic algorithms for interleaving partitions.

Proceedings of the Application Specific Array Processors, 1991

1990

Implementation of systolic algorithms using pipelined functional units.

Proceedings of the Application Specific Array Processors, 1990

1989

Systematic Hardware Adaptation of Systolic Algorithms.

Proceedings of the 16th Annual International Symposium on Computer Architecture. Jerusalem, 1989

1987

Partitioning: An Essential Step in Mapping Algorithms Into Systolic Array Processors.

José M. Llabería

Mateo Valero

Computer, 1987

1986

Computing Size-Independent Matrix Problems on Systolic Array Processors.

José M. Llabería

Mateo Valero

Proceedings of the 13th Annual Symposium on Computer Architecture, Tokyo, Japan, June 1986, 1986

Solving Matrix Problems with No Size Restriction on a Systolic Array Processor.
