José María Llabería

Orcid: 0000-0002-3753-4108

Affiliations:
  • Polytechnic University of Catalonia, Barcelona, Spain


According to our database, José María Llabería authored at least 67 papers between 1983 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
MNEMOSENE++: Scalable Multi-Tile Design with Enhanced Buffering and VGSOT-MRAM based Compute-in-Memory Crossbar Array.
Proceedings of the 30th IEEE International Conference on Electronics, Circuits and Systems, 2023

2022
L2C2: Last-Level Compressed-Cache NVM and a Procedure to Forecast Performance and Lifetime.
CoRR, 2022

Forecasting lifetime and performance of a novel NVM last-level cache with compression.
CoRR, 2022

2021
Near-optimal replacement policies for shared caches in multicore processors.
J. Supercomput., 2021

2019
ReD: A reuse detector for content selection in exclusive shared last-level caches.
J. Parallel Distributed Comput., 2019

2018
Reuse Detector: Improving the Management of STT-RAM SLLCs.
Comput. J., 2018

2013
Exploiting reuse locality on inclusive shared last-level caches.
ACM Trans. Archit. Code Optim., 2013

The reuse cache: downsizing the shared last-level cache.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

2012
ABS: A low-cost adaptive controller for prefetching in a banked shared last-level cache.
ACM Trans. Archit. Code Optim., 2012

Efficient Handling of Lock Hand-off in DSM Multiprocessors with Buffering Coherence Controllers.
J. Comput. Sci. Technol., 2012

2011
Filtering directory lookups in CMPs.
Microprocess. Microsystems, 2011

Filtering Directory Lookups in CMPs with Write-Through Caches.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

2009
Store Buffer Design for Multibanked Data Caches.
IEEE Trans. Computers, 2009

On reducing misspeculations in a pipelined scheduler.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

A Methodology to Characterize Critical Section Bottlenecks in DSM Multiprocessors.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

2007
A comparison of two policies for issuing instructions speculatively.
J. Syst. Archit., 2007

Characterization of Apache web server with Specweb2005.
Proceedings of the 2007 workshop on MEmory performance, 2007

On reducing energy-consumption by late-inserting instructions into the issue queue.
Proceedings of the 2007 International Symposium on Low Power Electronics and Design, 2007

2006
An Enhancement for a Scheduling Logic Pipelined over two Cycles.
Proceedings of the 24th International Conference on Computer Design (ICCD 2006), 2006

Speeding-Up Synchronizations in DSM Multiprocessors.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

2005
Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors.
ACM Trans. Archit. Code Optim., 2005

Store Buffer Design in First-Level Multibanked Data Caches.
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005

2004
Contents Management in First-Level Multibanked Data Caches.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

A Mechanism for Verifying Data Speculation.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

2003
A Cost-Effective Implementation of Multilevel Tiling.
IEEE Trans. Parallel Distributed Syst., 2003

Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors.
Proceedings of the Ninth International Symposium on High-Performance Computer Architecture (HPCA'03), 2003

Counteracting Bank Misprediction in Sliced First-Level Caches.
Proceedings of the Euro-Par 2003. Parallel Processing, 2003

Using Software Logging to Support Multi-Version Buffering in Thread-Level Speculation.
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques (PACT 2003), 27 September, 2003

2002
Register tiling in nonrectangular iteration spaces.
ACM Trans. Program. Lang. Syst., 2002

2001
Recovery Mechanism for Latency Misprediction.
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT 2001), 2001

2000
Modeling load address behaviour through recurrences.
Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software, 2000

On the Performance of Hand vs. Automatically Optimized Numerical Codes.
Proceedings of the Sixth International Symposium on High-Performance Computer Architecture, 2000

Two-Level Address Storage and Address Prediction (Research Note).
Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999
Minimizing Conflicts Between Vector Streams in Interleaved Memory Systems.
IEEE Trans. Computers, 1999

Looking at History to Filter Allocations in Prediction Tables.
Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999

1998
New Access Order to Reduce Inter-Vector-Conflicts.
Proceedings of the Vector and Parallel Processing, 1998

Loop bounds computation for multilevel tiling.
Proceedings of the Sixth Euromicro Workshop on Parallel and Distributed Processing, 1998

A General Algorithm for Tiling the Register Level.
Proceedings of the 12th international conference on Supercomputing, 1998

Performance Evaluation of Tiling for the Register Level.
Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

Split Last-Address Predictor.
Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998

1996
Increasing the Effective Bandwidth of Complex Memory Systems in Multivector Processors.
Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Reducing Inter-Vector-Conflicts in Complex Memory Systems.
Proceedings of the 10th international conference on Supercomputing, 1996

A Unified Transformation Technique for Multilevel Blocking.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

Increasing the Effective Memory Bandwidth in Multivector Processors.
Proceedings of the 22nd EUROMICRO Conference '96, 1996

1995
Loop Transformation Using Nonunimodular Matrices.
IEEE Trans. Parallel Distributed Syst., 1995

Access order to avoid inter-vector-conflicts in complex memory systems.
Proceedings of IPPS '95, 1995

1994
Out-of-order access to vector elements in order to reduce conflicts in vector processors.
Proceedings of the Sixth IEEE Symposium on Parallel and Distributed Processing, 1994

1993
Reducing Branch Delay to Zero in Pipelined Processors.
IEEE Trans. Computers, 1993

1992
A method for implementation of one-dimensional systolic algorithms with data contraflow using pipelined functional units.
J. VLSI Signal Process., 1992

Evaluation of A + B = K Conditions Without Carry Propagation.
IEEE Trans. Computers, 1992

Increasing the Number of Strides for Conflict-Free Vector Access.
Proceedings of the 19th Annual International Symposium on Computer Architecture. Gold Coast, 1992

Scheduling partitions in systolic algorithms.
Proceedings of the Application Specific Array Processors, 1992

1991
Conflict-Free Strides for Vectors in Matched Memories.
Parallel Process. Lett., 1991

Performance evaluation of transputer systems with linear algebra problems.
Microprocessing and Microprogramming, 1991

Balanced Loop Partitioning Using GTS.
Proceedings of the Languages and Compilers for Parallel Computing, 1991

On Automatic Loop Data-Mapping for Distributed-Memory Multiprocessors.
Proceedings of the Distributed Memory Computing, 2nd European Conference, 1991

Interleaving Partitions of Systolic Algorithms for Programming Distributed Memory Multiprocessors.
Proceedings of the Distributed Memory Computing, 2nd European Conference, 1991

Transformation of systolic algorithms for interleaving partitions.
Proceedings of the Application Specific Array Processors, 1991

1990
Implementation of systolic algorithms using pipelined functional units.
Proceedings of the Application Specific Array Processors, 1990

1989
Systematic Hardware Adaptation of Systolic Algorithms.
Proceedings of the 16th Annual International Symposium on Computer Architecture. Jerusalem, 1989

Instruction fetch unit for parallel execution of branch instructions.
Proceedings of the 3rd international conference on Supercomputing, 1989

1988
A mechanism for reducing the cost of branches in RISC architectures.
Microprocess. Microprogramming, 1988

1987
Partitioning: An Essential Step in Mapping Algorithms Into Systolic Array Processors.
Computer, 1987

1986
Computing Size-Independent Matrix Problems on Systolic Array Processors.
Proceedings of the 13th Annual Symposium on Computer Architecture, Tokyo, Japan, June 1986, 1986

Solving Matrix Problems with No Size Restriction on a Systolic Array Processor.
Proceedings of the International Conference on Parallel Processing, 1986

1985
Analysis and Simulation of Multiplexed Single-Bus Networks With and Without Buffering.
Proceedings of the 12th Annual Symposium on Computer Architecture, 1985

1983
A performance evaluation of the multiple bus network for multiprocessor systems.
Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 1983

