Joan-Manuel Parcerisa

Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022

Omega-Test: A Predictive Early-Z Culling to Improve the Graphics Pipeline Energy-Efficiency.

David Corbalán-Navarro

IEEE Trans. Vis. Comput. Graph., 2022

Dynamic sampling rate: harnessing frame coherence in graphics applications for energy-efficient GPUs.

J. Supercomput., 2022

Triangle Dropping: An Occluded-geometry Predictor for Energy-efficient Mobile GPUs.

David Corbalán-Navarro

ACM Trans. Archit. Code Optim., 2022

DTM-NUCA: Dynamic Texture Mapping-NUCA for Energy-Efficient Graphics Rendering.

David Corbalán-Navarro

Proceedings of the 30th Euromicro International Conference on Parallel, 2022

DTexL: Decoupled Raster Pipeline for Texture Locality.

Diya Joseph

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

TCOR: A Tile Cache with Optimal Replacement.

Diya Joseph

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2019

Visibility Rendering Order: Improving Energy Efficiency on Mobile GPUs through Frame Coherence.

Pedro Marcuello

IEEE Trans. Parallel Distributed Syst., 2019

Rendering Elimination: Early Discard of Redundant Tiles in the Graphics Pipeline.

Pedro Marcuello

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

Early Visibility Resolution for Removing Ineffectual Computations in the Graphics Pipeline.

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2016

An Energy-Efficient Memory Unit for Clustered Microarchitectures.

Stefan Bieschewski

IEEE Trans. Computers, 2016

2015

Ultra-low power render-based collision detection for CPU/GPU systems.

Pedro Marcuello

Proceedings of the 48th International Symposium on Microarchitecture, 2015

2014

Eliminating redundant fragment shader executions on a mobile GPU via hardware memoization.

Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

2013

TEAPOT: a toolset for evaluating performance, power and image quality on mobile graphics systems.

Proceedings of the International Conference on Supercomputing, 2013

Parallel frame rendering: Trading responsiveness for energy on a mobile GPU.

Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012

Boosting mobile GPU performance with a decoupled access/execute fragment processor.

Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

2010

Leveraging Register Windows to Reduce Physical Registers to the Bare Minimum.

IEEE Trans. Computers, 2010

2007

Improving Branch Prediction and Predicated Execution in Out-of-Order Processors.

Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007

Early Register Release for Out-of-Order Processors with Register Windows.

Proceedings of the 16th International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), 2007

2006

Selective predicate prediction for out-of-order processors.

Proceedings of the 20th Annual International Conference on Supercomputing, 2006

2005

On-Chip Interconnects and Instruction Steering Schemes for Clustered Microarchitectures.

Julio Sahuquillo

José Duato

IEEE Trans. Parallel Distributed Syst., 2005

Memory Bank Predictors.

Stefan Bieschewski

Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

2004

Design of Clustered Superscalar Microarchitectures.

PhD thesis, 2004

2002

Efficient Interconnects for Clustered Microarchitectures.

Julio Sahuquillo

José Duato

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques (PACT 2002), 2002

2001

Improving Latency Tolerance of Multithreading through Decoupling.

IEEE Trans. Computers, 2001

Dynamic Code Partitioning for Clustered Architectures.

Ramon Canal

Int. J. Parallel Program., 2001

2000

Reducing wire delay penalty through value prediction.

Proceedings of the 33rd Annual IEEE/ACM International Symposium on Microarchitecture, 2000

Dynamic Cluster Assignment Mechanisms.

Ramon Canal

Proceedings of the Sixth International Symposium on High-Performance Computer Architecture, 2000

1999

The Synergy of Multithreading and Access/Execute Decoupling.

Proceedings of the Fifth International Symposium on High-Performance Computer Architecture, 1999

A Cost-Effective Clustered Architecture.

Ramon Canal

Proceedings of the 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999

1998

The Latency Hiding Effectiveness of Decoupled Access/Execute Processors.

Proceedings of the 24th EUROMICRO '98 Conference, 1998

1997

Eliminating Cache Conflict Misses through XOR-Based Placement Functions.

Mateo Valero

Nigel P. Topham