José-Ángel Gregorio

Orcid: 0000-0003-2214-303X

Affiliations:
  • Universidad de Cantabria


According to our database, José-Ángel Gregorio authored at least 67 papers between 1992 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Performance Characterization of Popular DNN Models on Out-of-Order CPUs.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023

2022
Top-Down Performance Profiling on NVIDIA's GPUs.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

2021
Fast, Accurate Processor Evaluation Through Heterogeneous, Sample-Based Benchmarking.
IEEE Trans. Parallel Distributed Syst., 2021

2020
Rainbow: A composable coherence protocol for multi-chip servers.
Concurr. Comput. Pract. Exp., 2020

SPECcast: A Methodology for Fast Performance Evaluation with SPEC CPU 2017 Multiprogrammed Workloads.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

2019
CLAASIC: A cortex-inspired hardware accelerator.
J. Parallel Distributed Comput., 2019

Accuracy vs. Computational Cost Tradeoff in Distributed Computer System Simulation.
CoRR, 2019

Architecting Racetrack Memory Preshift through Pattern-Based Prediction Mechanisms.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

2018
Memory Hierarchy Characterization of NoSQL Applications through Full-System Simulation.
IEEE Trans. Parallel Distributed Syst., 2018

Mosaic: A Scalable Coherence Protocol.
Int. J. Parallel Program., 2018

2017
An adaptive cache coherence protocol: Trading storage for traffic.
J. Parallel Distributed Comput., 2017

2016
AC-WAR: Architecting the Cache Hierarchy to Improve the Lifetime of a Non-Volatile Endurance-Limited Main Memory.
IEEE Trans. Parallel Distributed Syst., 2016

CLAASIC: a Cortex-Inspired Hardware Accelerator.
CoRR, 2016

2015
Improving last level shared cache performance through mobile insertion policies (MIP).
Parallel Comput., 2015

Flask coherence: A morphable hybrid coherence protocol to balance energy, performance and scalability.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

2013
LIGERO: A light but efficient router conceived for cache-coherent chip multiprocessors.
ACM Trans. Archit. Code Optim., 2013

CMP off-chip bandwidth scheduling guided by instruction criticality.
Proceedings of the International Conference on Supercomputing, 2013

Interaction of NoC Design and Coherence Protocol in 3D-Stacked CMPs.
Proceedings of the 2013 Euromicro Conference on Digital System Design, 2013

The case for a scalable coherence protocol for complex on-chip cache hierarchies in many-core systems.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Adaptive-Tree Multicast: Efficient Multidestination Support for CMP Communication Substrate.
IEEE Trans. Parallel Distributed Syst., 2012

Balancing Performance and Cost in CMP Interconnection Networks.
IEEE Trans. Parallel Distributed Syst., 2012

The Necessity for Hardware QoS Support for Server Consolidation and Cloud Computing.
CoRR, 2012

LOCKE Detailed Specification Tables.
CoRR, 2012

TOPAZ: An Open-Source Interconnection Network Simulator for Chip Multiprocessors and Supercomputers.
Proceedings of the 2012 Sixth IEEE/ACM International Symposium on Networks-on-Chip (NoCS), 2012

BIXBAR: A low cost solution to support dynamic link reconfiguration in networks on chip.
Proceedings of the 30th International IEEE Conference on Computer Design, 2012

Improving coherence protocol reactiveness by trading bandwidth for latency.
Proceedings of the Computing Frontiers Conference, CF'12, 2012

2011
Multilevel Cache Modeling for Chip-Multiprocessor Systems.
IEEE Comput. Archit. Lett., 2011

2010
ESP-NUCA: A low-cost adaptive Non-Uniform Cache Architecture.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

2009
MRR: Enabling fully adaptive multicast routing for CMP interconnection networks.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

2008
Immunet: Dependable Routing for Interconnection Networks with Arbitrary Topology.
IEEE Trans. Computers, 2008

SP-NUCA: a cost effective dynamic non-uniform cache architecture.
SIGARCH Comput. Archit. News, 2008

Improving the performance of large interconnection networks using congestion-control mechanisms.
Perform. Evaluation, 2008

Reducing the Interconnection Network Cost of Chip Multiprocessors.
Proceedings of the Second International Symposium on Networks-on-Chips, 2008

2007
Immucube: Scalable Fault-Tolerant Routing for k-ary n-cube Networks.
IEEE Trans. Parallel Distributed Syst., 2007

Rotary router: an efficient architecture for CMP interconnection networks.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

2006
High-performance adaptive routing for networks with arbitrary topology.
J. Syst. Archit., 2006

Effects of Injection Pressure on Network Throughput.
Proceedings of the 14th Euromicro International Conference on Parallel, 2006

Topic 13: Routing and Communication in Interconnection Networks.
Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

2005
Evaluation of Interconnection Network Performance Under Heavy Non-uniform Loads.
Proceedings of the Distributed and Parallel Computing, 2005

2004
Evaluating kilo-instruction multiprocessors.
Proceedings of the 3rd Workshop on Memory Performance Issues, 2004

Parallelization of a Neural Net Training Program in a Grid Environment.
Proceedings of the 12th Euromicro Workshop on Parallel, 2004

Simulation Methodology for Decision Support Workloads.
Proceedings of the 12th Euromicro Workshop on Parallel, 2004

Immunet: A Cheap and Robust Fault-Tolerant Packet Routing Mechanism.
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004

Load Unbalance in k-ary n-Cube Networks.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004

A first glance at Kilo-instruction based multiprocessors.
Proceedings of the First Conference on Computing Frontiers, 2004

2003
On the Design of a High-Performance Adaptive Router for CC-NUMA Multiprocessors.
IEEE Trans. Parallel Distributed Syst., 2003

Chordal Topologies for Interconnection Networks.
Proceedings of the High Performance Computing, 5th International Symposium, 2003

A Low Cost Fault Tolerant Packet Routing for Parallel Computers.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

2002
Modeling of interconnection subsystems for massively parallel computers.
Perform. Evaluation, 2002

SICOSYS: An Integrated Framework for studying Interconnection Network Performance in Multiprocessor Systems.
Proceedings of the 10th Euromicro Workshop on Parallel, 2002

2001
The Adaptive Bubble Router.
J. Parallel Distributed Comput., 2001

A new routing mechanism for networks with irregular topology.
Proceedings of the 2001 ACM/IEEE conference on Supercomputing, 2001

A New Communication Mechanism for Cluster Computing.
Proceedings of the Euro-Par 2001: Parallel Processing, 2001

2000
A case study of trace-driven simulation for analyzing interconnection networks: cc-NUMAs with ILP processors.
Proceedings of the Eighth Euromicro Workshop on Parallel and Distributed Processing, 2000

Pipelining Router Design Improves Parallel System Performance.
Proceedings of the 5th International Symposium on Parallel Architectures, 2000

Improving parallel system performance by changing the arrangement of the network links.
Proceedings of the 14th international conference on Supercomputing, 2000

1999
Performance evaluation of the bubble algorithm: benefits for k-ary n-cubes.
Proceedings of the Seventh Euromicro Workshop on Parallel and Distributed Processing. PDP'99, 1999

Low-level router design and its impact on supercomputer system performance.
Proceedings of the 13th international conference on Supercomputing, 1999

Adaptive Bubble Router: A Design to Improve Performance in Torus Networks.
Proceedings of the International Conference on Parallel Processing 1999, 1999

Impact of the Head-of-Line Blocking on Parallel Computer Networks: Hardware to Applications.
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999

1998
Ghost packets: a deadlock-free solution for k-ary n-cube networks.
Proceedings of the Sixth Euromicro Workshop on Parallel and Distributed Processing, 1998

1997
A flow control mechanism to avoid message deadlock in k-ary n-cube networks.
Proceedings of the Fourth International Conference on High-Performance Computing, 1997

1996
A two-level programming strategy for distributed systems.
Microprocess. Microprogramming, 1996

Assessing the Performance of the New IBM SP2 Communication Subsystem.
IEEE Parallel Distributed Technol. Syst. Appl., 1996

1995
Petri Net Modeling of Interconnection Networks for Massively Parallel Architectures.
Proceedings of the 9th international conference on Supercomputing, 1995

1994
Shared Memory Multimicroprocessor Operating System with an Extended Petri Net Model.
IEEE Trans. Parallel Distributed Syst., 1994

1992
Performance Evaluation of Parallel Systems by Using Unbounded Generalized Stochastic Petri Nets.
IEEE Trans. Software Eng., 1992

