Salvador Petit

According to our database, Salvador Petit authored at least 82 papers between 2000 and 2018.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2018
Designing lab sessions focusing on real processors for computer architecture courses: A practical perspective.
J. Parallel Distrib. Comput., 2018

Accurately modeling the on-chip and off-chip GPU memory subsystem.
Future Generation Comp. Syst., 2018

Improving System Turnaround Time with Intel CAT by Identifying LLC Critical Applications.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018

Improving GPU Cache Hierarchy Performance with a Fetch and Replacement Cache.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018

2017
On Microarchitectural Mechanisms for Cache Wearout Reduction.
IEEE Trans. VLSI Syst., 2017

A Hardware Approach to Fairly Balance the Inter-Thread Interference in Shared Caches.
IEEE Trans. Parallel Distrib. Syst., 2017

Improving IBM POWER8 Performance Through Symbiotic Job Scheduling.
IEEE Trans. Parallel Distrib. Syst., 2017

Perf&Fair: A Progress-Aware Scheduler to Enhance Performance and Fairness in SMT Multicores.
IEEE Trans. Computers, 2017

A research-oriented course on Advanced Multicore Architecture: Contents and active learning methodologies.
J. Parallel Distrib. Comput., 2017

Exploiting Data Compression to Mitigate Aging in GPU Register Files.
Proceedings of the 29th International Symposium on Computer Architecture and High Performance Computing, 2017

Modeling a Photonic Network for Exascale Computing.
Proceedings of the 2017 International Conference on High Performance Computing & Simulation, 2017

Application Clustering Policies to Address System Fairness with Intel's Cache Allocation Technology.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Bandwidth-Aware On-Line Scheduling in SMT Multicores.
IEEE Trans. Computers, 2016

A dynamic execution time estimation model to save energy in heterogeneous multicores running periodic tasks.
Future Generation Comp. Syst., 2016

Enhancing the L1 Data Cache Design to Mitigate HCI.
Computer Architecture Letters, 2016

Impact of Memory-Level Parallelism on the Performance of GPU Coherence Protocols.
Proceedings of the 24th Euromicro International Conference on Parallel, 2016

Accurately modeling a photonic NoC in a detailed CMP simulation framework.
Proceedings of the International Conference on High Performance Computing & Simulation, 2016

Symbiotic job scheduling on the IBM POWER8.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Student Research Poster: A Low Complexity Cache Sharing Mechanism to Address System Fairness.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Design of Hybrid Second-Level Caches.
IEEE Trans. Computers, 2015

A reuse-based refresh policy for energy-aware eDRAM caches.
Microprocessors and Microsystems - Embedded Hardware Design, 2015

A Research-Oriented Course on Advanced Multicore Architecture.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Addressing Fairness in SMT Multicores with a Progress-Aware Scheduler.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Current challenges in simulations of HPC systems.
Proceedings of the 2015 International Conference on High Performance Computing & Simulation, 2015

Accurately modeling the GPU memory subsystem.
Proceedings of the 2015 International Conference on High Performance Computing & Simulation, 2015

2014
Efficient Register Renaming and Recovery for High-Performance Processors.
IEEE Trans. VLSI Syst., 2014

Cache-Hierarchy Contention-Aware Scheduling in CMPs.
IEEE Trans. Parallel Distrib. Syst., 2014

Addressing bandwidth contention in SMT multicores through scheduling.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Dynamic WCET Estimation for Real-Time Multicore Embedded Systems Supporting DVFS.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Analyzing the Optimal Voltage/Frequency Pair in Fault-Tolerant Caches.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

2013
Hardware-Based Generation of Independent Subtraces of Instructions in Clustered Processors.
IEEE Trans. Computers, 2013

Power-aware scheduling with effective task migration for real-time multicore embedded systems.
Concurrency and Computation: Practice and Experience, 2013

Exploiting reuse information to reduce refresh energy in on-chip eDRAM caches.
Proceedings of the International Conference on Supercomputing, 2013

Using Huge Pages and Performance Counters to Determine the LLC Architecture.
Proceedings of the International Conference on Computational Science, 2013

Combining RAM technologies for hard-error recovery in L1 data caches working at very-low power modes.
Proceedings of the Design, Automation and Test in Europe, 2013

L1-bandwidth aware thread allocation in multicore SMT processors.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

2012
Impact on Performance and Energy of the Retention Time and Processor Frequency in L1 Macrocell-Based Data Caches.
IEEE Trans. VLSI Syst., 2012

A Sequentially Consistent Multiprocessor Architecture for Out-of-Order Retirement of Instructions.
IEEE Trans. Parallel Distrib. Syst., 2012

A cost-effective heuristic to schedule local and remote memory in cluster computers.
The Journal of Supercomputing, 2012

Design, Performance, and Energy Consumption of eDRAM/SRAM Macrocells for L1 Data Caches.
IEEE Trans. Computers, 2012

Combining recency of information with selective random and a victim cache in last-level caches.
TACO, 2012

Efficiently Handling Memory Accesses to Improve QoS in Multicore Systems under Real-Time Constraints.
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

Understanding Cache Hierarchy Contention in CMPs to Improve Job Scheduling.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Page-Based Memory Allocation Policies of Local and Remote Memory in Cluster Computers.
Proceedings of the 18th IEEE International Conference on Parallel and Distributed Systems, 2012

Analyzing the optimal ratio of SRAM banks in hybrid caches.
Proceedings of the 30th International IEEE Conference on Computer Design, 2012

OMHI 2012: First International Workshop on On-chip Memory Hierarchies and Interconnects: Organization, Management and Implementation.
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012

2011
A New Energy-Aware Dynamic Task Set Partitioning Algorithm for Soft and Hard Embedded Real-Time Systems.
Comput. J., 2011

MRU-Tour-based Replacement Algorithms for Last-Level Caches.
Proceedings of the 23rd International Symposium on Computer Architecture and High Performance Computing, 2011

A Cluster Computer Performance Predictor for Memory Scheduling.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2011

A Dynamic Power-Aware Partitioner with Task Migration for Multicore Embedded Systems.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Improving Last-Level Cache Performance by Exploiting the Concept of MRU-Tour.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Dynamic task set partitioning based on balancing resource requirements and utilization to reduce power consumption.
Proceedings of the 2010 ACM Symposium on Applied Computing (SAC), 2010

Balancing Task Resource Requirements in Embedded Multithreaded Multicore Processors to Reduce Power Consumption.
Proceedings of the 18th Euromicro Conference on Parallel, 2010

Out-of-order retirement of instructions in sequentially consistent multiprocessors.
Proceedings of the 28th International Conference on Computer Design, 2010

Extending a Multicore Multithread Simulator to Model Power-Aware Hard Real-Time Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2010

A Scheduling Heuristic to Handle Local and Remote Memory in Cluster Computers.
Proceedings of the 12th IEEE International Conference on High Performance Computing and Communications, 2010

Exploiting subtrace-level parallelism in clustered processors.
Proceedings of the 19th International Conference on Parallel Architecture and Compilation Techniques, 2010

2009
A Complexity-Effective Out-of-Order Retirement Microarchitecture.
IEEE Trans. Computers, 2009

Power Reduction In Advanced Embedded IPC Processors.
Intelligent Automation & Soft Computing, 2009

An hybrid eDRAM/SRAM macrocell to implement first-level data caches.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Dynamic task set partitioning based on balancing memory requirements to reduce power consumption.
Proceedings of the 23rd international conference on Supercomputing, 2009

A power-aware hybrid RAM-CAM renaming mechanism for fast recovery.
Proceedings of the 27th International Conference on Computer Design, 2009

Paired ROBs: A Cost-Effective Reorder Buffer Sharing Strategy for SMT Processors.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

An Efficient Low-Complexity Alternative to the ROB for Out-of-Order Retirement of Instructions.
Proceedings of the 12th Euromicro Conference on Digital System Design, 2009

2008
The impact of out-of-order commit in coarse-grain, fine-grain and simultaneous multithreaded architectures.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

A simple power-aware scheduling for multicore systems when running real-time applications.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Reducing the Number of Bits in the BTB to Attack the Branch Predictor Hot-Spot.
Proceedings of the Euro-Par 2008, 2008

2007
Spim-Cache: A Pedagogical Tool for Teaching Cache Memories Through Code-Based Exercises.
IEEE Trans. Education, 2007

Multi2Sim: A Simulation Framework to Evaluate Multicore-Multithreaded Processors.
Proceedings of the 19th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2007), 2007

Leakage Current Reduction in Data Caches on Embedded Systems.
Proceedings of the 2007 International Conference on Intelligent Pervasive Computing, 2007

VB-MT: Design Issues and Performance of the Validation Buffer Microarchitecture for Multithreaded Processors.
Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), 2007

2006
Addressing a workload characterization study to the design of consistency protocols.
The Journal of Supercomputing, 2006

RACFP: a training tool to work with floating-point representation, algorithms, and circuits in undergraduate courses.
IEEE Trans. Education, 2006

An execution-driven simulation tool for teaching cache memories in introductory computer organization courses.
Proceedings of the 2006 Workshop on Computer Architecture Education, 2006

Applying the zeros switch-off technique to reduce static energy in data caches.
Proceedings of the 18th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2006), 2006

2005
Exploring the performance of split data cache schemes on superscalar processors and symmetric multiprocessors.
Journal of Systems Architecture, 2005

A Comparison Study of the HLRC-DU Protocol versus a HLRC Hardware Assisted Protocol.
Proceedings of the 13th Euromicro Workshop on Parallel, 2005

Exploiting temporal locality in drowsy cache policies.
Proceedings of the Second Conference on Computing Frontiers, 2005

2004
Characterizing the Dynamic Behavior of Workload Execution in SVM systems.
Proceedings of the 16th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), 2004

2002
Characterizing Parallel Workloads to Reduce Multiple Writer Overhead in Shared Virtual Memory Systems.
Proceedings of the 10th Euromicro Workshop on Parallel, 2002

2001
About the sensitivity of the HLRC-DU protocol on diff and page sizes.
Proceedings of the 2001 IEEE International Symposium on Performance Analysis of Systems and Software, 2001

2000
LIDE: a simulation environment for shared virtual memory systems.
SIGARCH Computer Architecture News, 2000

