Dimitrios S. Nikolopoulos

According to our database, Dimitrios S. Nikolopoulos
  • authored at least 194 papers between 1998 and 2018.
  • has a "Dijkstra number" of three.

Bibliography

2018
Intra-Node Memory Safe GPU Co-Scheduling.
IEEE Trans. Parallel Distrib. Syst., 2018

A taxonomy of task-based parallel programming technologies for high-performance computing.
The Journal of Supercomputing, 2018

DARE.
IJHPCA, 2018

Incremental Training of Deep Convolutional Neural Networks.
CoRR, 2018

2017
FairGV: Fair and Fast GPU Virtualization.
IEEE Trans. Parallel Distrib. Syst., 2017

ALEA: A Fine-Grained Energy Profiling Tool.
TACO, 2017

SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads.
TACO, 2017

Managed acceleration for In-Memory database analytic workloads.
IJPEDS, 2017

On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework.
International Journal of Parallel Programming, 2017

GPU Virtualization and Scheduling Methods: A Comprehensive Survey.
ACM Comput. Surv., 2017

Intra-node Memory Safe GPU Co-Scheduling.
CoRR, 2017

Power Modelling for Heterogeneous Cloud-Edge Data Centers.
CoRR, 2017

Edge-as-a-Service: Towards Distributed Cloud Architectures.
CoRR, 2017

ENORM: A Framework For Edge NOde Resource Management.
CoRR, 2017

Feasibility of Fog Computing.
CoRR, 2017

Dependency-Aware Rollback and Checkpoint-Restart for Distributed Task-Based Runtimes.
CoRR, 2017

Error-Resilient Server Ecosystems for Edge and Cloud Datacenters.
IEEE Computer, 2017

REFINE: realistic fault injection via compiler-based instrumentation for accuracy, portability and speed.
Proceedings of the International Conference for High Performance Computing, 2017

A Taxonomy of Task-Based Technologies for High-Performance Computing.
Proceedings of the Parallel Processing and Applied Mathematics, 2017

Incremental Training of Deep Convolutional Neural Networks.
Proceedings of the International Workshop on Automatic Selection, 2017

Edge-as-a-Service: Towards Distributed Cloud Architectures.
Proceedings of the Parallel Computing is Everywhere, 2017

Power Modelling for Heterogeneous Cloud-Edge Data Centers.
Proceedings of the Parallel Computing is Everywhere, 2017

MiniSymposium on Edge Computing.
Proceedings of the Parallel Computing is Everywhere, 2017

Relaxing DRAM refresh rate through access pattern scheduling: A case study on stencil-based algorithms.
Proceedings of the 23rd IEEE International Symposium on On-Line Testing and Robust System Design, 2017

GraphGrind: addressing load imbalance of graph partitioning.
Proceedings of the International Conference on Supercomputing, 2017

Accelerating Graph Analytics by Utilising the Memory Locality of Graph Partitioning.
Proceedings of the 46th International Conference on Parallel Processing, 2017

MyMinder: A User-centric Decision Making Framework for Intercloud Migration.
Proceedings of the CLOSER 2017, 2017

2016
Exploiting Significance of Computations for Energy-Constrained Approximate Computing.
International Journal of Parallel Programming, 2016

Evaluating fault tolerance on asymmetric multicore systems-on-chip using iso-metrics.
IET Computers & Digital Techniques, 2016

Challenges and Opportunities in Edge Computing.
CoRR, 2016

Energy Optimization of Memory Intensive Parallel workloads.
CoRR, 2016

Myrmics: Scalable, Dependency-aware Task Scheduling on Heterogeneous Manycores.
CoRR, 2016

BDDT-SCC: A Task-parallel Runtime for Non Cache-Coherent Multicores.
CoRR, 2016

TwinCG: Dual Thread Redundancy with Forward Recovery for Conjugate Gradient Methods.
CoRR, 2016

Methods and metrics for fair server assessment under real-time financial workloads.
Concurrency and Computation: Practice and Experience, 2016

Brief Announcement: Energy Optimization of Memory Intensive Parallel Workloads.
Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, 2016

Challenges and Opportunities in Edge Computing.
Proceedings of the 2016 IEEE International Conference on Smart Cloud, 2016

Runtime support for adaptive power capping on heterogeneous SoCs.
Proceedings of the International Conference on Embedded Computer Systems: Architectures, 2016


VarSys Introduction.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Operator and Workflow Optimization for High-Performance Analytics.
Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference, 2016

ECOSCALE: Reconfigurable computing and runtime system for future exascale systems.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

TwinPCG: Dual Thread Redundancy with Forward Recovery for Preconditioned Conjugate Gradient Methods.
Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

HPTA: High-performance text analytics.
Proceedings of the 2016 IEEE International Conference on Big Data, 2016

Big data availability: Selective partial checkpointing for in-memory database queries.
Proceedings of the 2016 IEEE International Conference on Big Data, 2016

A scalable and composable map-reduce system.
Proceedings of the 2016 IEEE International Conference on Big Data, 2016

Low-Cost Hardware Infrastructure for Runtime Thread Level Energy Accounting.
Proceedings of the Architecture of Computing Systems - ARCS 2016, 2016

The VINEYARD Approach: Versatile, Integrated, Accelerator-Based, Heterogeneous Data Centres.
Proceedings of the Applied Reconfigurable Computing - 12th International Symposium, 2016

2015
Realizing Accelerated Cost-Effective Distributed RAID.
Proceedings of the Handbook on Data Centers, 2015

Iso-Quality of Service: Fairly Ranking Servers for Real-Time Data Analytics.
Parallel Processing Letters, 2015

Scalable black-box prediction models for multi-dimensional adaptation on NUMA multi-cores.
IJPEDS, 2015

On the potential of significance-driven execution for energy-aware HPC.
Computer Science - R&D, 2015

Guest Editorial.
IET Computers & Digital Techniques, 2015

ALEA: Fine-grain Energy Profiling with Basic Block Sampling.
CoRR, 2015

Iso-Quality of Service: Fairly Ranking Servers for Real-Time Data Analytics.
CoRR, 2015

Methods and Metrics for Fair Server Assessment under Real-Time Financial Workloads.
CoRR, 2015

Evaluating Asymmetric Multicore Systems-on-Chip using Iso-Metrics.
CoRR, 2015

On the Energy-Efficiency of Byte-Addressable Non-Volatile Memory.
Computer Architecture Letters, 2015

Towards automated data-driven model creation for cloud computing simulation.
Proceedings of the 8th International Conference on Simulation Tools and Techniques, 2015

A programming model and runtime system for significance-aware energy-efficient computing.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Mini-Symposium on Energy and Resilience in Parallel Programming.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Performance and Fault Tolerance of Preconditioned Iterative Solvers on Low-Power ARM Architectures.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

HpMC: An Energy-aware Management System of Multi-level Memory Architectures.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

Application-Level Energy Awareness for OpenMP.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Power Capping: What Works, What Does Not.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

Energy-Efficient In-Memory Data Stores on Hybrid Memory Hierarchies.
Proceedings of the 11th International Workshop on Data Management on New Hardware, 2015

LS-ADT: Lightweight and Scalable Anomaly Detection for Cloud Datacentres.
Proceedings of the Cloud Computing and Services Science - 5th International Conference, 2015

A Lightweight Tool for Anomaly Detection in Cloud Data Centres.
Proceedings of the CLOSER 2015, 2015

A significance-driven programming framework for energy-constrained approximate computing.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

Software-managed energy-efficient hybrid DRAM/NVM main memory.
Proceedings of the 12th ACM International Conference on Computing Frontiers, 2015

ALEA: Fine-Grain Energy Profiling with Basic Block Sampling.
Proceedings of the 2015 International Conference on Parallel Architecture and Compilation, 2015

Energy-Efficient Hybrid DRAM/NVM Main Memory.
Proceedings of the 2015 International Conference on Parallel Architecture and Compilation, 2015

2014
Hybrid address spaces: A methodology for implementing scalable high-level programming models on non-coherent many-core architectures.
Journal of Systems and Software, 2014

FPGA prototyping of emerging manycore architectures for parallel programming research using Formic boards.
Journal of Systems Architecture - Embedded Systems Design, 2014

Distributed region-based memory allocation and synchronization.
IJHPCA, 2014

A Programming Model and Runtime System for Significance-Aware Energy-Efficient Computing.
CoRR, 2014

Energy Efficiency through Significance-Based Computing.
IEEE Computer, 2014

On the viability of microservers for financial analytics.
Proceedings of the 7th Workshop on High Performance Computational Finance, 2014

Fast Dynamic Binary Rewriting for flexible thread migration on shared-ISA heterogeneous MPSoCs.
Proceedings of the XIVth International Conference on Embedded Computer Systems: Architectures, 2014

Overcoming the Scalability Challenges of Epidemic Simulations on Blue Waters.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Power-capped DVFS and thread allocation with ANN models on modern NUMA systems.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Power modelling and capping for heterogeneous ARM/FPGA SoCs.
Proceedings of the 2014 International Conference on Field-Programmable Technology, 2014

The CACTOS Vision of Context-Aware Cloud Topology Optimization and Simulation.
Proceedings of the IEEE 6th International Conference on Cloud Computing Technology and Science, 2014

2013
Strategies for Energy-Efficient Resource Management of Hybrid Programming Models.
IEEE Trans. Parallel Distrib. Syst., 2013

Analysis of dependence tracking algorithms for task dataflow execution.
TACO, 2013

Deterministic scale-free pipeline parallelism with hyperqueues.
Proceedings of the International Conference for High Performance Computing, 2013

DRASync: distributed region-based memory allocation and synchronization.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Prefetching and cache management using task lifetimes.
Proceedings of the International Conference on Supercomputing, 2013

Topic 1: Support Tools and Environments - (Introduction).
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

Fast dynamic binary rewriting to support thread migration in shared-ISA asymmetric multicores.
Proceedings of the First International Workshop on Code Optimisation for Multi and Many Cores, 2013

Inference and Declaration of Independence in Task-Parallel Programs.
Proceedings of the Advanced Parallel Processing Technologies, 2013

BDDT: Block-Level Dynamic Dependence Analysis for Task-Based Parallelism.
Proceedings of the Advanced Parallel Processing Technologies, 2013

2012
Critical path-based thread placement for NUMA systems.
SIGMETRICS Performance Evaluation Review, 2012

EPC: a power instrumentation controller for embedded applications.
SIGBED Review, 2012

Cache-Integrated Network Interfaces: Flexible On-Chip Communication and Synchronization for Large-Scale CMPs.
International Journal of Parallel Programming, 2012

BTL: A Framework for Measuring and Modeling Energy in Memory Hierarchies.
Proceedings of the IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012

BDDT: block-level dynamic dependence analysis for deterministic task-based parallelism.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

On the Use of GPUs in Realizing Cost-Effective Distributed RAID.
Proceedings of the 20th IEEE International Symposium on Modeling, 2012

The myrmics memory allocator: hierarchical, message-passing allocation for global address spaces.
Proceedings of the International Symposium on Memory Management, 2012

Model-based, memory-centric performance and power optimization on NUMA multiprocessors.
Proceedings of the 2012 IEEE International Symposium on Workload Characterization, 2012

Dynamic binary rewriting and migration for shared-ISA asymmetric, multicore processors: summary.
Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, 2012

Formic: Cost-efficient and Scalable Prototyping of Manycore Architectures.
Proceedings of the 2012 IEEE 20th Annual International Symposium on Field-Programmable Custom Computing Machines, 2012

Topic 16: GPU and Accelerators Computing.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

Inference and declaration of independence: impact on deterministic task parallelism.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2012

2011
A capabilities-aware framework for using computational accelerators in data-intensive computing.
J. Parallel Distrib. Comput., 2011

Task-based parallel H.264 video encoding for explicit communication architectures.
Proceedings of the 2011 International Conference on Embedded Computer Systems: Architectures, 2011

A programming model for deterministic task parallelism.
Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '11, 2011

Scalable Runtime Support for Data Intensive Applications on the Single Chip Cloud Computer.
Proceedings of the 3rd Many-core Applications Research Community (MARC) Symposium, 2011

Parallel Programming of General-Purpose Programs Using Task-Based Programming Models.
Proceedings of the 3rd USENIX Workshop on Hot Topics in Parallelism, 2011

Fine-grain OpenMP runtime support with explicit communication hardware primitives.
Proceedings of the Design, Automation and Test in Europe, 2011

Scalable memory registration for high performance networks using helper threads.
Proceedings of the 8th Conference on Computing Frontiers, 2011

A Unified Scheduler for Recursive and Task Dataflow Parallelism.
Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques, 2011

2010
Explicit Communication and Synchronization in SARC.
IEEE Micro, 2010

Strider: Runtime Support for Optimizing Strided Data Accesses on Multi-Cores with Explicitly Managed Memories.
Proceedings of the Conference on High Performance Computing Networking, 2010

Hybrid MPI/OpenMP power-aware computing.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Power-aware MPI task aggregation prediction for high-end computing systems.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Rearchitecting MapReduce for Heterogeneous Multicore Processors with Explicitly Managed Memories.
Proceedings of the 39th International Conference on Parallel Processing, 2010

Tagged Procedure Calls (TPC): Efficient Runtime Support for Task-Based Parallelism on the Cell Processor.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Comparing Scalability Prediction Strategies on an SMP of CMPs.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

Evaluation of streaming aggregation on parallel hardware architectures.
Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems, 2010

On-chip communication and synchronization mechanisms with cache-integrated network interfaces.
Proceedings of the 7th Conference on Computing Frontiers, 2010

Designing Accelerator-Based Distributed Systems for High Performance.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009
Supporting MapReduce on large-scale asymmetric multi-core clusters.
Operating Systems Review, 2009

Algorithm, software, and hardware optimizations for Delaunay mesh generation on simultaneous multithreaded architectures.
J. Parallel Distrib. Comput., 2009

A multigrain Delaunay mesh generation method for multicore SMT-based architectures.
J. Parallel Distrib. Comput., 2009

Green Building Blocks - Software Stacks for Energy-Efficient Clusters and Data Centres.
ERCIM News, 2009

Programming Multiprocessors with Explicitly Managed Memory Hierarchies.
IEEE Computer, 2009

A comparison of programming models for multiprocessors with explicitly managed memory hierarchies.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

CellMR: A framework for supporting mapreduce on asymmetric cell-based clusters.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Scheduling dynamic parallelism on accelerators.
Proceedings of the 6th Conference on Computing Frontiers, 2009

2008
Prediction-Based Power-Performance Adaptation of Multithreaded Scientific Codes.
IEEE Trans. Parallel Distrib. Syst., 2008

Set-Top Supercomputing: Scalable Software for Scientific Simulations on Game Consoles.
ERCIM News, 2008

VT-ASOS: Holistic system software customization for many cores.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Modeling Multigrain Parallelism on Heterogeneous Multi-core Processors: A Case Study of the Cell BE.
Proceedings of the High Performance Embedded Architectures and Compilers, 2008

DMA-based prefetching for i/o-intensive workloads on the cell architecture.
Proceedings of the 5th Conference on Computing Frontiers, 2008

Cell-SWat: modeling and scheduling wavefront computations on the cell broadband engine.
Proceedings of the 5th Conference on Computing Frontiers, 2008

Scheduling Asymmetric Parallelism on a PlayStation3 Cluster.
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008

Prediction models for multi-dimensional power-performance optimization on many cores.
Proceedings of the 17th International Conference on Parallel Architecture and Compilation Techniques, 2008

2007
Exploring New Search Algorithms and Hardware for Phylogenetics: RAxML Meets the IBM Cell.
VLSI Signal Processing, 2007

Runtime scheduling of dynamic parallelism on accelerator-based multi-core systems.
Parallel Computing, 2007

Runtime and Programming Support for Memory Adaptation in Scientific Applications via Local Disk and Remote Memory.
J. Grid Comput., 2007

Dynamic multigrain parallelization on the cell broadband engine.
Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2007

RAxML-Cell: Parallel Phylogenetic Tree Inference on the Cell Broadband Engine.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Identifying energy-efficient concurrency levels using machine learning.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

A comparison of online and offline strategies for program adaptation.
Proceedings of the 45th Annual Southeast Regional Conference, 2007

2006
PACMAN: A PerformAnce Counters MANager for Intel Hyperthreaded Processors.
Proceedings of the Third International Conference on the Quantitative Evaluation of Systems (QEST 2006), 2006

Scalable locality-conscious multithreaded memory allocation.
Proceedings of the 5th International Symposium on Memory Management, 2006

MESA: reducing cache conflicts by integrating static and run-time methods.
Proceedings of the 2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006

Online strategies for high-performance power-aware thread execution on emerging multiprocessors.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Online power-performance adaptation of multithreaded programs using hardware event-based prediction.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Runtime Support for Memory Adaptation in Scientific Applications via Local Disk and Remote Memory.
Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, 2006

2005
An Evaluation of OpenMP on Current and Emerging Multithreaded/Multicore Processors.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005

Scheduling Algorithms for Effective Thread Pairing on Hybrid Multiprocessors.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Multigrain parallel Delaunay Mesh generation: challenges and opportunities for multithreaded architectures.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

Factory: An Object-Oriented Parallel Programming Substrate for Deep Multiprocessors.
Proceedings of the High Performance Computing and Communications, 2005

smt-SPRINTS: Software Precomputation with Intelligent Streaming for Resource-Constrained SMTs.
Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

2004
Dynamic tiling for effective use of shared caches on multithreaded processors.
IJHPCN, 2004

Adapting to Memory Pressure from within Scientific Applications on Multiprogrammed COWs.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Realistic Workload Scheduling Policies for Taming the Memory Bandwidth Bottleneck of SMPs.
Proceedings of the High Performance Computing, 2004

2003
Scaling non-regular shared-memory codes by reusing custom loop schedules.
Scientific Programming, 2003

Quantifying contention and balancing memory load on hardware DSM multiprocessors.
J. Parallel Distrib. Comput., 2003

Adaptive scheduling under memory constraints on non-dedicated computational farms.
Future Generation Comp. Syst., 2003

Code and Data Transformations for Improving Shared Cache Performance on SMT Processors.
Proceedings of the High Performance Computing, 5th International Symposium, 2003

Malleable Memory Mapping: User-Level Control of Memory Bounds for Effective Program Adaptation.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003

Scheduling Algorithms with Bus Bandwidth Considerations for SMPs.
Proceedings of the 32nd International Conference on Parallel Processing (ICPP 2003), 2003

2002
Scheduler-Activated Dynamic Page Migration for Multiprogrammed DSM Multiprocessors.
J. Parallel Distrib. Comput., 2002

Runtime vs. Manual Data Distribution for Architecture-Agnostic Shared-Memory Programming Models.
International Journal of Parallel Programming, 2002

Adaptive Scheduling under Memory Pressure on Multiprogrammed SMPs.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Quantifying and Resolving Remote Memory Access Contention on Hardware DSM Multiprocessors.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Effective Cross-Platform, Multilevel Parallelism via Dynamic Adaptive Execution.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Adaptive Scheduling under Memory Pressure on Multiprogrammed Cluster.
Proceedings of the 2nd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2002), 2002

2001
Exploiting memory affinity in OpenMP through schedule reuse.
SIGARCH Computer Architecture News, 2001

The Architectural and Operating System Implications on the Performance of Synchronization on ccNUMA Multiprocessors.
International Journal of Parallel Programming, 2001

A Study of Implicit Data Distribution Methods for OpenMP Using the SPEC Benchmarks.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

Scaling irregular parallel codes with minimal programming effort.
Proceedings of the 2001 ACM/IEEE conference on Supercomputing, 2001

The trade-off between implicit and explicit data distribution in shared-memory programming paradigms.
Proceedings of the 15th international conference on Supercomputing, 2001

Informing Algorithms for Efficient Scheduling of Synchronizing Threads on Multiprogrammed SMPs.
Proceedings of the 2001 International Conference on Parallel Processing, 2001

Improving Java Server Performance with Interruptlets.
Proceedings of the Computational Science - ICCS 2001, 2001

A Transparent Operating System Infrastructure for Embedding Adaptability to Thread-Based Programming Models.
Proceedings of the Euro-Par 2001: Parallel Processing, 2001

2000
Is Data Distribution Necessary in OpenMP?
Proceedings of Supercomputing 2000, 2000

Efficient Dynamic Parallelism with OpenMP on Linux SMPs.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

UPMLIB: A Runtime System for Tuning the Memory Performance of OpenMP Programs on Scalable Shared-Memory Multiprocessors.
Proceedings of the Languages, 2000

A Tool to Schedule Parallel Applications on Multiprocessors: The NANOS CPU MANAGER.
Proceedings of the Job Scheduling Strategies for Parallel Processing, IPDPS 2000 Workshop, 2000

Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration.
Proceedings of the High Performance Computing, Third International Symposium, 2000

Fast Synchronization on Scalable Cache-Coherent Multiprocessors using Hybrid Primitives.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

A case for user-level dynamic page migration.
Proceedings of the 14th international conference on Supercomputing, 2000

User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

1999
Fine-Grain and Multiprogramming-Conscious Nanothreading with the Solaris Operating System.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

Achieving multiprogramming scalability of parallel programs on Intel SMP platforms: Nanothreading in the Linux kernel.
Proceedings of the Parallel Computing: Fundamentals & Applications, 1999

A quantitative architectural evaluation of synchronization algorithms and disciplines on ccNUMA systems: the case of the SGI Origin2000.
Proceedings of the 13th international conference on Supercomputing, 1999

1998
Efficient Runtime Thread Management for the Nano-Threads Programming Model.
IPPS/SPDP Workshops, 1998

Kernel-level Scheduling for the Nano-threads Programming Model.
Proceedings of the 12th international conference on Supercomputing, 1998

Enhancing the Performance of Autoscheduling in Distributed Shared Memory Multiprocessors.
Proceedings of the Euro-Par '98 Parallel Processing, 1998

