Jesper Larsson Träff
According to our database, Jesper Larsson Träff authored at least 138 papers between 1989 and 2018.
Bibliography
2018
Practical, distributed, low overhead algorithms for irregular gather and scatter collectives.
Parallel Computing, 2018
Supporting concurrent memory access in TCF processor architectures.
Microprocessors and Microsystems - Embedded Hardware Design, 2018
Brief Announcement: Stamp-it, a more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model.
Proceedings of the 30th Symposium on Parallelism in Algorithms and Architectures, 2018
Full-Duplex Inter-Group All-to-All Broadcast Algorithms with Optimal Bandwidth.
Proceedings of the 25th European MPI Users' Group Meeting, 2018
Stamp-it, amortized constant-time memory reclamation in comparison to five other schemes.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Implementation of Multioperations in Thick Control Flow Processors.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018
2017
On expected and observed communication performance with MPI derived datatypes.
Parallel Computing, 2017
Better Process Mapping and Sparse Quadratic Assignment.
Proceedings of the 16th International Symposium on Experimental Algorithms, 2017
Practical, linear-time, fully distributed algorithms for irregular gather and scatter.
Proceedings of the 24th European MPI Users' Group Meeting, 2017
Supporting concurrent memory access in TCF-aware processor architectures.
Proceedings of the IEEE Nordic Circuits and Systems Conference, 2017
Exploiting Common Neighborhoods to Optimize MPI Neighborhood Collectives.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017
2016
Special issue: Euro-Par 2015.
Concurrency and Computation: Practice and Experience, 2016
(Mis)managing parallel computing research through EU project funding.
Commun. ACM, 2016
The EPiGRAM Project: Preparing Parallel Programming Models for Exascale.
Proceedings of the High Performance Computing, 2016
Brief Announcement: Benchmarking Concurrent Priority Queues.
Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, 2016
A Library for Advanced Datatype Programming.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016
On the Expected and Observed Communication Performance with MPI Derived Datatypes.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016
Polynomial-Time Construction of Optimal MPI Derived Datatype Trees.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
Automatic Verification of Self-consistent MPI Performance Guidelines.
Proceedings of the Euro-Par 2016: Parallel Processing, 2016
2015
Isomorphic, Sparse MPI-like Collective Communication Operations for Parallel Stencil Computations.
Proceedings of the 22nd European MPI Users' Group Meeting, 2015
Specification Guideline Violations by MPI_Dims_create.
Proceedings of the 22nd European MPI Users' Group Meeting, 2015
Efficient, Optimal MPI Datatype Reconstruction for Vector and Index Types.
Proceedings of the 22nd European MPI Users' Group Meeting, 2015
The lock-free k-LSM relaxed priority queue.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015
2014
Perfectly Load-Balanced, Stable, Synchronization-Free Parallel Merge.
Parallel Processing Letters, 2014
Selected Papers from EuroMPI 2012 - 19th European MPI Users' Group Meeting.
Computing, 2014
Zero-copy, Hierarchical Gather is not possible with MPI Datatypes and Collectives.
Proceedings of the 21st European MPI Users' Group Meeting, 2014
MPI Collectives and Datatypes for Hierarchical All-to-all Communication.
Proceedings of the 21st European MPI Users' Group Meeting, 2014
Optimal MPI Datatype Normalization for Vector and Index-block Types.
Proceedings of the 21st European MPI Users' Group Meeting, 2014
Reproducible MPI Micro-Benchmarking Isn't As Easy As You Think.
Proceedings of the 21st European MPI Users' Group Meeting, 2014
Data structures for task-based priority scheduling.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014
Implementing a classic: zero-copy all-to-all communication with MPI datatypes.
Proceedings of the 2014 International Conference on Supercomputing, 2014
2013
Work-stealing with configurable scheduling strategies.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
The Pheet Task-Scheduling Framework on the Intel® Xeon Phi Coprocessor and other Multicore Architectures.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
2012
Alternative, uniformly expressive and more scalable interfaces for collective communication in MPI.
Parallel Computing, 2012
Poster: Leveraging PEPPHER Technology for Performance Portable Supercomputing.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Abstract: Leveraging PEPPHER Technology for Performance Portable Supercomputing.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
mpicroscope: Towards an MPI Benchmark Tool for Performance Guideline Verification.
Proceedings of the Recent Advances in the Message Passing Interface, 2012
Efficient MPI Implementation of a Parallel, Stable Merge Algorithm.
Proceedings of the Recent Advances in the Message Passing Interface, 2012
Programmability and performance portability aspects of heterogeneous multi-/manycore systems.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
2011
Broadcast.
Proceedings of the Encyclopedia of Parallel Computing, 2011
Allgather.
Proceedings of the Encyclopedia of Parallel Computing, 2011
All-to-All.
Proceedings of the Encyclopedia of Parallel Computing, 2011
Scan for Distributed Memory, Message-Passing Systems.
Proceedings of the Encyclopedia of Parallel Computing, 2011
Collective Communication.
Proceedings of the Encyclopedia of Parallel Computing, 2011
MPI on millions of Cores.
Parallel Processing Letters, 2011
PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems.
IEEE Micro, 2011
The scalable process topology interface of MPI 2.2.
Concurrency and Computation: Practice and Experience, 2011
Work-stealing for mixed-mode parallelism by deterministic team-building.
Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2011), 2011
Performance Expectations and Guidelines for MPI Derived Datatypes.
Proceedings of the Recent Advances in the Message Passing Interface, 2011
Using MPI Derived Datatypes in Numerical Libraries.
Proceedings of the Recent Advances in the Message Passing Interface, 2011
The PEPPHER Approach to Programmability and Performance Portability for Heterogeneous many-core Architectures.
Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011
An Extended Work-Stealing Framework for Mixed-Mode Parallel Applications.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
A (Radical) Proposal Addressing the Non-scalability of the Irregular MPI Collective Interfaces.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
Introduction.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
2010
Self-Consistent MPI Performance Guidelines.
IEEE Trans. Parallel Distrib. Syst., 2010
A Pipelined Algorithm for Large, Irregular All-Gather Problems.
IJHPCA, 2010
Transparent Neutral Element Elimination in MPI Reduction Operations.
Proceedings of the Recent Advances in the Message Passing Interface, 2010
Compact and Efficient Implementation of the MPI Group Operations.
Proceedings of the Recent Advances in the Message Passing Interface, 2010
Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues.
Proceedings of the Recent Advances in the Message Passing Interface, 2010
Multicore and Manycore Programming.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010
HPPC 2010: 5th Workshop on Highly Parallel Processing on a Chip.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2010
HPPC 2010: Fourth Workshop on Highly Parallel Processing on a Chip.
Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010
2009
Reviewers for Scientific Programming Special Issue on Software Development for Multi-core Computing Systems.
Scientific Programming, 2009
Introduction to the Scientific Programming Special Issue: Software Development for Multi-core Computing Systems.
Scientific Programming, 2009
Relationships between Regular and Irregular Collective Communication Operations on Clustered Multiprocessors.
Parallel Processing Letters, 2009
Two-tree algorithms for full bandwidth broadcast, reduction and scan.
Parallel Computing, 2009
What the parallel-processing community has (failed) to offer the multi/many-core generation.
J. Parallel Distrib. Comput., 2009
Exploiting Efficient Transpacking for One-Sided Communication and MPI-IO.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009
MPI on a Million Processors.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009
Sparse collective operations for MPI.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
Investigating High Performance RMA Interfaces for the MPI-3 Standard.
Proceedings of the ICPP 2009, 2009
HPPC 2009: 3rd Workshop on Highly Parallel Processing on a Chip.
Proceedings of the Euro-Par 2009, 2009
HPPC 2009 Panel: Are Many-Core Computer Vendors on Track?
Proceedings of the Euro-Par 2009, 2009
Aspects of the efficient implementation of the message passing interface (MPI).
Shaker, 2009
2008
Optimal broadcast for fully connected processor-node networks.
J. Parallel Distrib. Comput., 2008
A Simple, Pipelined Algorithm for Large, Irregular All-gather Problems.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008
Constructing MPI Input-output Datatypes for Efficient Transpacking.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008
Self-consistent MPI-IO Performance Requirements and Expectations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008
How to avoid making the same Mistakes all over again: What the parallel-processing community has (failed) to offer the multi/many-core generation.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Second Workshop on Highly Parallel Processing on a Chip (HPPC 2008).
Proceedings of the Euro-Par 2008 Workshops, 2008
User-Land Work Stealing Schedulers: Towards a Standard.
Proceedings of the Second International Conference on Complex, 2008
2007
Selected papers from EuroPVM/MPI 2006.
Parallel Computing, 2007
A test suite for parallel performance analysis tools.
Concurrency and Computation: Practice and Experience, 2007
Self-consistent MPI Performance Requirements.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007
Full Bandwidth Broadcast, Reduction and Scan with Only Two Trees.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007
HPPC 2007: Workshop on Highly Parallel Processing on a Chip.
Proceedings of the Euro-Par 2007 Workshops: Parallel Processing, 2007
2006
Direct graph k-partitioning with a Kernighan-Lin like heuristic.
Oper. Res. Lett., 2006
Efficient Allgather for Regular SMP-Clusters.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006
Parallel Prefix (Scan) Algorithms for MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006
What MPI Could (and Cannot) Do for Mesh-Partitioning on Non-homogeneous Networks.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006
Collective operations in NEC's high-performance MPI libraries.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
2005
An Optimal Broadcast Algorithm Adapted to SMP Clusters.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005
An Improved Algorithm for (Non-commutative) Reduce-Scatter with an Application.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005
The MPI/SX Collectives Verification Library.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005
Optimal Broadcast for Fully Connected Networks.
Proceedings of the High Performance Computing and Communications, 2005
2004
Verifying Collective MPI Calls.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004
A Simple Work-Optimal Broadcast Algorithm for Message-Passing Parallel Systems.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004
More Efficient Reduction Algorithms for Non-Power-of-Two Number of Processors in Message-Passing Parallel Systems.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004
Hierarchical Gather/Scatter Algorithms with Graceful Degradation.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
Evaluating OpenMP Performance Analysis Tools with the APART Test Suite.
Proceedings of the Euro-Par 2004 Parallel Processing, 2004
2003
Fast Parallel Non-Contiguous File Access.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003
Improving Generic Non-contiguous File Access for MPI-IO.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003
SMP-Aware Message Passing Programming.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
Initial Design of a Test Suite for Automatic Performance Analysis Tools.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
SMP-Aware Message Passing Programming.
Proceedings of the Eighth International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS'03), 2003
Initial Design of a Test Suite for Automatic Performance Analysis Tools.
Proceedings of the Eighth International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS'03), 2003
A Practical Minimum Spanning Tree Algorithm Using the Cycle Property.
Proceedings of the Algorithms, 2003
2002
SKaMPI: a comprehensive benchmark for public benchmarking of MPI.
Scientific Programming, 2002
Implementing the MPI process topology mechanism.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002
Improved MPI All-to-all Communication on a Giganet SMP Cluster.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002
The Hierarchical Factor Algorithm for All-to-All Communication (Research Note).
Proceedings of the Euro-Par 2002, 2002
2001
MPI-2 One-Sided Communications on a Giganet SMP Cluster.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2001
Practical PRAM programming.
Wiley series on parallel and distributed computing, Wiley, 2001
2000
The Implementation of MPI-2 One-Sided Communication for the NEC SX-5.
Proceedings of Supercomputing 2000, 2000
A Benchmark for MPI Derived Datatypes.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000
Formalizing OpenMP Performance Properties with ASL.
Proceedings of the High Performance Computing, Third International Symposium, 2000
On Performance Modeling for HPF Applications with ASL.
Proceedings of the High Performance Computing, Third International Symposium, 2000
Specification of Performance Problems in MPI Programs with ASL.
Proceedings of the 2000 International Conference on Parallel Processing, 2000
1999
Flattening on the Fly: Efficient Handling of MPI Derived Datatypes.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1999
A PC Cluster with Application-Quality MPI.
Proceedings of the Euro-Par '99 Parallel Processing, 5th International Euro-Par Conference, Toulouse, France, August 31, 1999
1998
A Parallel Priority Queue with Constant Time Operations.
J. Parallel Distrib. Comput., 1998
An Implementation of the Binary Blocking Flow Algorithm.
Proceedings of the Algorithm Engineering, 1998
Portable Randomized List Ranking on Multiprocessors Using MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1998
1997
Language and library support for practical PRAM programming.
Proceedings of the Fifth Euromicro Workshop on Parallel and Distributed Processing (PDP '97), 1997
A Parallel Priority Data Structure with Applications.
Proceedings of the 11th International Parallel Processing Symposium (IPPS '97), 1997
A Meticulous Analysis of Mergesort Programs.
Proceedings of the Algorithms and Complexity, Third Italian Conference, 1997
1996
A Library of Basic PRAM Algorithms and its Implementation in FORK.
Proceedings of the 8th Annual ACM Symposium on Parallel Algorithms and Architectures, 1996
A Simple Parallel Algorithm for the Single-Source Shortest Path Problem on Planar Digraphs.
Proceedings of the Parallel Algorithms for Irregularly Structured Problems, 1996
1995
An Experimental Comparison of two Distributed Single-Source Shortest Path Algorithms.
Parallel Computing, 1995
Distributed and parallel graph algorithms - models and experiments.
PhD thesis, 1995
1994
Do Inherently Sequential Branch-and-Bound Algorithms Exist?
Parallel Processing Letters, 1994
Distributed, Synchronized Implementation of an Algorithm for the Maximum Flow Problem.
Proceedings of the 1994 International Conference on Parallel Processing, 1994
1993
Precis: Distributed Shortest Path Algorithms.
Proceedings of the PARLE '93, 1993
Parallel and Distributed Algorithms for the Single-Source Shortest Path Problem Based on Moore's Algorithm.
Proceedings of the Parallel Computing: Trends and Applications, 1993
1992
Partial Memoization for Obtaining Linear Time Behavior of a 2DPDA.
Theor. Comput. Sci., 1992
Meta-Programming for Reordering Literals in Deductive Databases.
Proceedings of the Meta-Programming in Logic, 3rd International Workshop, 1992
1991
Implementation of parallel branch-and-bound algorithms - experiences with the graph partitioning problem.
Annals OR, 1991
1989
Experiments with Implementations of Two Theoretical Constructions.
Proceedings of the Logic at Botik '89, 1989