Michael Lang

Orcid: 0000-0002-3498-6352

Affiliations:
  • Los Alamos National Laboratory


According to our database, Michael Lang authored at least 81 papers between 2004 and 2020.

Bibliography

2020
Symbiotic HW Cache and SW DTLB Prefetching for DRAM/NVM Hybrid Memory.
Proceedings of the 28th International Symposium on Modeling, 2020

Global link arrangement for practical Dragonfly.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

2019
Modeling Universal Globally Adaptive Load-Balanced Routing.
ACM Trans. Parallel Comput., 2019

Topology-custom UGAL routing on dragonfly.
Proceedings of the International Conference for High Performance Computing, 2019

A Foundation for Automated Placement of Data.
Proceedings of the IEEE/ACM Fourth International Parallel Data Systems Workshop, 2019

Performance characterization of a DRAM-NVM hybrid memory architecture for HPC applications using Intel Optane DC persistent memory modules.
Proceedings of the International Symposium on Memory Systems, 2019

Topology-Aware Event Sequence Mining for Understanding HPC System Behavior and Detecting Anomalies.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

2018
Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks.
IEEE Trans. Parallel Distributed Syst., 2018

Random Regular Graph and Generalized De Bruijn Graph with k-Shortest Path Routing.
IEEE Trans. Parallel Distributed Syst., 2018

TPR: Traffic Pattern-Based Adaptive Routing for Dragonfly Networks.
IEEE Trans. Multi Scale Comput. Syst., 2018

Using virtualization to quantify power conservation via near-threshold voltage reduction for inherently resilient applications.
Parallel Comput., 2018

Fast classification of MPI applications using Lamport's logical clocks.
J. Parallel Distributed Comput., 2018

Heterogeneous Memory and Arena-Based Heap Allocation.
Proceedings of the Workshop on Memory Centric High Performance Computing, 2018

Performance and Accuracy Trade-offs of HPC Application Modeling and Simulation.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Event Block Identification and Analysis for Effective Anomaly Detection to Build Reliable HPC Systems.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

A Comparative Study of Topology Design Approaches for HPC Interconnects.
Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018

ACTOR: Active Cloud Storage with Energy-Efficient On-Drive Data Processing.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

Converting Unstructured System Logs into Structured Event List for Anomaly Detection.
Proceedings of the 13th International Conference on Availability, Reliability and Security, 2018

2017
TCASM: An asynchronous shared memory interface for high-performance application composition.
Parallel Comput., 2017

NUMA Distance for Heterogeneous Memory.
Proceedings of the Workshop on Memory Centric Programming for HPC, 2017

Modeling UGAL on the Dragonfly Topology.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017

Optimized scatter/gather data operations for parallel storage.
Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems, 2017

A comparative study of SDN and adaptive routing on dragonfly networks.
Proceedings of the International Conference for High Performance Computing, 2017

UNITY: Unified Memory and File Space.
Proceedings of the 7th International Workshop on Runtime and Operating Systems for Supercomputers, 2017

Throughput Models of Interconnection Networks: The Good, the Bad, and the Ugly.
Proceedings of the 25th IEEE Annual Symposium on High-Performance Interconnects, 2017

RSVP: Soft Error Resilient Power Savings at Near-Threshold Voltage Using Register Vulnerability.
Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2017

2016
Exploring the Design Tradeoffs for Extreme-Scale High-Performance Computing System Software.
IEEE Trans. Parallel Distributed Syst., 2016

TracSim: Simulating and scheduling trapped power capacity to maximize machine room throughput.
Parallel Comput., 2016

Load-balanced and locality-aware scheduling for data-intensive workloads at extreme scales.
Concurr. Comput. Pract. Exp., 2016

Power usage of production supercomputers and production workloads.
Concurr. Comput. Pract. Exp., 2016

Enhancing InfiniBand with OpenFlow-style SDN capability.
Proceedings of the International Conference for High Performance Computing, 2016

Active Burst-Buffer: In-Transit Processing Integrated into Hierarchical Storage.
Proceedings of the IEEE International Conference on Networking, 2016

Random Regular Graph and Generalized De Bruijn Graph with k-Shortest Path Routing.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

A Cross-Enclave Composition Mechanism for Exascale System Software.
Proceedings of the 6th International Workshop on Runtime and Operating Systems for Supercomputers, 2016

Traffic Pattern-Based Adaptive Routing for Intra-Group Communication in Dragonfly Networks.
Proceedings of the 24th IEEE Annual Symposium on High-Performance Interconnects, 2016

Characterizing power and energy efficiency of Legion runtime and applications: An early experience.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

2015
Hop: Elastic Consistency for Exascale Data Stores.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Measurement and characterization of Haswell power and energy consumption.
Proceedings of the 3rd International Workshop on Energy Efficient Supercomputing, 2015

Towards Scalable Distributed Workload Manager with Monitoring-Based Weakly Consistent Resource Stealing.
Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015

What is a Lightweight Kernel?
Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers, 2015

System-Level Support for Composition of Applications.
Proceedings of the 5th International Workshop on Runtime and Operating Systems for Supercomputers, 2015

Dynamic Adaptation for Elastic System Services Using Virtual Servers.
Proceedings of the 22nd IEEE International Conference on High Performance Computing, 2015

Overcoming Hadoop Scaling Limitations through Distributed Task Execution.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Fast Calculation of Max-Min Fair Rates for Multi-commodity Flows in Fat-Tree Networks.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2014
Static load-balanced routing for slimmed fat-trees.
J. Parallel Distributed Comput., 2014

Trapped capacity: scheduling under a power cap to maximize machine-room throughput.
Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, 2014

LFTI: A New Performance Metric for Assessing Interconnect Designs for Extreme-Scale HPC Systems.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Next generation job management systems for extreme-scale ensemble computing.
Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing, 2014

Optimizing load balancing and data-locality with data-aware scheduling.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

Enabling composite applications through an asynchronous shared memory interface.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

2013
Optimizing process creation and execution on multi-core architectures.
Int. J. High Perform. Comput. Appl., 2013

Understanding and isolating the noise in the Linux kernel.
Int. J. High Perform. Comput. Appl., 2013

A new routing scheme for Jellyfish and its performance with HPC workloads.
Proceedings of the International Conference for High Performance Computing, 2013

Using simulation to explore distributed key-value stores for extreme-scale system services.
Proceedings of the International Conference for High Performance Computing, 2013

DRepl: Optimizing access to application data for analysis and visualization.
Proceedings of the IEEE 29th Symposium on Mass Storage Systems and Technologies, 2013

RRR: A Load Balanced Routing Scheme for Slimmed Fat-Trees.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Understanding the Performance of Two Production Supercomputers.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

A gossip-based approach to exascale system services.
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers, 2013

Transparently consistent asynchronous shared memory.
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers, 2013

Energy modeling of supercomputers and large-scale scientific applications.
Proceedings of the International Green Computing Conference, 2013

HPC runtime support for fast and power efficient locking and synchronization.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Multilevel Active Storage for big data applications in high performance computing.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

2012
Optimizing latency and throughput for spawning processes on massively multicore processors.
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers, 2012

Stepping towards noiseless Linux environment.
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers, 2012

The design and implementation of a multi-level content-addressable checkpoint file system.
Proceedings of the 19th International Conference on High Performance Computing, 2012

2011
Adapting wave-front algorithms to efficiently utilize systems with deep communication hierarchies.
Parallel Comput., 2011

2010
On the Performance and Technological Impact of Adding Memory Controllers in Multi-Core Processors.
Parallel Process. Lett., 2010

Optimized InfiniBand™ fat-tree routing for shift all-to-all communication patterns.
Concurr. Comput. Pract. Exp., 2010

Analyzing the trade-off between multiple memory controllers and memory channels on multi-core processor performance.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Characterizing the Impact of Using Spare-Cores on Application Performance.
Proceedings of the Euro-Par 2010 - Parallel Processing, 16th International Euro-Par Conference, Ischia, Italy, August 31, 2010

2009
Implementation and performance modeling of deterministic particle transport (Sweep3D) on the IBM Cell/B.E.
Sci. Program., 2009

The reverse-acceleration model for programming petascale hybrid systems.
IBM J. Res. Dev., 2009

Using Performance Modeling to Design Large-Scale Systems.
Computer, 2009

2008
InfiniBand Routing Table Optimizations for Scientific Applications.
Parallel Process. Lett., 2008

A Performance Evaluation of the Nehalem Quad-Core Processor for Scientific Computing.
Parallel Process. Lett., 2008

Entering the petaflop era: the architecture and performance of Roadrunner.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Optimization of InfiniBand for scientific applications.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Experiences in scaling scientific applications on current-generation quad-core processors.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2006
Architecture - A performance comparison through benchmarking and modeling of three leading supercomputers: Blue Gene/L, Red Storm, and Purple.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

2004
A Performance and Scalability Analysis of the BlueGene/L Architecture.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

An empirical performance analysis of commodity memories in commodity servers.
Proceedings of the 2004 workshop on Memory System Performance, 2004

