William Won

ORCID: 0000-0002-1715-9144

According to our database, William Won authored at least 13 papers between 2021 and 2025.

Bibliography

2025
COSMIC: Enabling Full-Stack Co-Design and Optimization of Distributed Machine Learning Systems.
CoRR, May 2025

MCMComm: Hardware-Software Co-Optimization for End-to-End Communication in Multi-Chip-Modules.
CoRR, May 2025

Toward a Standardized Representation for Deep Learning Collective Algorithms.
IEEE Micro, 2025

FRED: A Wafer-scale Fabric for 3D Parallel DNN Training.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

2024
FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models.
CoRR, 2024

TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

LIBRA: Enabling Workload-Aware Multi-Dimensional Network Topology Optimization for Distributed Training of Large AI Models.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

Towards a Standardized Representation for Deep Learning Collective Algorithms.
Proceedings of the IEEE Symposium on High-Performance Interconnects, 2024

2023
TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Training.
CoRR, 2023

ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2023

2022
Themis: a network bandwidth-aware collective scheduling policy for distributed training of DL models.
Proceedings of the 49th Annual International Symposium on Computer Architecture (ISCA '22), New York, NY, USA, 2022

2021
Exploring Multi-dimensional Hierarchical Network Topologies for Efficient Distributed Training of Trillion Parameter DL Models.
CoRR, 2021

Extending Sparse Tensor Accelerators to Support Multiple Compression Formats.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
