Dan Huang
ORCID: 0000-0001-5582-1031
Affiliations:
- Sun Yat-Sen University, Guangzhou, China
According to our database, Dan Huang authored at least 48 papers between 2015 and 2025.
Links
Online presence:
- on orcid.org
Bibliography
2025
Critique of "Productivity, Portability, Performance Data-Centric Python" by SCC Team From Sun Yat-sen University.
IEEE Trans. Parallel Distributed Syst., May, 2025
CCF Trans. High Perform. Comput., April, 2025
2024
IEEE Trans. Parallel Distributed Syst., November, 2024
Accelerating Massively Distributed Deep Learning Through Efficient Pseudo-Synchronous Update Method.
Int. J. Parallel Program., June, 2024
SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems.
J. Comput. Sci. Technol., March, 2024
AdaNAS: Adaptively Postprocessing With Self-Supervised Neural Architecture Search for Ensemble Rainfall Forecasts.
IEEE Trans. Geosci. Remote. Sens., 2024
Topo: Towards a fine-grained topological data processing framework on Tianhe-3 supercomputer.
J. Parallel Distributed Comput., 2024
Sci. China Inf. Sci., 2024
APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes.
Proceedings of the International Conference for High Performance Computing, 2024
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the Network and Parallel Computing, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Euro-Par 2024: Parallel Processing, 2024
Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
2023
Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs.
ACM Trans. Archit. Code Optim., December, 2023
Hierarchical Model Parallelism for Optimizing Inference on Many-core Processor via Decoupled 3D-CNN Structure.
ACM Trans. Archit. Code Optim., September, 2023
Parallel Comput., September, 2023
IEEE Trans. Parallel Distributed Syst., July, 2023
A Data-driven Approach to Harvesting Latent Reduced Models to Precondition Lossy Compression for Scientific Data.
IEEE Trans. Big Data, June, 2023
AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts.
CoRR, 2023
Proceedings of the 41st IEEE International Conference on Computer Design, 2023
Accelerating Inference of 3D-CNN on ARM Many-core CPU via Hierarchical Model Partition.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023
Proceedings of the Advanced Parallel Processing Technologies, 2023
2022
Parallel Comput., 2022
Identifying challenges and opportunities of in-memory computing on large HPC systems.
J. Parallel Distributed Comput., 2022
IEEE Internet Things J., 2022
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Proceedings of the 51st International Conference on Parallel Processing, 2022
2021
IEEE Trans. Computers, 2021
A Fine-grained Optimization to Winograd Convolution Based on Micro-architectural Features of CPU.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021
2020
CCF Trans. High Perform. Comput., 2020
Proceedings of the 40th IEEE International Conference on Distributed Computing Systems, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
IEEE Trans. Computers, 2019
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
2018
IEEE Trans. Computers, 2018
Performance Evaluation and Analysis for MPI-Based Data Movement in Virtual Switch Network.
Proceedings of the 2018 IEEE International Conference on Networking, 2018
2017
IEEE Trans. Computers, 2017
Deister: A light-weight autonomous block management in data-intensive file systems using deterministic declustering distribution.
J. Parallel Distributed Comput., 2017
J. Parallel Distributed Comput., 2017
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017
2016
Proceedings of the 53rd Annual Design Automation Conference, 2016
2015
Deister: A Light-Weight Autonomous Block Management in Data-Intensive File Systems Using Deterministic Declustering Distribution.
Proceedings of the 2015 IEEE International Conference on Smart City/SocialCom/SustainCom/DataCom/SC2 2015, 2015
Proceedings of the 10th Parallel Data Storage Workshop, 2015
Opass: Analysis and Optimization of Parallel Data Access on Distributed File Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015
Proceedings of the Cloud Computing and Big Data, 2015