Ding-Yong Hong

Orcid: 0000-0002-7649-7581

According to our database, Ding-Yong Hong authored at least 47 papers between 2005 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Efficient column-wise N:M pruning on RISC-V CPU.

Chi-Wei Chu

Ding-Yong Hong

Jan-Jan Wu

J. Syst. Archit., 2026

SAP: Syntactic Attention Pruning for Transformer-based Language Models.

Tzu-Yun Lee

Ding-Yong Hong

Jan-Jan Wu

Proceedings of the 41st ACM/SIGAPP Symposium on Applied Computing, 2026

2025

AdaSD: Adaptive Speculative Decoding for Efficient Language Model Inference.

Kuan-Wei Lu

Ding-Yong Hong

Pangfeng Liu

CoRR, December, 2025

Efficient Distributed Training via Dual Batch Sizes and Cyclic Progressive Learning.

CoRR, September, 2025

GPU memory usage optimization for backward propagation in deep network training.

J. Parallel Distributed Comput., 2025

Optimizing Compute Core Assignment for Dynamic Batch Inference in AI Inference Accelerator.

Ze-Wei Liou

Ding-Yong Hong

Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing, 2025

Execution Time Optimization for Pipeline Deep Network Training on Multiple GPUs.

Proceedings of the 33rd Euromicro International Conference on Parallel, 2025

A Grouping Algorithm for Training Tree-Shaped Models on Multiple GPUs with High Efficiency.

Proceedings of the 49th IEEE Annual Computers, Software, and Applications Conference, 2025

Optimizing Pipeline Parallelism for Deep Learning with Activation Checkpointing.

Proceedings of the Thirteenth International Symposium on Computing and Networking, 2025

2024

Speculative Monte-Carlo Tree Search.

Scott Cheng

Mahmut T. Kandemir

Ding-Yong Hong

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Approximation Algorithms and Simulated Annealing Heuristics for Row-and-Column Pruning of Deep Neural Networks.

Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2024

Effective Compression of Language Models by Combining Pruning and Knowledge Distillation.

Proceedings of the 48th IEEE Annual Computers, Software, and Applications Conference, 2024

2023

Exploiting Fine-Grained Structured Pruning for Efficient Inference on CNN Model.

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Function Clustering to Optimize Resource Utilization on Container Platform.

Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

Accelerate Inference of CNN Models on CPU via Column Combining Based on Simulated Annealing.

Proceedings of the Eleventh International Symposium on Computing and Networking, CANDAR 2023, Matsue, Japan, November 28, 2023

2022

Accelerating Video Captioning on Heterogeneous System Architectures.

ACM Trans. Archit. Code Optim., 2022

CNN Models Acceleration Using Filter Pruning and Sparse Tensor Core.

Int. J. Netw. Comput., 2022

Accelerating Convolutional Neural Networks via Inter-operator Scheduling.

Proceedings of the 28th IEEE International Conference on Parallel and Distributed Systems, 2022

Rewriting Deep Learning Models for Maximizing Edge TPU Utilization.

Kung-Fu Chen

Ding-Yong Hong

Proceedings of the 28th IEEE International Conference on Parallel and Distributed Systems, 2022

Efficient Dual Batch Size Deep Learning for Distributed Parameter Server Systems.

Proceedings of the 46th IEEE Annual Computers, Software, and Applications Conference, 2022

Efficient Inference on Convolutional Neural Networks by Image Difficulty Prediction.

Proceedings of the IEEE International Conference on Big Data, 2022

2021

Efficient Video Captioning on Heterogeneous System Architectures.

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

Accelerate CNN Models via Filter Pruning and Sparse Tensor Core.

Proceedings of the Ninth International Symposium on Computing and Networking, 2021

Optimal Branch Location for Cost-effective Inference on Branchynet.

Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

2019

Exploiting SIMD Asymmetry in ARM-to-x86 Dynamic Binary Translation.

ACM Trans. Archit. Code Optim., 2019

Processor-Tracing Guided Region Formation in Dynamic Binary Translation.

ACM Trans. Archit. Code Optim., 2019

Optimizing data permutations in structured loads/stores translation and SIMD register mapping for a cross-ISA dynamic binary translator.

J. Syst. Archit., 2019

Exploiting Vector Processing in Dynamic Binary Translation.

Proceedings of the 48th International Conference on Parallel Processing, 2019

2018

Improving SIMD Parallelism via Dynamic Binary Translation.

ACM Trans. Embed. Comput. Syst., 2018

Efficient and retargetable SIMD translation in a dynamic binary translator.

Softw. Pract. Exp., 2018

Dynamic tuning of applications using restricted transactional memory.

Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems, 2018

Exploiting SIMD capability in an ARMv7-to-ARMv8 dynamic binary translator.

Proceedings of the International Conference on Compilers, 2018

2017

Dynamic translation of structured Loads/Stores and register mapping for architectures with SIMD extensions.

Proceedings of the 18th ACM SIGPLAN/SIGBED Conference on Languages, 2017

Exploiting Asymmetric SIMD Register Configurations in ARM-to-x86 Dynamic Binary Translation.

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

Optimizing Control Transfer and Memory Virtualization in Full System Emulators.

ACM Trans. Archit. Code Optim., 2016

Exploiting Longer SIMD Lanes in Dynamic Binary Translation.

Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

2015

A dynamic binary translation system in a client/server environment.

J. Syst. Archit., 2015

SIMD Code Translation in an Enhanced HQEMU.

Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

2014

Efficient and Retargetable Dynamic Binary Translation on Multicores.

IEEE Trans. Parallel Distributed Syst., 2014

DBILL: an efficient and retargetable dynamic binary instrumentation framework using llvm backend.

Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2014

2013

Improving dynamic binary optimization through early-exit guided code region formation.

Proceedings of the ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (co-located with ASPLOS 2013), 2013

2012

HQEMU: a multi-threaded and retargetable dynamic binary translator on multicores.

Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

2011

LnQ: Building High Performance Dynamic Binary Translators with Existing Compiler Backends.

Proceedings of the International Conference on Parallel Processing, 2011

2010

A Scalable HLA RTI System Based on Multiple-FedServ Architecture.

Proceedings of the 12th UKSim, 2010

2009

MGRID: a modifiable-grid region matching approach for DDM in the HLA RTI.

Proceedings of the 2009 Spring Simulation Multiconference, SpringSim 2009, 2009

2008

Early experiences in application level I/O tracing on blue gene systems.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2005

An Efficient MPI-IO for Noncontiguous Data Access over InfiniBand.

Ding-Yong Hong

Ching-Wen You

Yeh-Ching Chung

Proceedings of the 8th International Symposium on Parallel Architectures, 2005
