Guyue Huang

Orcid: 0000-0002-1280-4781

According to our database¹, Guyue Huang authored at least 19 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Enabling Efficient Sparse Multiplications on GPUs With Heuristic Adaptability.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., June, 2025

GMI-DRL: Empowering Multi-GPU DRL with Adaptive-Grained Parallelism.

Proceedings of the 2025 USENIX Annual Technical Conference, 2025

TRACI: Network Acceleration of Input-Dynamic Communication for Large-Scale Deep Learning Recommendation Model.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

2024

High-Performance Deep Learning Systems via DL Sparsity and DL Compiler

Guyue Huang

PhD thesis, 2024

OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model.

Proceedings of the 2024 USENIX Annual Technical Conference, 2024

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.

Proceedings of the 2023 USENIX Annual Technical Conference, 2023

ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.

Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

2022

Enabling Data Movement and Computation Pipelining in Deep Learning Compiler.

CoRR, 2022

Heuristic Adaptability to Input Dynamics for SpMM on GPUs.

CoRR, 2022

LightSeq2: Accelerated Training for Transformer-Based Models on GPUs.

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective.

Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

Shfl-BW: accelerating deep neural network inference with tensor-core aware weight pruning.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Heuristic adaptability to input dynamics for SpMM on GPUs.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

Machine Learning for Electronic Design Automation: A Survey.

ACM Trans. Design Autom. Electr. Syst., 2021

Efficient Sparse Matrix Kernels based on Adaptive Workload-Balancing and Parallel-Reduction.

CoRR, 2021

Exploiting Online Locality and Reduction Parallelism for Sampled Dense Matrix Multiplication on GPUs.

Proceedings of the 39th IEEE International Conference on Computer Design, 2021

2020

GE-SpMM: general-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks.

Proceedings of the International Conference for High Performance Computing, 2020
