Cong Guo

Orcid: 0000-0002-4479-5525

Affiliations:
  • Shanghai Jiao Tong University, Department of Computer Science and Engineering, China


According to our database, Cong Guo authored at least 17 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Accelerating Sparse DNNs Based on Tiled GEMM.
IEEE Trans. Computers, May, 2024

vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving.
CoRR, 2024

JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design.
CoRR, 2023

OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs.
Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

2022
Efficient Activation Quantization via Adaptive Rounding Border for Post-Training Quantization.
CoRR, 2022

Towards Reliable AI Applications via Algorithm-Based Fault Tolerance on NVDLA.
Proceedings of the 18th International Conference on Mobility, Sensing and Networking, 2022

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

2021
Dual-side Sparse Tensor Core.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Characterizing and Demystifying the Implicit Convolution Algorithm on Commercial Matrix-Multiplication Accelerators.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021

2020
Accelerating sparse DNN models without hardware-support via tile-wise sparsity.
Proceedings of the International Conference for High Performance Computing, 2020

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019
Adversarial Defense Through Network Profiling Based Path Extraction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

