Zihan Liu

Orcid: 0000-0002-0874-0682

Affiliations:

Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China

According to our database, Zihan Liu authored at least 23 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2025

ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive.

CoRR, August, 2025

SLTarch: Towards Scalable Point-Based Neural Rendering by Taming Workload Imbalance and Memory Irregularity.

CoRR, July, 2025

eLLM: Elastic Memory Management Framework for Efficient LLM Serving.

CoRR, June, 2025

Lumina: Real-Time Mobile Neural Rendering by Exploiting Computational Redundancy.

CoRR, June, 2025

Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers.

CoRR, June, 2025

SeeLe: A Unified Acceleration Framework for Real-Time Gaussian Splatting.

CoRR, March, 2025

A Sample-Free Compilation Framework for Efficient Dynamic Tensor Computation.

Proceedings of the International Conference for High Performance Computing, 2025

Lumina: Real-Time Neural Rendering by Exploiting Computational Redundancy.

Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination.

Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024

Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture.

ACM Trans. Archit. Code Optim., December, 2024

Vortex: Efficient Sample-Free Dynamic Tensor Program Optimization via Hardware-aware Strategy Space Hierarchization.

CoRR, 2024

vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving.

CoRR, 2024

Cicero: Addressing Algorithmic and Architectural Bottlenecks in Neural Rendering by Radiance Warping and Memory Optimizations.

Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

JUNO: Optimizing High-Dimensional Approximate Nearest Neighbour Search with Sparsity-Aware Algorithm and Ray-Tracing Core Mapping.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service.

CoRR, 2023

AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs.

Proceedings of the 20th ACM International Conference on Computing Frontiers, 2023

2022

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

VELTAIR: towards high-performance multi-tenant deep learning services via adaptive compilation and scheduling.

Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '22), Lausanne, Switzerland, 28 February 2022

2020

Survey and design of paleozoic: a high-performance compiler tool chain for deep learning inference accelerator.

CCF Trans. High Perform. Comput., 2020

DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator.

Proceedings of the IEEE International Conference on Parallel & Distributed Processing with Applications, 2020
