Jun Liu
Orcid: 0009-0003-8280-9072Affiliations:
- Shanghai Jiao Tong University, Qingyuan Research Institute, Shanghai, China
According to our database1,
Jun Liu authored at least 20 papers
between 2021 and 2026.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
Online presence:
-
on orcid.org
On csauthors.net:
Bibliography
2026
DynSplit-KV: Dynamic Semantic Splitting for KVCache Compression in Efficient Long-Context LLM Inference.
CoRR, February, 2026
2025
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025
FlightVGM: Efficient Video Generation Model Inference with Online Sparsification and Hybrid Precision on FPGAs.
Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025
Harnessing Conventional Video Processing Insights for Emerging 3D Video Generation Models: A Comprehensive Attention-aware Way.
Proceedings of the 62nd ACM/IEEE Design Automation Conference, 2025
SG-Filter: Enhancing Similar Text Retrieval via Hierarchical Summarized-Semantic Index and Adaptive Filtering.
Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025
ViDA: Video Diffusion Transformer Acceleration with Differential Approximation and Adaptive Dataflow.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025
2024
Efficient and Effective Retrieval of Dense-Sparse Hybrid Vectors using Graph-based Approximate Nearest Neighbor Search.
CoRR, 2024
CoRR, 2024
FlashDecoding++: Faster Large Language Model Inference with Asynchronization, Flat GEMM Optimization, and Heuristics.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024
Fast and Efficient 2-bit LLM Inference on GPU: 2/4/16-bit in a Weight Matrix with Asynchronous Dequantization.
Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024
Proceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024
FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs.
Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2024
2023
DF-GAS: a Distributed FPGA-as-a-Service Architecture towards Billion-Scale Graph-based Approximate Nearest Neighbor Search.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
TSTC: Two-Level Sparsity Tensor Core Enabling both Algorithm Flexibility and Hardware Efficiency.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023
Processing-In-Hierarchical-Memory Architecture for Billion-Scale Approximate Nearest Neighbor Search.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
2022
A Unified FPGA Virtualization Framework for General-Purpose Deep Neural Networks in the Cloud.
ACM Trans. Reconfigurable Technol. Syst., 2022
Proceedings of the 23rd IEEE International Conference on Mobile Data Management, 2022
2021
3M-AI: A Multi-task and Multi-core Virtualization Framework for Multi-FPGA AI Systems in the Cloud.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021