Shangyu Wu
Orcid: 0000-0002-1961-143X
According to our database, Shangyu Wu authored at least 19 papers between 2019 and 2025.
Bibliography
2025
Beyond Semantic Similarity: Reducing Unnecessary API Calls via Behavior-Aligned Retriever.
CoRR, August, 2025
AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving.
CoRR, June, 2025
A²ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization.
CoRR, February, 2025
FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference.
Proceedings of the 5th Workshop on Machine Learning and Systems, 2025
A²ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
2024
GeneQuery: A General QA-based Framework for Spatial Gene Expression Predictions from Histology Images.
CoRR, 2024
RAEE: A Training-Free Retrieval-Augmented Early Exiting Framework for Efficient Inference.
CoRR, 2024
Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion.
CoRR, 2024
ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
LeaderKV: Improving Read Performance of KV Stores via Learned Index and Decoupled KV Table.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024
CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
2023
Tidal-Tree-Mem: Toward Read-Intensive Key-Value Stores With Tidal Structure Based on LSM-Tree.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2023
2022
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022
Work-in-Progress: Lark: A Learned Secondary Index Toward LSM-tree for Resource-Constrained Embedded Storage Systems.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2022
2020
Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020
2019
Towards Cross-Platform Inference on Edge Devices with Emerging Neuromorphic Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019