Shangyu Wu

Orcid: 0000-0002-1961-143X

According to our database, Shangyu Wu authored at least 19 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Beyond Semantic Similarity: Reducing Unnecessary API Calls via Behavior-Aligned Retriever.
CoRR, August, 2025

AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving.
CoRR, June, 2025

EvoP: Robust LLM Inference via Evolutionary Pruning.
CoRR, February, 2025

A²ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization.
CoRR, February, 2025

FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference.
Proceedings of the 5th Workshop on Machine Learning and Systems, 2025

A²ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
GeneQuery: A General QA-based Framework for Spatial Gene Expression Predictions from Histology Images.
CoRR, 2024

Retrieval-Augmented Generation for Natural Language Processing: A Survey.
CoRR, 2024

RAEE: A Training-Free Retrieval-Augmented Early Exiting Framework for Efficient Inference.
CoRR, 2024

Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion.
CoRR, 2024

ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LeaderKV: Improving Read Performance of KV Stores via Learned Index and Decoupled KV Table.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

CHESS: Optimizing LLM Inference via Channel-Wise Thresholding and Selective Sparsification.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Tidal-Tree-Mem: Toward Read-Intensive Key-Value Stores With Tidal Structure Based on LSM-Tree.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., February, 2023

2022
Bits-Ensemble: Toward Light-Weight Robust Deep Ensemble by Bits-Sharing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

NFL: Robust Learned Index via Distribution Transformation.
Proc. VLDB Endow., 2022

Work-in-Progress: Lark: A Learned Secondary Index Toward LSM-tree for Resource-Constrained Embedded Storage Systems.
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis, 2022

2020
Towards Read-Intensive Key-Value Stores with Tidal Structure Based on LSM-Tree.
Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

2019
Towards Cross-Platform Inference on Edge Devices with Emerging Neuromorphic Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

