Qingru Zhang

ORCID: 0009-0002-2021-7939

According to our database, Qingru Zhang authored at least 19 papers between 2002 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs.
CoRR, May 2025

Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models.
CoRR, May 2025

Multiscale Spatio-Temporal Fusion Network for video dehazing.
Comput. Vis. Image Underst., 2025

2024
Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering.
CoRR, 2024

Robust Reinforcement Learning from Corrupted Human Feedback.
CoRR, 2024

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM.
CoRR, 2024

Robust Reinforcement Learning from Corrupted Human Feedback.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

2023
Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Less is More: Task-aware Layer-wise Distillation for Language Model Compression.
Proceedings of the International Conference on Machine Learning, 2023

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation.
Proceedings of the International Conference on Machine Learning, 2023

Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance.
Proceedings of the International Conference on Machine Learning, 2022

2021
A Biased Graph Neural Network Sampler with Near-Optimal Regret.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2019
AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods.
Proceedings of the 7th International Conference on Learning Representations, 2019

2002
Webquery: a simple web-enabled system for database management.
J. Comput. Sci. Coll., 2002
