Souvik Kundu
ORCID: 0000-0002-3533-9405
Affiliations:
- Intel Labs, San Diego, CA, USA
According to our database, Souvik Kundu authored at least 35 papers between 2023 and 2025.
Bibliography
2025
CoRR, June, 2025
Assortment of Attention Heads: Accelerating Federated PEFT with Head Pruning and Strategic Client Selection.
CoRR, June, 2025
Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator.
CoRR, April, 2025
CoRR, April, 2025
CoRR, March, 2025
Enhancing Large Language Models for Hardware Verification: A Novel SystemVerilog Assertion Dataset.
CoRR, March, 2025
LANTERN++: Enhanced Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models.
CoRR, February, 2025
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing.
CoRR, February, 2025
Unraveling Zeroth-Order Optimization through the Lens of Low-Dimensional Structured Perturbations.
CoRR, January, 2025
Trans. Mach. Learn. Res., 2025
MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
LLM-NPU: Towards Efficient Foundation Model Inference on Low-Power Neural Processing Units.
Proceedings of the IEEE International Conference on Omni-layer Intelligent Systems, 2025
2024
Bit-by-Bit: Investigating the Vulnerabilities of Binary Neural Networks to Adversarial Bit Flipping.
Trans. Mach. Learn. Res., 2024
AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs through Bit-Flip Attacks.
CoRR, 2024
CoRR, 2024
CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs.
CoRR, 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM.
CoRR, 2024
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
Proceedings of the 31st IEEE International Conference on High Performance Computing, Data and Analytics, HiPC 2024, 2024
GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity.
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations.
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023