Souvik Kundu

ORCID: 0000-0002-3533-9405

Affiliations:
  • Intel Labs, San Diego, CA, USA


According to our database, Souvik Kundu authored at least 35 papers between 2023 and 2025.

Bibliography

2025
On Evaluating Performance of LLM Inference Serving Systems.
CoRR, July, 2025

On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention.
CoRR, June, 2025

Assortment of Attention Heads: Accelerating Federated PEFT with Head Pruning and Strategic Client Selection.
CoRR, June, 2025

Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator.
CoRR, April, 2025

Understanding and Optimizing Multi-Stage AI Inference Pipelines.
CoRR, April, 2025

SEAL: Steerable Reasoning Calibration of Large Language Models for Free.
CoRR, April, 2025

OuroMamba: A Data-Free Quantization Framework for Vision Mamba Models.
CoRR, March, 2025

Enhancing Large Language Models for Hardware Verification: A Novel SystemVerilog Assertion Dataset.
CoRR, March, 2025

LANTERN++: Enhanced Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models.
CoRR, February, 2025

CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing.
CoRR, February, 2025

Unraveling Zeroth-Order Optimization through the Lens of Low-Dimensional Structured Perturbations.
CoRR, January, 2025

Fast and Cost-effective Speculative Edge-Cloud Decoding with Early Exits.
Trans. Mach. Learn. Res., 2025

MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Scaling Long Context Training Data by Long-Distance Referrals.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LLM-NPU: Towards Efficient Foundation Model Inference on Low-Power Neural Processing Units.
Proceedings of the IEEE International Conference on Omni-layer Intelligent Systems, 2025

2024
Bit-by-Bit: Investigating the Vulnerabilities of Binary Neural Networks to Adversarial Bit Flipping.
Trans. Mach. Learn. Res., 2024

Unveiling Adversarially Robust Graph Lottery Tickets.
Trans. Mach. Learn. Res., 2024

AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs through Bit-Flip Attacks.
CoRR, 2024

Metron: Holistic Performance Evaluation Framework for LLM Inference Systems.
CoRR, 2024

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs.
CoRR, 2024

Demystifying Platform Requirements for Diverse LLM Inference Use Cases.
CoRR, 2024

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM.
CoRR, 2024

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization.
Advances in Neural Information Processing Systems 38 (NeurIPS 2024), 2024

Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Fusing Models with Complementary Expertise.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Analyzing Adversarial Vulnerabilities of Graph Lottery Tickets.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024

Towards Real-Time LLM Inference on Heterogeneous Edge Platforms.
Proceedings of the 31st IEEE International Conference on High Performance Computing, Data and Analytics (HiPC), 2024

GEAR: An Efficient Error Reduction Framework for KV Cache Compression in LLM Inference.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Sparse but Strong: Crafting Adversarially Robust Graph Lottery Tickets.
CoRR, 2023

Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity.
CoRR, 2023

Don't just prune by magnitude! Your mask topology is a secret weapon.
Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023

NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations.
Proceedings of the International Conference on Machine Learning, 2023

Vision HGNN: An Image is More than a Graph of Nodes.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
