Walid Ahmed

According to our database, Walid Ahmed authored at least 22 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
CommFuse: Hiding Tail Latency via Communication Decomposition and Fusion for Distributed LLM Training.
CoRR, April, 2026

Distributed Hybrid Parallelism for Large Language Models: Comparative Study and System Design Guide.
CoRR, February, 2026

EPAS: Efficient Training with Progressive Activation Sharing.
CoRR, January, 2026

FLOP-Efficient Training: Early Stopping Based on Test-Time Compute Awareness.
CoRR, January, 2026

2025
Bayesian Mixture of Experts For Large Language Models.
CoRR, November, 2025

ETT: Expanding the Long Context Understanding Capability of LLMs at Test-Time.
CoRR, July, 2025

Continuous Self-Improvement of Large Language Models by Test-time Training with Verifier-Driven Sample Selection.
CoRR, May, 2025

Balancing Computation Load and Representation Expressivity in Parallel Hybrid Neural Networks.
CoRR, May, 2025

ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
Accelerating the Low-Rank Decomposed Models.
CoRR, 2024

Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
CoRR, 2024

SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection.
CoRR, 2024

Accelerating the Low-Rank Decomposed Models.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

Is 3D Convolution with 5D Tensors Really Necessary for Video Analysis?
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

2023
SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling.
CoRR, 2023

GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values.
CoRR, 2023

Speeding up Resnet Architecture with Layers Targeted Low Rank Decomposition.
CoRR, 2023

Improving Resnet-9 Generalization Trained on Small Datasets.
CoRR, 2023

Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization.
CoRR, 2023

Talk-to-the-Robot: Speech-Interactive Robot To Teach Children Computational Thinking.
Proceedings of the 15th International Conference on Computer Supported Education, 2023

2018
Ensemble-based Adaptive Single-shot Multi-box Detector.
Proceedings of the 2018 International Symposium on Networks, Computers and Communications, 2018

Efficient Single-Shot Multibox Detector for Construction Site Monitoring.
Proceedings of the IEEE International Smart Cities Conference, 2018

