Amirkeivan Mohtashami

According to our database, Amirkeivan Mohtashami authored at least 14 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging.
CoRR, 2024

2023
The splay-list: a distribution-adaptive concurrent skip-list.
Distributed Comput., September, 2023

Social Learning: Towards Collaborative Learning with Large Language Models.
CoRR, 2023

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models.
CoRR, 2023

CoTFormer: More Tokens With Attention Make Up For Less Depth.
CoRR, 2023

Landmark Attention: Random-Access Infinite Context Length for Transformers.
CoRR, 2023

Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models.
CoRR, 2023

Random-Access Infinite Context Length for Transformers.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Special Properties of Gradient Descent with Large Learning Rates.
Proceedings of the International Conference on Machine Learning, 2023

2022
On Avoiding Local Minima Using Gradient Descent With Large Learning Rates.
CoRR, 2022

Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods.
CoRR, 2022

Masked Training of Neural Networks with Partial Gradients.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Simultaneous Training of Partially Masked Neural Networks.
CoRR, 2021

Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

