Dara Bahri

ORCID: 0000-0003-0144-2911

According to our database, Dara Bahri authored at least 34 papers between 2018 and 2023.

Bibliography

2023
Efficient Transformers: A Survey.
ACM Comput. Surv., 2023

Surprise: Result List Truncation via Extreme Value Theory.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Sharpness-Aware Minimization Leads to Low-Rank Features.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UL2: Unifying Language Learning Paradigms.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Is margin all you need? An extensive empirical study of active learning on tabular data.
CoRR, 2022

Confident Adaptive Language Modeling.
CoRR, 2022

Unifying Language Learning Paradigms.
CoRR, 2022

Transformer Memory as a Differentiable Search Index.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Confident Adaptive Language Modeling.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Churn Reduction via Distillation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption.
Proceedings of the Tenth International Conference on Learning Representations, 2022

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Sharpness-Aware Minimization Improves Language Model Generalization.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Rethinking search: making domain experts out of dilettantes.
SIGIR Forum, 2021

Are Pre-trained Convolutions Better than Pre-trained Transformers?
CoRR, 2021

Rethinking Search: Making Experts out of Dilettantes.
CoRR, 2021

Locally Adaptive Label Smoothing for Predictive Churn.
CoRR, 2021

Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection.
CoRR, 2021

Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study.
Proceedings of the 14th ACM International Conference on Web Search and Data Mining (WSDM '21), 2021

Synthesizer: Rethinking Self-Attention for Transformer Models.
Proceedings of the 38th International Conference on Machine Learning, 2021

OmniNet: Omnidirectional Representations from Transformers.
Proceedings of the 38th International Conference on Machine Learning, 2021

Locally Adaptive Label Smoothing Improves Predictive Churn.
Proceedings of the 38th International Conference on Machine Learning, 2021

HyperGrid Transformers: Towards A Single Model for Multiple Tasks.
Proceedings of the 9th International Conference on Learning Representations, 2021

Long Range Arena: A Benchmark for Efficient Transformers.
Proceedings of the 9th International Conference on Learning Representations, 2021

Are Pretrained Convolutions Better than Pretrained Transformers?
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections.
CoRR, 2020

Choppy: Cut Transformer for Ranked List Truncation.
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020

Sparse Sinkhorn Attention.
Proceedings of the 37th International Conference on Machine Learning, 2020

Deep k-NN for Noisy Labels.
Proceedings of the 37th International Conference on Machine Learning, 2020

Reverse Engineering Configurations of Neural Text Generation Models.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2018
Diminishing Returns Shape Constraints for Interpretability and Regularization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
