Mostafa Dehghani

ORCID: 0000-0002-9772-1095

Affiliations:
  • Google
  • University of Amsterdam, The Netherlands
  • University of Tehran, Iran


According to our database, Mostafa Dehghani authored at least 99 papers between 2012 and 2024.

Bibliography

2024
Frozen Feature Augmentation for Few-Shot Image Classification.
CoRR, 2024

Fractal Patterns May Unravel the Intelligence in Next-Token Prediction.
CoRR, 2024

2023
PolyViT: Co-training Vision Transformers on Images, Videos and Audio.
Trans. Mach. Learn. Res., 2023

Dual PatchNorm.
Trans. Mach. Learn. Res., 2023

Efficient Transformers: A Survey.
ACM Comput. Surv., 2023

Parameter-Efficient Multilingual Summarisation: An Empirical Study.
CoRR, 2023

How (not) to ensemble LVLMs for VQA.
CoRR, 2023

Group Membership Bias.
CoRR, 2023

PaLI-X: On Scaling up a Multilingual Vision and Language Model.
CoRR, 2023

PaLM 2 Technical Report.
CoRR, 2023

End-to-End Spatio-Temporal Action Localisation with Video Transformers.
CoRR, 2023

Scaling Vision Transformers to 22 Billion Parameters.
CoRR, 2023

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Adaptive Computation with Elastic Input Sequence.
Proceedings of the International Conference on Machine Learning, 2023


UL2: Unifying Language Learning Paradigms.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

L2 Norm Guided Adaptive Computation.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Λ-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Transcending Scaling Laws with 0.1% Extra Compute.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

DSI++: Updating Transformer Memory with New Documents.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
DSI++: Updating Transformer Memory with New Documents.
CoRR, 2022

Automated Deep Aberration Detection from Chromosome Karyotype Images.
CoRR, 2022

Scaling Instruction-Finetuned Language Models.
CoRR, 2022

Λ-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells.
CoRR, 2022

Confident Adaptive Language Modeling.
CoRR, 2022

Beyond Transfer Learning: Co-finetuning for Action Localisation.
CoRR, 2022

Simple Open-Vocabulary Object Detection with Vision Transformers.
CoRR, 2022

Unifying Language Learning Paradigms.
CoRR, 2022

Retrieval-Enhanced Machine Learning.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Transformer Memory as a Differentiable Search Index.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Confident Adaptive Language Modeling.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Scale Efficiently: Insights from Pretraining and Finetuning Transformers.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Discrete Representations Strengthen Vision Transformer Robustness.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Exploring the Limits of Large Scale Pre-training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

The Efficiency Misnomer.
Proceedings of the Tenth International Conference on Learning Representations, 2022


SCENIC: A JAX Library for Computer Vision Research and Beyond.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Intersection of Parallels as an Early Stopping Criterion.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Learning to rank for multi-label text classification: Combining different sources of information.
Nat. Lang. Eng., 2021

VUT: Versatile UI Transformer for Multi-Modal Multi-Task User Interface Modeling.
CoRR, 2021

PolyViT: Co-training Vision Transformers on Images, Videos and Audio.
CoRR, 2021

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers.
CoRR, 2021

The Benchmark Lottery.
CoRR, 2021

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
CoRR, 2021

Gradual Domain Adaptation in the Wild: When Intermediate Distributions are Absent.
CoRR, 2021

Are Pre-trained Convolutions Better than Pre-trained Transformers?
CoRR, 2021

TokenLearner: Adaptive Space-Time Tokenization for Videos.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

OmniNet: Omnidirectional Representations from Transformers.
Proceedings of the 38th International Conference on Machine Learning, 2021

Long Range Arena: A Benchmark for Efficient Transformers.
Proceedings of the 9th International Conference on Learning Representations, 2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.
Proceedings of the 9th International Conference on Learning Representations, 2021

IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression.
Proceedings of the 9th International Conference on Learning Representations, 2021

ViViT: A Video Vision Transformer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Are Pretrained Convolutions Better than Pretrained Transformers?
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Transferring Inductive Biases through Knowledge Distillation.
CoRR, 2020

MetNet: A Neural Weather Model for Precipitation Forecasting.
CoRR, 2020

2019
HiTR: Hierarchical Topic Model Re-Estimation for Measuring Topical Diversity of Documents.
IEEE Trans. Knowl. Data Eng., 2019

Universal Transformers.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learning to Transform, Combine, and Reason in Open-Domain Question Answering.
Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019) and the 28th Belgian Dutch Conference on Machine Learning (Benelearn 2019), 2019

2018
Expert finding by the Dempster-Shafer theory for evidence combination.
Expert Syst. J. Knowl. Eng., 2018

Learning to Rank from Samples of Variable Quality.
CoRR, 2018

SIGIR 2018 Workshop on Learning from Limited or Noisy Data for Information Retrieval.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

Fidelity-Weighted Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018

From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

2017
Toward Document Understanding for Information Retrieval.
SIGIR Forum, 2017

Learning to Learn from Weak Supervision by Full Supervision.
CoRR, 2017

Avoiding Your Teacher's Mistakes: Training Neural Networks with Controlled Weak Supervision.
CoRR, 2017

Share your Model instead of your Data: Privacy Preserving Mimic Learning for Ranking.
CoRR, 2017

Neural Networks for Information Retrieval.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Neural Ranking Models with Weak Supervision.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

On Search Powered Navigation.
Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval, 2017

Hierarchical Re-estimation of Topic Models for Measuring Topical Diversity.
Proceedings of the Advances in Information Retrieval, 2017

Learning to Attend, Copy, and Generate for Session-Based Query Suggestion.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Words are Malleable: Computing Semantic Shifts in Political and Media Discourse.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Telling How to Narrow it Down: Browsing Path Recommendation for Exploratory Search.
Proceedings of the 2017 Conference on Human Information Interaction and Retrieval, 2017

2016
Building a multi-domain comparable corpus using a learning to rank method.
Nat. Lang. Eng., 2016

Alecsa: Attentive Learning for Email Categorization using Structural Aspects.
Knowl. Based Syst., 2016

Significant Words Language Models for Contextual Suggestion.
Proceedings of The Twenty-Fifth Text REtrieval Conference, 2016

Significant Words Representations of Entities.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

On Horizontal and Vertical Separation in Hierarchical Text Classification.
Proceedings of the 2016 ACM on International Conference on the Theory of Information Retrieval, 2016

Distributional Semantics for Medical Information Extraction.
Proceedings of the Working Notes of CLEF 2016, 2016

Two-Way Parsimonious Classification Models for Evolving Hierarchies.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2016

Luhn Revisited: Significant Words Language Models.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

The Healing Power of Poison: Helpful Non-relevant Documents in Feedback.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Generalized Group Profiling for Content Customization.
Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval, 2016

2015
Parsimonious User and Group Profiling in Venue Recommendation.
Proceedings of The Twenty-Fourth Text REtrieval Conference, 2015

Time-Aware Authorship Attribution for Short Text Streams.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

Revisiting Optimal Rank Aggregation: A Dynamic Programming Approach.
Proceedings of the 2015 International Conference on The Theory of Information Retrieval, 2015

Sources of Evidence for Automatic Indexing of Political Texts.
Proceedings of the Advances in Information Retrieval, 2015

Are Topically Diverse Documents Also Interesting?
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2015

Meta Text Aligner: Text Alignment Based on Predicted Plagiarism Relation.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2015

2014
Entity linking by focusing DBpedia candidate entities.
Proceedings of the ERD'14, 2014

Authorship Identification Using Dynamic Selection of Features from Probabilistic Feature Set.
Proceedings of the Information Access Evaluation. Multilinguality, Multimodality, and Interaction, 2014

Expanded N-Grams for Semantic Text Alignment Notebook for PAN at CLEF 2014.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

2013
A learning approach for email conversation thread reconstruction.
J. Inf. Sci., 2013

A Supervised Approach for Reconstructing Thread Structure in Comments on Blogs and Online News Agencies (El enfoque supervisado para reconstrucción de la estructura de hilos en comentarios en blogs y agencias de noticias en línea).
Computación y Sistemas, 2013

2012
An Evolutionary-Based Method for Reconstructing Conversation Threads in Email Corpora.
Proceedings of the International Conference on Advances in Social Networks Analysis and Mining, 2012

