Madian Khabsa

According to our database, Madian Khabsa authored at least 81 papers between 2010 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models.

[DOI]

,

,

,

,

,

,

,

,

CoRR, October, 2025

Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation.

[DOI]

,

,

,

,

,

,

,

,

,

Michihiro Yasunaga

,

,

Xi Victoria Lin

,

CoRR, June, 2025

High Accuracy, Less Talk (HALT): Reliable LLMs through Capability-Aligned Finetuning.

[DOI]

,

Archie Sravankumar

,

,

,

,

,

Jakob N. Foerster

,

Luke Zettlemoyer

,

CoRR, June, 2025

Diversity-driven Data Selection for Language Model Tuning through Sparse Autoencoder.

[DOI]

,

,

,

Suchin Gururangan

,

,

,

,

CoRR, February, 2025

2024

On the Equivalence of Graph Convolution and Mixup.

[DOI]

,

,

,

,

,

,

,

Karthik Abinav Sankararaman

,

,

,

,

Trans. Mach. Learn. Res., 2024

Preference Optimization with Multi-Sample Comparisons.

[DOI]

,

,

,

Karthik Abinav Sankararaman

,

,

,

,

,

,

,

CoRR, 2024

The Perfect Blend: Redefining RLHF with Mixture of Judges.

[DOI]

CoRR, 2024

Effective Long-Context Scaling of Foundation Models.

[DOI]

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

MART: Improving LLM Safety with Multi-round Automatic Red-Teaming.

[DOI]

,

,

,

,

,

,

,

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants.

[DOI]

Lucas Bandarkar

,

,

Benjamin Muller

,

,

Satya Narayan Shukla

,

,

,

Abhinandan Krishnan

,

Luke Zettlemoyer

,

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations.

[DOI]

,

Kartikeya Upasani

,

,

,

,

,

Michael Tontchev

,

,

,

Davide Testuggine

,

CoRR, 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models.

[DOI]

,

,

,

,

Amjad Almahairi

,

,

Nikolay Bashlykov

,

,

Prajjwal Bhargava

,

,

,

,

Cristian Canton-Ferrer

,

,

Guillem Cucurull

,

,

,

,

,

,

,

Vedanuj Goswami

,

,

Anthony Hartshorn

,

Saghar Hosseini

,

,

,

,

,

,

Isabel Kloumann

,

,

Punit Singh Koura

,

Marie-Anne Lachaux

,

,

,

Diana Liskovich

,

,

,

Xavier Martinet

,

,

,

,

,

,

Jeremy Reizenstein

,

,

,

,

,

Eric Michael Smith

,

Ranjan Subramanian

,

Xiaoqing Ellen Tan

,

,

,

,

Jian Xiang Kuan

,

,

,

,

,

,

Melanie Kambadur

,

,

Aurélien Rodriguez

,

,

,

CoRR, 2023

MMViT: Multiscale Multiview Vision Transformers.

[DOI]

,

,

,

,

,

,

,

,

,

Donald S. Williamson

,

CoRR, 2023

SVT: Supertoken Video Transformer for Efficient Video Understanding.

[DOI]

,

,

,

,

Senem Velipasalar

,

CoRR, 2023

Progressive Prompts: Continual Learning for Language Models.

[DOI]

Anastasia Razdaibiedina

,

,

,

,

,

Amjad Almahairi

Proceedings of the Eleventh International Conference on Learning Representations, 2023

XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models.

[DOI]

,

,

,

,

,

Marjan Ghazvininejad

,

Luke Zettlemoyer

,

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training.

[DOI]

,

,

,

,

,

,

,

,

,

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Generating Hashtags for Short-form Videos with Guided Signals.

[DOI]

,

,

,

,

,

,

,

,

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

MixPAVE: Mix-Prompt Tuning for Few-shot Product Attribute Value Extraction.

[DOI]

,

,

,

,

,

,

,

,

,

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

MUSTIE: Multimodal Structural Transformer for Web Information Extraction.

[DOI]

,

,

,

,

,

,

,

,

,

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Residual Prompt Tuning: improving prompt tuning with residual reparameterization.

[DOI]

Anastasia Razdaibiedina

,

,

,

,

,

,

Amjad Almahairi

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI.

[DOI]

,

,

Amjad Almahairi

,

,

Luke Zettlemoyer

,

Lambert Mathias

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Uniform Masking Prevails in Vision-Language Pretraining.

[DOI]

Siddharth Verma

,

,

,

,

,

,

Amjad Almahairi

CoRR, 2022

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models.

[DOI]

,

,

,

,

,

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Quantifying Adaptability in Pre-trained Language Models with 500 Tasks.

[DOI]

,

,

,

Luke Zettlemoyer

,

,

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

SMARTAVE: Structured Multimodal Transformer for Product Attribute Value Extraction.

[DOI]

,

,

,

,

,

,

,

,

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning.

[DOI]

,

Lambert Mathias

,

,

Amjad Almahairi

,

,

,

,

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision.

[DOI]

Ajinkya Tejankar

,

,

,

,

,

Hamed Pirsiavash

,

CoRR, 2021

Sparse Distillation: Speeding Up Text Classification by Using Bigger Models.

[DOI]

,

,

,

,

,

CoRR, 2021

Entailment as Few-Shot Learner.

[DOI]

,

,

,

,

CoRR, 2021

Towards Few-Shot Fact-Checking via Perplexity.

[DOI]

,

,

,

,

CoRR, 2021

On Unifying Misinformation Detection.

[DOI]

,

,

,

,

,

,

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

On the Influence of Masking Policies in Intermediate Pre-training.

[DOI]

,

,

,

,

,

,

,

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020

Studying Strategically: Learning to Mask for Closed-book QA.

[DOI]

,

,

,

,

,

,

,

CoRR, 2020

CLEAR: Contrastive Learning for Sentence Representation.

[DOI]

,

,

,

,

,

CoRR, 2020

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks.

[DOI]

,

,

CoRR, 2020

Linformer: Self-Attention with Linear Complexity.

[DOI]

,

,

,

,

CoRR, 2020

Language Models as Fact Checkers?

[DOI]

,

,

,

,

,

CoRR, 2020

To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks.

[DOI]

,

,

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Keeping Notes: Conditional Natural Language Generation with a Scratchpad Mechanism.

[DOI]

Ryan Y. Benmalek

,

,

,

,

CoRR, 2019

Keeping Notes: Conditional Natural Language Generation with a Scratchpad Encoder.

[DOI]

Ryan Y. Benmalek

,

,

,

,

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Adversarial Training for Community Question Answer Selection Based on Multi-Scale Matching.

[DOI]

,

,

,

,

Ahmed Hassan Awadallah

,

,

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching.

[DOI]

,

,

,

,

Ahmed Hassan Awadallah

CoRR, 2018

Identifying Task Boundaries in Digital Assistants.

[DOI]

,

,

Ahmed Hassan Awadallah

,

,

Companion Proceedings of The Web Conference 2018, 2018

Characterizing and Supporting Question Answering in Human-to-Human Communication.

[DOI]

,

Ahmed Hassan Awadallah

,

,

,

Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

2017

Actionable Email Intent Modeling with Reparametrized RNNs.

[DOI]

,

,

,

,

Ahmed Hassan Awadallah

,

CoRR, 2017

User Interaction Sequences for Search Satisfaction Prediction.

[DOI]

Rishabh Mehrotra

,

,

Ahmed Hassan Awadallah

,

,

Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

Building Natural Language Interfaces to Web APIs.

[DOI]

,

Ahmed Hassan Awadallah

,

,

,

,

Mark J. Encarnación

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Deep Sequential Models for Task Satisfaction Prediction.

[DOI]

Rishabh Mehrotra

,

Ahmed Hassan Awadallah

,

,

,

,

,

Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

2016

Learning to identify relevant studies for systematic reviews using random forest and external information.

[DOI]

,

Ahmed K. Elmagarmid

,

,

,

Mach. Learn., 2016

Random Forest DBSCAN for USPTO Inventor Name Disambiguation.

[DOI]

,

,

CoRR, 2016

Detecting Good Abandonment in Mobile Search.

[DOI]

,

,

,

,

Ahmed Hassan Awadallah

,

Proceedings of the 25th International Conference on World Wide Web, 2016

Is This Your Final Answer?: Evaluating the Effect of Answers on Good Abandonment in Mobile Search.

[DOI]

,

,

,

,

Ahmed Hassan Awadallah

,

Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2016

Identifying Earmarks in Congressional Bills.

[DOI]

,

,

,

,

,

Christopher Berry

,

Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Inventor Name Disambiguation for a Patent Database Using a Random Forest and DBSCAN.

[DOI]

,

,

Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Towards Better Understanding of Academic Search.

[DOI]

,

,

Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Learning to Account for Good Abandonment in Search Success Metrics.

[DOI]

,

,

Ahmed Hassan Awadallah

,

,

Tasos Anastasakos

,

Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

2015

The CHEMDNER corpus of chemicals and drugs and its annotation principles.

[DOI]

Martin Krallinger

,

,

Florian Leitner

,

,

,

,

,

,

,

,

,

Riza Theresa Batista-Navarro

,

,

,

Tim Rocktäschel

,

,

,

,

,

Tsendsuren Munkhdalai

,

,

,

P. Senthil Nathan

,

,

,

,

,

Saber A. Akhondi

,

,

,

,

Utpal Kumar Sikdar

,

,

Masaharu Yoshioka

,

,

,

,

,

,

,

Ravikumar Komandur Elayavilli

,

,

Francisco M. Couto

,

,

Richard Tzong-Han Tsai

,

,

,

,

,

Isabel Segura-Bedmar

,

Paloma Martínez

,

Julen Oyarzabal

,

Alfonso Valencia

J. Cheminformatics, 2015

Chemical entity extraction using CRF and an ensemble of extractors.

[DOI]

,

J. Cheminformatics, 2015

CiteSeerX: AI in a Digital Library Search Engine.

[DOI]

,

Kyle Mark Williams

,

Hung-Hsuan Chen

,

,

Cornelia Caragea

,

Suppawong Tuarob

,

Alexander Ororbia

,

,

Prasenjit Mitra

,

AI Mag., 2015

Big Scholarly Data in CiteSeerX: Information Extraction from the Web.

[DOI]

Alexander G. Ororbia II

,

,

,

,

Clyde Lee Giles

Proceedings of the 24th International Conference on World Wide Web Companion, 2015

Automatically Generating a Concept Hierarchy with Graphs.

[DOI]

Pucktada Treeratpituk

,

,

Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

Online Person Name Disambiguation with Constraints.

[DOI]

,

Pucktada Treeratpituk

,

Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

2014

Towards building a scholarly big data platform: Challenges, lessons and opportunities.

[DOI]

,

,

,

,

Hung-Hsuan Chen

,

,

Suppawong Tuarob

,

Sagnik Ray Choudhury

,

Alexander Ororbia

,

Prasenjit Mitra

,

Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

The feasibility of investing in manual correction of metadata for a large-scale digital library.

[DOI]

Hung-Hsuan Chen

,

,

Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

A Web Service for Scholarly Big Data Information Extraction.

[DOI]

,

,

,

,

Patrick C. Shih

,

Proceedings of the 2014 IEEE International Conference on Web Services, 2014

Scholarly big data information extraction and integration in the CiteSeerX digital library.

[DOI]

,

,

Sagnik Ray Choudhury

,

,

Proceedings of the Workshops Proceedings of the 30th International Conference on Data Engineering Workshops, 2014

Migrating a Digital Library to a Private Cloud.

[DOI]

,

Pradeep B. Teregowda

,

,

,

,

,

,

Proceedings of the 2014 IEEE International Conference on Cloud Engineering, 2014

Utility-Based Control Feedback in a Digital Library Search Engine: Cases in CiteSeerX.

[DOI]

,

Alexander Ororbia

,

,

,

,

Proceedings of the 9th International Workshop on Feedback Computing, 2014

The impact of user corrections on a crawl-based digital library: A CiteSeerX perspective.

[DOI]

,

,

,

Proceedings of the 10th IEEE International Conference on Collaborative Computing: Networking, 2014

Large scale author name disambiguation in digital libraries.

[DOI]

,

Pucktada Treeratpituk

,

Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

CiteSeerX: AI in a Digital Library Search Engine.

[DOI]

,

Kyle Mark Williams

,

Hung-Hsuan Chen

,

,

Cornelia Caragea

,

Alexander Ororbia

,

,

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013

Graph-based Approach to Automatic Taxonomy Generation (GraBTax).

[DOI]

Pucktada Treeratpituk

,

,

CoRR, 2013

2012

Specialized Research Datasets in the CiteSeerX Digital Library.

[DOI]

,

Cornelia Caragea

,

Hung-Hsuan Chen

,

,

Pucktada Treeratpituk

,

,

,

Prasenjit Mitra

,

D Lib Mag., 2012

Web crawler middleware for search engine digital libraries: a case study for CiteSeerX.

[DOI]

,

Pradeep B. Teregowda

,

,

,

,

Jose San Pedro Wandelmer

,

,

Prasenjit Mitra

,

Proceedings of the Twelfth International Workshop on Web Information and Data Management, 2012

A Framework for Bridging the Gap Between Open Source Search Tools.

,

,

Sagnik Ray Choudhury

,

Proceedings of the SIGIR 2012 Workshop on Open Source Information Retrieval, 2012

Towards Building and Analyzing a Social Network of Acknowledgments in Scientific and Academic Documents.

[DOI]

,

,

Proceedings of the Social Computing, Behavioral-Cultural Modeling and Prediction, 2012

A system for indexing tables, algorithms and figures.

[DOI]

Pradeep B. Teregowda

,

,

Clyde Lee Giles

Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

AckSeer: a repository and search engine for automatically extracted acknowledgments from digital libraries.

[DOI]

,

Pucktada Treeratpituk

,

Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

Entity resolution using search engine results.

[DOI]

,

Pucktada Treeratpituk

,

Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

2010

SeerSuite: Developing a Scalable and Reliable Application Framework for Building Digital Libraries by Crawling the Web.

[DOI]

Pradeep B. Teregowda

,

Isaac G. Councill

,

Juan Pablo Fernández Ramírez

,

,

,

Proceedings of the USENIX Conference on Web Application Development, 2010
