Arman Cohan

According to our database, Arman Cohan authored at least 104 papers between 2014 and 2024.

Bibliography

2024
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions.
CoRR, 2024

On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization.
CoRR, 2024

Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models.
CoRR, 2024

Calibrating Long-form Generations from Large Language Models.
CoRR, 2024

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science.
CoRR, 2024

OLMo: Accelerating the Science of Language Models.
CoRR, 2024

When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

TESS: Text-to-Text Self-Conditioned Simplex Diffusion.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023
Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers.
CoRR, 2023

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.
CoRR, 2023

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks.
CoRR, 2023

DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data.
CoRR, 2023

KnowledgeMath: Knowledge-Intensive Math Word Problem Solving in Finance Domains.
CoRR, 2023

Investigating Data Contamination in Modern Benchmarks for Large Language Models.
CoRR, 2023

Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders.
CoRR, 2023

On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering.
CoRR, 2023

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization.
CoRR, 2023

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models.
CoRR, 2023

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
CoRR, 2023

ODSum: New Benchmarks for Open Domain Multi-Document Summarization.
CoRR, 2023

Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers.
CoRR, 2023

A Controllable QA-based Framework for Decontextualization.
CoRR, 2023

QTSumm: A New Benchmark for Query-Focused Table Summarization.
CoRR, 2023

On Learning to Summarize with Large Language Models as References.
CoRR, 2023

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies.
CoRR, 2023

Inference-time Re-ranker Relevance Feedback for Neural Information Retrieval.
CoRR, 2023

TESS: Text-to-Text Self-Conditioned Simplex Diffusion.
CoRR, 2023

The Semantic Scholar Open Data Platform.
CoRR, 2023

SciRepEval: A Multi-Format Benchmark for Scientific Document Representations.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Medical Text Simplification: Optimizing for Readability with Unlikelihood Training and Reranked Beam Search Decoding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

QTSumm: Query-Focused Summarization over Tabular Data.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Embedding Recycling for Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

OpenRT: An Open-source Framework for Reasoning Over Tabular Data.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Aligning Factual Consistency for Clinical Studies Summarization through Reinforcement Learning.
Proceedings of the 5th Clinical Natural Language Processing Workshop, 2023

2022
ABNIRML: Analyzing the Behavior of Neural IR Models.
Trans. Assoc. Comput. Linguistics, 2022

Exploring the Challenges of Open Domain Multi-Document Summarization.
CoRR, 2022

MultiVerS: Improving scientific claim verification with weak supervision and full-document context.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Long Context Question Answering via Supervised Contrastive Learning.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

SciFact-Open: Towards open-domain scientific claim verification.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Overview of the First Shared Task on Multi Perspective Scientific Document Summarization (MuP).
Proceedings of the Third Workshop on Scholarly Document Processing, 2022

Overview of the Third Workshop on Scholarly Document Processing.
Proceedings of the Third Workshop on Scholarly Document Processing, 2022

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Improving the Generalizability of Depression Detection by Leveraging Clinical Questionnaires.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Zero- and Few-Shot NLP with Pretrained Language Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

Generating Scientific Claims for Zero-Shot Scientific Fact Checking.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
ParsiNLU: A Suite of Language Understanding Challenges for Persian.
Trans. Assoc. Comput. Linguistics, 2021

Utilizing Evidence Spans via Sequence-Level Contrastive Learning for Long-Context Question Answering.
CoRR, 2021

LongChecker: Improving scientific claim verification by modeling full-abstract context.
CoRR, 2021

PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization.
CoRR, 2021

Cross-Document Language Modeling.
CoRR, 2021

Simplified Data Wrangling with ir_datasets.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

FLEX: Unifying Evaluation for Few-Shot NLP.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

CDLM: Cross-Document Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

On Generating Extended Summaries of Long Documents.
Proceedings of the Workshop on Scientific Document Understanding co-located with 35th AAAI Conference on Artificial Intelligence, 2021

2020
SLEDGE: A Simple Yet Effective Baseline for Coronavirus Scientific Knowledge Search.
CoRR, 2020

Longformer: The Long-Document Transformer.
CoRR, 2020

Fact or Fiction: Verifying Scientific Claims.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

GUIR @ LongSumm 2020: Learning to Generate Long Summaries from Scientific Documents.
Proceedings of the First Workshop on Scholarly Document Processing, 2020

TLDR: Extreme Summarization of Scientific Documents.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Ranking Significant Discrepancies in Clinical Reports.
Proceedings of the Advances in Information Retrieval, 2020

SUPP.AI: finding evidence for supplement-drug interactions.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

SPECTER: Document-level Representation Learning using Citation-informed Transformers.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Overcoming low-utility facets for complex answer retrieval.
Inf. Retr. J., 2019

Extracting evidence of supplement-drug interactions from literature.
CoRR, 2019

SciBERT: Pretrained Contextualized Embeddings for Scientific Text.
CoRR, 2019

CEDR: Contextualized Embeddings for Document Ranking.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Ontology-Aware Clinical Abstractive Summarization.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Structural Scaffolds for Citation Intent Classification in Scientific Publications.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Pretrained Language Models for Sequential Sentence Classification.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

SciBERT: A Pretrained Language Model for Scientific Text.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Text Summarization and Categorization for Scientific and Health-Related Data.
SIGIR Forum, 2018

Scientific document summarization via citation contextualization and scientific discourse.
Int. J. Digit. Libr., 2018

Characterizing Question Facets for Complex Answer Retrieval.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

GU IRLAB at SemEval-2018 Task 7: Tree-LSTMs for Scientific Relation Classification.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Relation Extraction for Protein-protein Interactions Affected by Mutations.
Proceedings of the 2018 ACM International Conference on Bioinformatics, 2018

Helping or Hurting? Predicting Changes in Users' Risk of Self-Harm Through Online Community Interactions.
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic, 2018

RSDD-Time: Temporal Annotation of Self-Reported Mental Health Diagnoses.
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic, 2018

2017
Triaging content severity in online mental health forums.
J. Assoc. Inf. Sci. Technol., 2017

Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

GUIR at SemEval-2017 Task 12: A Framework for Cross-Domain Clinical Temporal Information Extraction.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Depression and Self-Harm Risk Assessment in Online Forums.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A Neural Attention Model for Categorizing Patient Safety Events.
Proceedings of the Advances in Information Retrieval, 2017

Identifying Harm Events in Clinical Care through Medical Narratives.
Proceedings of the 8th ACM International Conference on Bioinformatics, 2017

2016
GUIR at SemEval-2016 task 12: Temporal Information Processing for Clinical Narratives.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Triaging Mental Health Forum Posts.
Proceedings of the 3rd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2016

Revisiting Summarization Evaluation for Scientific Articles.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2015
Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Scientific Article Summarization Using Citation-Context and Article's Discourse Structure.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Retrieving Medical Literature for Clinical Decision Support.
Proceedings of the Advances in Information Retrieval, 2015

2014
Query Reformulation for Clinical Decision Support Search.
Proceedings of The Twenty-Third Text REtrieval Conference, 2014

On clinical decision support.
Proceedings of the 5th ACM Conference on Bioinformatics, 2014

