Arman Cohan

Orcid: 0000-0002-8954-2724

According to our database, Arman Cohan authored at least 137 papers between 2014 and 2024.

Bibliography

2024
L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models.
Trans. Assoc. Comput. Linguistics, 2024

SurgeryLLM: a retrieval-augmented generation large language model framework for surgical decision support and workflow enhancement.
npj Digit. Medicine, 2024

ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain.
CoRR, 2024

FinDVer: Explainable Claim Verification over Long and Hybrid-Content Financial Documents.
CoRR, 2024

MDCure: A Scalable Pipeline for Multi-Document Instruction-Following.
CoRR, 2024

TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models.
CoRR, 2024

COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences.
CoRR, 2024

ReIFE: Re-evaluating Instruction-Following Evaluation.
CoRR, 2024

MetaMath: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models.
CoRR, 2024

RouterRetriever: Exploring the Benefits of Routing over Multiple Expert Embedding Models.
CoRR, 2024

Understanding Reference Policies in Direct Preference Optimization.
CoRR, 2024

Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation.
CoRR, 2024

Step-Back Profiling: Distilling User History for Personalized Scientific Writing.
CoRR, 2024

SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature.
CoRR, 2024

MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise.
CoRR, 2024

Evaluating LLMs at Detecting Errors in LLM Responses.
CoRR, 2024

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions.
CoRR, 2024

On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization.
CoRR, 2024

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science.
CoRR, 2024

OLMo: Accelerating the Science of Language Models.
CoRR, 2024

Struc-Bench: Are Large Language Models Good at Generating Complex Structured Tabular Data?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

On Learning to Summarize with Large Language Models as References.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Investigating Data Contamination in Modern Benchmarks for Large Language Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

NExT: Teaching Large Language Models to Reason about Code Execution.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Observable Propagation: Uncovering Feature Vectors in Transformers.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SciDQA: A Deep Reading Comprehension Dataset over Scientific Papers.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

OMG-QA: Building Open-Domain Multi-Modal Generative Question Answering Systems.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Calibrating Long-form Generations From Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024


Bayesian Calibration of Win Rate Estimation with LLM Evaluators.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

FinDVer: Explainable Claim Verification over Long and Hybrid-content Financial Documents.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

TESS: Text-to-Text Self-Conditioned Simplex Diffusion.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Rethinking Efficient Multilingual Text Summarization Meta-Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Unveiling the Spectrum of Data Contamination in Language Model: A Survey from Detection to Remediation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

KnowledgeFMath: A Knowledge-Intensive Math Reasoning Dataset in Finance Domains.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers.
CoRR, 2023

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning.
CoRR, 2023

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks.
CoRR, 2023

DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data.
CoRR, 2023

KnowledgeMath: Knowledge-Intensive Math Word Problem Solving in Finance Domains.
CoRR, 2023

Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders.
CoRR, 2023

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models.
CoRR, 2023

Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
CoRR, 2023

ODSum: New Benchmarks for Open Domain Multi-Document Summarization.
CoRR, 2023

Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers.
CoRR, 2023

A Controllable QA-based Framework for Decontextualization.
CoRR, 2023

QTSumm: A New Benchmark for Query-Focused Table Summarization.
CoRR, 2023

On Learning to Summarize with Large Language Models as References.
CoRR, 2023

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies.
CoRR, 2023

Inference-time Re-ranker Relevance Feedback for Neural Information Retrieval.
CoRR, 2023

The Semantic Scholar Open Data Platform.
CoRR, 2023

SciRepEval: A Multi-Format Benchmark for Scientific Document Representations.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Enhancing Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Medical Text Simplification: Optimizing for Readability with Unlikelihood Training and Reranked Beam Search Decoding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

QTSumm: Query-Focused Summarization over Tabular Data.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Embedding Recycling for Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

OpenRT: An Open-source Framework for Reasoning Over Tabular Data.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Aligning Factual Consistency for Clinical Studies Summarization through Reinforcement Learning.
Proceedings of the 5th Clinical Natural Language Processing Workshop, 2023

2022
ABNIRML: Analyzing the Behavior of Neural IR Models.
Trans. Assoc. Comput. Linguistics, 2022

Exploring the Challenges of Open Domain Multi-Document Summarization.
CoRR, 2022

MultiVerS: Improving scientific claim verification with weak supervision and full-document context.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Long Context Question Answering via Supervised Contrastive Learning.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

SciFact-Open: Towards open-domain scientific claim verification.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Overview of the First Shared Task on Multi Perspective Scientific Document Summarization (MuP).
Proceedings of the Third Workshop on Scholarly Document Processing, 2022

Overview of the Third Workshop on Scholarly Document Processing.
Proceedings of the Third Workshop on Scholarly Document Processing, 2022

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Improving the Generalizability of Depression Detection by Leveraging Clinical Questionnaires.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Zero- and Few-Shot NLP with Pretrained Language Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

Generating Scientific Claims for Zero-Shot Scientific Fact Checking.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
ParsiNLU: A Suite of Language Understanding Challenges for Persian.
Trans. Assoc. Comput. Linguistics, 2021

Utilizing Evidence Spans via Sequence-Level Contrastive Learning for Long-Context Question Answering.
CoRR, 2021

LongChecker: Improving scientific claim verification by modeling full-abstract context.
CoRR, 2021

PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization.
CoRR, 2021

Cross-Document Language Modeling.
CoRR, 2021

Simplified Data Wrangling with ir_datasets.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

FLEX: Unifying Evaluation for Few-Shot NLP.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

CDLM: Cross-Document Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

On Generating Extended Summaries of Long Documents.
Proceedings of the Workshop on Scientific Document Understanding co-located with 35th AAAI Conference on Artificial Intelligence, 2021

2020
SLEDGE: A Simple Yet Effective Baseline for Coronavirus Scientific Knowledge Search.
CoRR, 2020

Longformer: The Long-Document Transformer.
CoRR, 2020

Fact or Fiction: Verifying Scientific Claims.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

GUIR @ LongSumm 2020: Learning to Generate Long Summaries from Scientific Documents.
Proceedings of the First Workshop on Scholarly Document Processing, 2020

TLDR: Extreme Summarization of Scientific Documents.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Ranking Significant Discrepancies in Clinical Reports.
Proceedings of the Advances in Information Retrieval, 2020

SUPP.AI: finding evidence for supplement-drug interactions.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

SPECTER: Document-level Representation Learning using Citation-informed Transformers.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Overcoming low-utility facets for complex answer retrieval.
Inf. Retr. J., 2019

Extracting evidence of supplement-drug interactions from literature.
CoRR, 2019

SciBERT: Pretrained Contextualized Embeddings for Scientific Text.
CoRR, 2019

CEDR: Contextualized Embeddings for Document Ranking.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Ontology-Aware Clinical Abstractive Summarization.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Structural Scaffolds for Citation Intent Classification in Scientific Publications.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Pretrained Language Models for Sequential Sentence Classification.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

SciBERT: A Pretrained Language Model for Scientific Text.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Text Summarization and Categorization for Scientific and Health-Related Data.
SIGIR Forum, 2018

Scientific document summarization via citation contextualization and scientific discourse.
Int. J. Digit. Libr., 2018

Characterizing Question Facets for Complex Answer Retrieval.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

GU IRLAB at SemEval-2018 Task 7: Tree-LSTMs for Scientific Relation Classification.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Relation Extraction for Protein-protein Interactions Affected by Mutations.
Proceedings of the 2018 ACM International Conference on Bioinformatics, 2018

Helping or Hurting? Predicting Changes in Users' Risk of Self-Harm Through Online Community Interactions.
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic, 2018

RSDD-Time: Temporal Annotation of Self-Reported Mental Health Diagnoses.
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic, 2018

2017
Triaging content severity in online mental health forums.
J. Assoc. Inf. Sci. Technol., 2017

Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

GUIR at SemEval-2017 Task 12: A Framework for Cross-Domain Clinical Temporal Information Extraction.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Depression and Self-Harm Risk Assessment in Online Forums.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A Neural Attention Model for Categorizing Patient Safety Events.
Proceedings of the Advances in Information Retrieval, 2017

Identifying Harm Events in Clinical Care through Medical Narratives.
Proceedings of the 8th ACM International Conference on Bioinformatics, 2017

2016
GUIR at SemEval-2016 task 12: Temporal Information Processing for Clinical Narratives.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Triaging Mental Health Forum Posts.
Proceedings of the 3rd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2016

Revisiting Summarization Evaluation for Scientific Articles.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2015
Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Scientific Article Summarization Using Citation-Context and Article's Discourse Structure.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Retrieving Medical Literature for Clinical Decision Support.
Proceedings of the Advances in Information Retrieval, 2015

2014
Query Reformulation for Clinical Decision Support Search.
Proceedings of The Twenty-Third Text REtrieval Conference, 2014

On clinical decision support.
Proceedings of the 5th ACM Conference on Bioinformatics, 2014

