Kyle Lo

Orcid: 0000-0002-1804-2853

Affiliations:
  • Allen Institute for Artificial Intelligence, Seattle, Washington, USA


According to our database, Kyle Lo authored at least 74 papers between 2018 and 2024.

Bibliography

2024
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions.
CoRR, 2024

Know Your Audience: The benefits and pitfalls of generating plain language summaries beyond the "general" audience.
CoRR, 2024

KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions.
CoRR, 2024

OLMo: Accelerating the Science of Language Models.
CoRR, 2024

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research.
CoRR, 2024

InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification.
CoRR, 2024

When do Generative Query and Document Expansions Fail? A Comprehensive Study Across Methods, Retrievers, and Datasets.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023
LIMEADE: From AI Explanations to Advice Taking.
ACM Trans. Interact. Intell. Syst., December, 2023

Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing.
ACM Trans. Comput. Hum. Interact., October, 2023

Paloma: A Benchmark for Evaluating Language Model Fit.
CoRR, 2023

Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders.
CoRR, 2023

The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices.
CoRR, 2023

BooookScore: A systematic exploration of book-length summarization in the era of LLMs.
CoRR, 2023

Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation.
CoRR, 2023

A Controllable QA-based Framework for Decontextualization.
CoRR, 2023

Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction.
CoRR, 2023

Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks.
CoRR, 2023

The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces.
CoRR, 2023

The Semantic Scholar Open Data Platform.
CoRR, 2023

Scim: Intelligent Skimming Support for Scientific Papers.
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023

A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Decomposing Complex Queries for Tip-of-the-tongue Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context.
Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023

Are Layout-Infused Language Models Robust to Layout Distribution Shifts? A Case Study with Scientific Documents.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups.
Trans. Assoc. Comput. Linguistics, 2022

Exploring the Challenges of Open Domain Multi-Document Summarization.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
CoRR, 2022

Scim: Intelligent Faceted Highlights for Interactive, Multi-Pass Skimming of Scientific Papers.
CoRR, 2022

Infrastructure for Rapid Open Knowledge Network Development.
AI Mag., 2022

Multi-LexSum: Real-world Summaries of Civil Rights Lawsuits at Multiple Granularities.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022


MultiVerS: Improving scientific claim verification with weak supervision and full-document context.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

MultiCite: Modeling realistic citations requires moving beyond the single-sentence single-label setting.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

SciFact-Open: Towards open-domain scientific claim verification.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

ACCoRD: A Multi-Document Approach to Generating Diverse Descriptions of Scientific Concepts.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Overview of the Third Workshop on Scholarly Document Processing.
Proceedings of the Third Workshop on Scholarly Document Processing, 2022

Exploring the Role of Local and Global Explanations in Recommender Systems.
Proceedings of the CHI '22: CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 29 April 2022, 2022

Generating Scientific Claims for Zero-Shot Scientific Fact Checking.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Searching for scientific evidence in a pandemic: An overview of TREC-COVID.
J. Biomed. Informatics, 2021

Harnessing the Power of Smart and Connected Health to Tackle COVID-19: IoT, AI, Robotics, and Blockchain for a Better World.
IEEE Internet Things J., 2021

LongChecker: Improving scientific claim verification by modeling full-abstract context.
CoRR, 2021

Overview and Insights from the SciVer Shared Task on Scientific Claim Verification.
CoRR, 2021

Incorporating Visual Layout Structures for Scientific Text Classification.
CoRR, 2021

Text mining approaches for dealing with the rapidly expanding literature on COVID-19.
Briefings Bioinform., 2021

FLEX: Unifying Evaluation for Few-Shot NLP.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Discourse Understanding and Factual Consistency in Abstractive Summarization.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Explaining Relationships Between Scientific Documents.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
TREC-COVID: constructing a pandemic information retrieval test collection.
SIGIR Forum, 2020

TREC-COVID: rationale and structure of an information retrieval shared task for COVID-19.
J. Am. Medical Informatics Assoc., 2020

Mitigating Biases in CORD-19 for Analyzing COVID-19 Literature.
Frontiers Res. Metrics Anal., 2020

CORD-19: The Covid-19 Open Research Dataset.
CoRR, 2020

Explanation-Based Tuning of Opaque Machine Learners with Application to Paper Recommendation.
CoRR, 2020

Citation Text Generation.
CoRR, 2020

Fact or Fiction: Verifying Scientific Claims.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Document-Level Definition Detection in Scholarly Documents: Existing Models, Error Analyses, and Future Directions.
Proceedings of the First Workshop on Scholarly Document Processing, 2020

TLDR: Extreme Summarization of Scientific Documents.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

The COVID-19 Open Research Dataset - Abstract.
Proceedings of the Workshop on Semantic Indexing and Information Retrieval for Health from heterogeneous content types and languages co-located with 42nd European Conference on Information Retrieval, 2020

S2ORC: The Semantic Scholar Open Research Corpus.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
GORC: A large contextual citation graph of academic papers.
CoRR, 2019

Cooperative Generator-Discriminator Networks for Abstractive Summarization with Narrative Flow.
CoRR, 2019

SciBERT: Pretrained Contextualized Embeddings for Scientific Text.
CoRR, 2019

Combining Distant and Direct Supervision for Neural Relation Extraction.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

SciBERT: A Pretrained Language Model for Scientific Text.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Improving Distant Supervision with Maxpooled Attention and Sentence-Level Supervision.
CoRR, 2018

Citation Count Analysis for Papers with Preprints.
CoRR, 2018

Construction of the Literature Graph in Semantic Scholar.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Ontology alignment in the biomedical domain using entity definitions and context.
Proceedings of the BioNLP 2018 workshop, Melbourne, Australia, July 19, 2018, 2018

