W. John Wilbur

According to our database, W. John Wilbur authored at least 151 papers between 1984 and 2024.

Bibliography

2024
AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning.
CoRR, 2024

2023
MedCPT: Contrastive Pre-trained Transformers with large-scale PubMed search logs for zero-shot biomedical information retrieval.
Bioinform., October, 2023

Comprehensively identifying Long Covid articles with human-in-the-loop machine learning.
Patterns, January, 2023

BioCPT: Contrastive Pre-trained Transformers with Large-scale PubMed Search Logs for Zero-shot Biomedical Information Retrieval.
CoRR, 2023

2022
Towards a unified search: Improving PubMed retrieval with full text.
J. Biomed. Informatics, 2022

Comprehensive identification of Long Covid articles with human-in-the-loop machine learning.
CoRR, 2022

2021
Measuring the relative importance of full text sections for information retrieval from scientific literature.
Proceedings of the 20th Workshop on Biomedical Language Processing, 2021

Improving PubMed Retrieval by Integrating Abstract and Full Text Search.
Proceedings of the AMIA 2021, American Medical Informatics Association Annual Symposium, San Diego, CA, USA, October 30, 2021, 2021

2020
Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records.
BMC Medical Informatics Decis. Mak., April, 2020

Better synonyms for enriching biomedical search.
J. Am. Medical Informatics Assoc., 2020

Navigating the landscape of COVID-19 research through literature analysis: A bird's eye view.
CoRR, 2020

2019
LitSense: making sense of biomedical literature at sentence level.
Nucleic Acids Res., 2019

PDC - a probabilistic distributional clustering algorithm: a case study on suicide articles in PubMed.
CoRR, 2019

PubMed Text Similarity Model and its application to curation efforts in the Conserved Domain Database.
Database J. Biol. Databases Curation, 2019

Evaluation of Five Sentence Similarity Models on Electronic Medical Records.
Proceedings of the 10th ACM International Conference on Bioinformatics, 2019

2018
Discovering themes in biomedical literature using a projection-based algorithm.
BMC Bioinform., 2018

A Field Sensor: computing the composition and intent of PubMed queries.
Database J. Biol. Databases Curation, 2018

SingleCite: Towards an improved Single Citation Search in PubMed.
Proceedings of the BioNLP 2018 workshop, Melbourne, Australia, July 19, 2018, 2018

MeSH-based dataset for measuring the relevance of text retrieval.
Proceedings of the BioNLP 2018 workshop, Melbourne, Australia, July 19, 2018, 2018

Sentence Similarity Measures Revisited: Ranking Sentences in PubMed Documents.
Proceedings of the 2018 ACM International Conference on Bioinformatics, 2018

2017
Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents.
J. Biomed. Informatics, 2017

The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions.
Database J. Biol. Databases Curation, 2017

2016
Bridging the Gap: a Semantic Similarity Measure between Queries and Documents.
CoRR, 2016

Meshable: searching PubMed abstracts by utilizing MeSH and MeSH-derived topical terms.
Bioinform., 2016

BioC viewer: a web-based tool for displaying and merging annotations in BioC.
Database J. Biol. Databases Curation, 2016

BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID.
Database J. Biol. Databases Curation, 2016

BioCconvert: A Conversion Tool Between BioC and PubAnnotation.
Proceedings of the Joint International Conference on Biological Ontology and BioCreative, 2016

PubTermVariants: biomedical term variants and their use for PubMed search.
Proceedings of the 15th Workshop on Biomedical Natural Language Processing, 2016

2015
Optimizing graph-based patterns to extract biomedical events from the literature.
BMC Bioinform., December, 2015

Extracting drug-drug interactions from literature using a rich feature-based linear kernel approach.
J. Biomed. Informatics, 2015

Identifying named entities from PubMed® for enriching semantic categories.
BMC Bioinform., 2015

Summarizing Topical Contents from PubMed Documents Using a Thematic Analysis.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

2014
Author name disambiguation for PubMed.
J. Assoc. Inf. Sci. Technol., 2014

Retro: concept-based clustering of biomedical topical sets.
Bioinform., 2014

BioC implementations in Go, Perl, Python and Ruby.
Database J. Biol. Databases Curation, 2014

Assisting manual literature curation for protein-protein interactions using BioQRator.
Database J. Biol. Databases Curation, 2014

Finding abbreviations in biomedical literature: three BioC-compatible modules and four BioC-formatted corpora.
Database J. Biol. Databases Curation, 2014

Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.
Database J. Biol. Databases Curation, 2014

BioC interoperability track overview.
Database J. Biol. Databases Curation, 2014

BioCreative-IV virtual issue.
Database J. Biol. Databases Curation, 2014

Stochastic Gradient Descent and the Prediction of MeSH for PubMed Records.
Proceedings of the AMIA 2014, 2014

2013
BioC: a minimalist approach to interoperability for biomedical text processing.
Database J. Biol. Databases Curation, 2013

An overview of the BioCreative 2012 Workshop Track III: interactive text mining task.
Database J. Biol. Databases Curation, 2013

BioNLP Shared Task 2013: Supporting Resources.
Proceedings of the BioNLP Shared Task 2013 Workshop, Sofia, 2013

Extracting Biomedical Events and Modifications Using Subgraph Matching with Noisy Training Data.
Proceedings of the BioNLP Shared Task 2013 Workshop, Sofia, 2013

Generalizing an Approximate Subgraph Matching-based System to Extract Events in Molecular Biology and Cancer Genetics.
Proceedings of the BioNLP Shared Task 2013 Workshop, Sofia, 2013

BioC: A Minimalist Approach to Interoperability for Biomedical Text Processing.
Proceedings of the AMIA 2013, 2013

2012
Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res., 2012

Identifying well-formed biomedical phrases in MEDLINE® text.
J. Biomed. Informatics, 2012

Finding biomedical categories in Medline®.
J. Biomed. Semant., 2012

Thematic clustering of text documents using an EM-based approach.
J. Biomed. Semant., 2012

PIE the search: searching PubMed literature for protein interaction information.
Bioinform., 2012

BioCreative-2012 Virtual Issue.
Database J. Biol. Databases Curation, 2012

Improving links between literature and biological data with text mining: a case study with GEO, PDB and MEDLINE.
Database J. Biol. Databases Curation, 2012

Prioritizing PubMed articles for the Comparative Toxicogenomic Database utilizing semantic information.
Database J. Biol. Databases Curation, 2012

Classifying Gene Sentences in Biomedical Literature by Combining High-Precision Gene Identifiers.
Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, 2012

Automatic Identification of Key Concepts in Large PubMed Retrievals.
Proceedings of the Information Retrieval and Knowledge Discovery in Biomedical Text, 2012

PROBE: Periodic Random Orbiter Algorithm for Machine Learning.
Proceedings of the Information Retrieval and Knowledge Discovery in Biomedical Text, 2012

2011
Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res., 2011

Machine learning with naturally labeled data for identifying abbreviation definitions.
BMC Bioinform., 2011

Improving a gold standard: treating human relevance judgments of MEDLINE document pairs.
BMC Bioinform., 2011

The gene normalization task in BioCreative III.
BMC Bioinform., 2011

The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text.
BMC Bioinform., 2011

Classifying protein-protein interaction articles using word and syntactic features.
BMC Bioinform., 2011

Overview of the BioCreative III Workshop.
BMC Bioinform., 2011

Extraction of data deposition statements from the literature: a method for automatically tracking research results.
Bioinform., 2011

Comparison of Two Methods for Finding Biomedical Categories in Medline.
Proceedings of the 10th International Conference on Machine Learning and Applications and Workshops, 2011

An EM Clustering Algorithm which Produces a Dual Representation.
Proceedings of the 10th International Conference on Machine Learning and Applications and Workshops, 2011

Text Mining Techniques for Leveraging Positively Labeled Data.
Proceedings of the 2011 Workshop on Biomedical Natural Language Processing, 2011

Automatic extraction of data deposition statements: where do the research results go?
Proceedings of the 2011 Workshop on Biomedical Natural Language Processing, 2011

2010
Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res., 2010

Finding related sentence pairs in MEDLINE.
Inf. Retr., 2010

Identifying Abbreviation Definitions Machine Learning with Naturally Labeled Data.
Proceedings of the Ninth International Conference on Machine Learning and Applications, 2010

2009
How to Get the Most out of Your Curation Effort.
PLoS Comput. Biol., 2009

The value of parsing as feature generation for gene mention recognition.
J. Biomed. Informatics, 2009

Improving accuracy for identifying related PubMed queries by an integrated approach.
J. Biomed. Informatics, 2009

How to interpret PubMed queries and why it matters.
J. Assoc. Inf. Sci. Technol., 2009

Viewpoint Paper: Evaluating Relevance Ranking Strategies for MEDLINE Retrieval.
J. Am. Medical Informatics Assoc., 2009

The ineffectiveness of within-document term frequency in text classification.
Inf. Retr., 2009

Evaluation of query expansion using MeSH in PubMed.
Inf. Retr., 2009

Modeling actions of PubMed users with n-gram language models.
Inf. Retr., 2009

Identifying related journals through log analysis.
Bioinform., 2009

Users' adjustments to unsuccessful queries in biomedical search.
Proceedings of the 2009 Joint International Conference on Digital Libraries, 2009

Exploring Two Biomedical Text Genres for Disease Recognition.
Proceedings of the BioNLP Workshop, BioNLP@HLT-NAACL 2009, 2009

Finding Query Suggestions for PubMed.
Proceedings of the AMIA 2009, 2009

2008
Research Paper: Optimal Training Sets for Bayesian Prediction of MeSH® Assignment.
J. Am. Medical Informatics Assoc., 2008

Navigating information spaces: A case study of related article search in PubMed.
Inf. Process. Manag., 2008

Abbreviation definition identification based on automatic precision estimates.
BMC Bioinform., 2008

Multi-dimensional classification of biomedical text: Toward automated, practical provision of high-utility text to diverse users.
Bioinform., 2008

Evaluating Relevance Ranking Strategies for MEDLINE Retrieval.
Proceedings of the AMIA 2008, 2008

2007
SplicePort - An interactive splice-site analysis tool.
Nucleic Acids Res., 2007

Using MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles.
J. Biomed. Informatics, 2007

Syntactic sentence compression in the biomedical domain: facilitating access to related articles.
Inf. Retr., 2007

PubMed related articles: a probabilistic topic-based model for content similarity.
BMC Bioinform., 2007

Features generated for computational splice-site prediction correspond to functional elements.
BMC Bioinform., 2007

Combining Resources to Find Answers to Biomedical Questions.
Proceedings of The Sixteenth Text REtrieval Conference, 2007

Characterizing RNA Secondary-Structure Features and Their Effects on Splice-Site Prediction.
Proceedings of the Workshops Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Unsupervised Learning of the Morpho-Semantic Relationship in MEDLINE.
Proceedings of the Biological, translational, and clinical language processing, 2007

2006
A large scale, corpus-based approach for automatically disambiguating biomedical abbreviations.
ACM Trans. Inf. Syst., 2006

The importance of the lexicon in tagging biological text.
Nat. Lang. Eng., 2006

Spelling correction in the PubMed search engine.
Inf. Retr., 2006

New directions in biomedical text annotation: definitions, guidelines and corpus construction.
BMC Bioinform., 2006

Finding Relevant Passages in Scientific Articles: Fusion of Automatic Approaches vs. an Interactive Team Effort.
Proceedings of the Fifteenth Text REtrieval Conference, 2006

A Feature Generation Algorithm for Sequences with Application to Splice-Site Prediction.
Proceedings of the Knowledge Discovery in Databases: PKDD 2006, 2006

A Priority Model for Named Entities.
Proceedings of the Workshop on Linking Natural Language and Biology, 2006

SemCat: Semantically Categorized Entities for Genomics.
Proceedings of the AMIA 2006, 2006

2005
The Synergy Between PAV and AdaBoost.
Mach. Learn., 2005

GENETAG: a tagged corpus for gene/protein named entity recognition.
BMC Bioinform., 2005

Fusion of Knowledge-Intensive and Statistical Approaches for Retrieving and Annotating Textual Genomics Documents.
Proceedings of the Fourteenth Text REtrieval Conference, 2005

MedTag: A Collection of Biomedical Annotations.
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, 2005

A Strategy for Assigning New Concepts in the MEDLINE Database.
Proceedings of the AMIA 2005, 2005

2004
GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data.
J. Biomed. Informatics, 2004

Generation of a Large Gene/protein Lexicon by Morphological Pattern Analysis.
J. Bioinform. Comput. Biol., 2004

Non-word identification or spell checking without a dictionary.
J. Assoc. Inf. Sci. Technol., 2004

Identification of related gene/protein names based on an HMM of name variations.
Comput. Biol. Chem., 2004

Retrieving definitional content for ontology development.
Comput. Biol. Chem., 2004

MedPost: a part-of-speech tagger for bioMedical text.
Bioinform., 2004

Knowledge-Intensive and Statistical Approaches to the Retrieval and Annotation of Genomics MEDLINE Citations.
Proceedings of the Thirteenth Text REtrieval Conference, 2004

Using MEDLINE as a Knowledge Source for Disambiguating Abbreviations in Full-Text Biomedical Journal Articles.
Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems (CBMS 2004), 2004

2003
Hidden Markov models and optimized sequence alignments.
Comput. Biol. Chem., 2003

The Dimensions of Indexing.
Proceedings of the AMIA 2003, 2003

2002
Automatically identifying gene/protein terms in MEDLINE abstracts.
J. Biomed. Informatics, 2002

Tagging gene and protein names in biomedical text.
Bioinform., 2002

A Thematic Analysis of the AIDS Literature.
Proceedings of the 7th Pacific Symposium on Biocomputing, 2002

Automatic extraction of gene and protein synonyms from MEDLINE and journal articles.
Proceedings of the AMIA 2002, 2002

DNA splice site detection: a comparison of specific and general methods.
Proceedings of the AMIA 2002, 2002

Tagging gene and protein names in full text articles.
Proceedings of the ACL 2002 Workshop on Natural Language Processing in the Biomedical Domain, 2002

2001
Global term weights for document retrieval learned from TREC data.
J. Inf. Sci., 2001

Corpus-based statistical screening for content-bearing terms.
J. Assoc. Inf. Sci. Technol., 2001

Amino Acid Residue Environments and Predictions of Residue Type.
Comput. Chem., 2001

Automatic MeSH term assignment and quality assessment.
Proceedings of the AMIA 2001, 2001

2000
Research Paper: Corpus-based Statistical Screening for Phrase Identification.
J. Am. Medical Informatics Assoc., 2000

A Theory of Information with Special Application to Search Problems.
Comput. Chem., 2000

Genes, Themes, and Microarrays: Using Information Retrieval for Large-Scale Gene Analysis.
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology, 2000

Boosting naïve Bayesian learning on a large subset of MEDLINE.
Proceedings of the AMIA 2000, 2000


Finding Themes in Medline Documents: Probabilistic Similarity Search.
Proceedings of IEEE Advances in Digital Libraries 2000 (ADL 2000), 2000

1999
Analysis of biomedical text for chemical names: a comparison of three methods.
Proceedings of the AMIA 1999, 1999

Automated Assignment of Medical Subject Headings.
Proceedings of the AMIA 1999, 1999

1998
The Knowledge in Multiple Human Relevance Judgments.
ACM Trans. Inf. Syst., 1998

A Comparison of Group and Individual Performance Among Subject Experts and Untrained Workers at the Document Retrieval Task.
J. Am. Soc. Inf. Sci., 1998

1996
Using Corpus Statistics to Remove Redundant Words in Text Categorization.
J. Am. Soc. Inf. Sci., 1996

Human Subjectivity and Performance Limits in Document Retrieval.
Inf. Process. Manag., 1996

An analysis of statistical term strength and its use in the indexing and retrieval of molecular biology texts.
Comput. Biol. Medicine, 1996

1994
Non-parametric significance tests of retrieval performance comparisons.
J. Inf. Sci., 1994

The Effectiveness of Document Neighboring in Search Enhancement.
Inf. Process. Manag., 1994

1993
Retrieval Testing with Hypergeometric Document Models.
J. Am. Soc. Inf. Sci., 1993

1992
The automatic identification of stop words.
J. Inf. Sci., 1992

Retrieval Testing by the Comparison of Statistically Independent Retrieval Methods.
J. Am. Soc. Inf. Sci., 1992

An information measure of retrieval performance.
Inf. Syst., 1992

1984
On the statistical significance of nucleic acid similarities.
Nucleic Acids Res., 1984

