Sun Kim

Orcid: 0000-0001-5385-9546

According to our database, Sun Kim authored at least 170 papers between 1994 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Finding Highly Similar Regions of Genomic Sequences Through Homomorphic Encryption.
J. Comput. Biol., 2024

SLM as Guardian: Pioneering AI Safety with Small Language Models.
CoRR, 2024

Taxonomy and Analysis of Sensitive User Queries in Generative AI Search.
CoRR, 2024

Improving out-of-distribution generalization in graphs via hierarchical semantic environments.
CoRR, 2024

DiSCO: Diffusion Schrödinger Bridge for Molecular Conformer Optimization.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

GOAT: Gene-level biomarker discovery from multi-Omics data using graph ATtention neural network for eosinophilic asthma subtype.
Bioinform., October, 2023

A model-agnostic framework to enhance knowledge graph-based drug combination prediction with drug-drug interaction data and supervised contrastive learning.
Briefings Bioinform., September, 2023

Metheor: Ultrafast DNA methylation heterogeneity calculation from bisulfite read alignments.
PLoS Comput. Biol., March, 2023

Improved drug response prediction by drug target data integration via network-based profiling.
Briefings Bioinform., March, 2023

Deep learning-based survival prediction using DNA methylation-derived 3D genomic information.
Proceedings of the 14th ACM International Conference on Bioinformatics, 2023

Clinical Note Owns its Hierarchy: Multi-Level Hypergraph Neural Networks for Patient-Level Representation Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

MLDEG: A Machine Learning Approach to Identify Differentially Expressed Genes Using Network Property and Network Propagation.
IEEE ACM Trans. Comput. Biol. Bioinform., 2022

Generative Modeling to Predict Multiple Suitable Conditions for Chemical Reactions.
J. Chem. Inf. Model., 2022

SPGP: Structure Prototype Guided Graph Pooling.
CoRR, 2022

Triangular Contrastive Learning on Molecular Graphs.
CoRR, 2022

AutoCoV: tracking the early spread of COVID-19 in terms of the spatial and temporal patterns from embedding space by K-mer based deep learning.
BMC Bioinform., 2022

Embedding of FDA Approved Drugs in Chemical Space Using Cascade Autoencoder with Metric Learning.
Proceedings of the IEEE International Conference on Big Data and Smart Computing, 2022

Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Ranked k-Spectrum Kernel for Comparative and Evolutionary Comparison of Exons, Introns, and CpG Islands.
IEEE ACM Trans. Comput. Biol. Bioinform., 2021

A generalization of the modular equations of higher degrees.
J. Comb. Theory A, 2021

Handling Long-Tail Queries with Slice-Aware Conversational Systems.
CoRR, 2021

BioVLAB-Cancer-Pharmacogenomics: tumor heterogeneity and pharmacogenomics analysis of multi-omics data from tumor on the cloud.
Bioinform., 2021

mirTime: identifying condition-specific targets of microRNA in time-series transcript data using Gaussian process model and spherical vector clustering.
Bioinform., 2021

Erratum to: Machine learning-based analysis of multi-omics data on the cloud for investigating gene regulations.
Briefings Bioinform., 2021

Machine learning-based analysis of multi-omics data on the cloud for investigating gene regulations.
Briefings Bioinform., 2021

Towards multi-omics characterization of tumor heterogeneity: a comprehensive review of statistical and machine learning approaches.
Briefings Bioinform., 2021

IDEA: Integrating Divisive and Ensemble-Agglomerate hierarchical clustering framework for arbitrary shape data.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

A probabilistic model for pathway-guided gene set selection.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2021

TeamTat: A Collaborative Text Annotation Tool for Creating Gold-Standard Corpora.
Proceedings of the AMIA 2021, American Medical Informatics Association Annual Symposium, San Diego, CA, USA, October 30, 2021, 2021

Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records.
BMC Medical Informatics Decis. Mak., April, 2020

BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale.
PLoS Comput. Biol., 2020

TeamTat: a collaborative text annotation tool.
Nucleic Acids Res., 2020

Better synonyms for enriching biomedical search.
J. Am. Medical Informatics Assoc., 2020

Cancer subtype classification and modeling by pathway attention and propagation.
Bioinform., 2020

Comprehensive and critical evaluation of individualized pathway activity measurement tools on pan-cancer data.
Briefings Bioinform., 2020

Homomorphic Computation of Local Alignment.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

LitSense: making sense of biomedical literature at sentence level.
Nucleic Acids Res., 2019

A secure SNP panel scheme using homomorphically encrypted K-mers without SNP calling on the user side.
BMC Genom., 2019

Venn-diaNet : venn diagram based network propagation analysis framework for comparing multiple biological experiments.
BMC Bioinform., 2019

HTRgene: a computational method to perform the integrated analysis of multiple heterogeneous time-series data: case analysis of cold and heat stress response signaling genes in Arabidopsis.
BMC Bioinform., 2019

PRISM: methylation pattern-based, reference-free inference of subclonal makeup.
Bioinform., 2019

CaPSSA: visual evaluation of cancer biomarker genes for patient stratification and survival analysis using mutation and expression data.
Bioinform., 2019

Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine.
Database J. Biol. Databases Curation, 2019

DeepFunNet: Deep Learning for Gene Functional Similarity Network Construction.
Proceedings of the IEEE International Conference on Big Data and Smart Computing, 2019

Evaluation of Five Sentence Similarity Models on Electronic Medical Records.
Proceedings of the 10th ACM International Conference on Bioinformatics, 2019

Editorial for Selected Papers of a Joint Conferences, Genome Informatics Workshop/International Conference on Bioinformatics (GIW/InCoB) 2015.
IEEE ACM Trans. Comput. Biol. Bioinform., 2018

The Circle Problem of Gauss and the Divisor Problem of Dirichlet - Still Unsolved.
Am. Math. Mon., 2018

ezTag: tagging biomedical concepts via interactive learning.
Nucleic Acids Res., 2018

Cloud-BS: A MapReduce-based bisulfite sequencing aligner on cloud.
J. Bioinform. Comput. Biol., 2018

The mRNA and miRNA transcriptomic landscape of Panax ginseng under the high ambient temperature.
BMC Syst. Biol., 2018

Discovering themes in biomedical literature using a projection-based algorithm.
BMC Bioinform., 2018

BRCA-Pathway: a structural integration and visualization system of TCGA breast cancer data on KEGG pathways.
BMC Bioinform., 2018

DeepFam: deep learning based alignment-free method for protein family modeling and prediction.
Bioinform., 2018

A Fast Deep Learning Model for Textual Relevance in Biomedical Information Retrieval.
Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018

Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Identifying stress-related genes and predicting stress types in Arabidopsis using logical correlation layer and CMCL loss through time-series data.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2018

HTRgene: Integrating Multiple Heterogeneous Time-series Data to Investigate Cold and Heat Stress Response Signaling Genes in Arabidopsis.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2018

Sentence Similarity Measures Revisited: Ranking Sentences in PubMed Documents.
Proceedings of the 2018 ACM International Conference on Bioinformatics, 2018

Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents.
J. Biomed. Informatics, 2017

Iterative segmented least square method for functional microRNA-mRNA module discovery in breast cancer.
Int. J. Data Min. Bioinform., 2017

PINTnet: construction of condition-specific pathway interaction network by computing shortest paths on weighted PPI.
BMC Syst. Biol., 2017

TimesVector: a vectorized clustering approach to the analysis of time series transcriptome data from multiple phenotypes.
Bioinform., 2017

The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions.
Database J. Biol. Databases Curation, 2017

Deep Learning for Biomedical Information Retrieval: Learning Textual Relevance from Click Logs.
Proceedings of the BioNLP 2017, Vancouver, Canada, August 4, 2017, 2017

BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations.
Proceedings of the BioNLP 2017, Vancouver, Canada, August 4, 2017, 2017

Flow maximization analysis of cell cycle pathway activation status in breast cancer subtypes.
Proceedings of the 2017 IEEE International Conference on Big Data and Smart Computing, 2017

Integration of heterogeneous time series gene expression data by clustering on time dimension.
Proceedings of the 2017 IEEE International Conference on Big Data and Smart Computing, 2017

Information theoretic sub-network mining characterizes breast cancer subtypes in terms of cancer core mechanisms.
J. Bioinform. Comput. Biol., 2016

Bridging the Gap: a Semantic Similarity Measure between Queries and Documents.
CoRR, 2016

Clustering and evolutionary analysis of small RNAs identify regulatory siRNA clusters induced under drought stress in rice.
BMC Syst. Biol., 2016

Subtype-specific CpG island shore methylation and mutation patterns in 30 breast cancer cell lines.
BMC Syst. Biol., 2016

RDDpred: a condition-specific RNA-editing prediction model from RNA-seq data.
BMC Genom., 2016

Prioritizing biological pathways by recognizing context in time-series gene expression data.
BMC Bioinform., 2016

Meshable: searching PubMed abstracts by utilizing MeSH and MeSH-derived topical terms.
Bioinform., 2016

Influence maximization in time bounded network identifies transcription factors regulating perturbed pathways.
Bioinform., 2016

BioC viewer: a web-based tool for displaying and merging annotations in BioC.
Database J. Biol. Databases Curation, 2016

BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID.
Database J. Biol. Databases Curation, 2016

BioCconvert: A Conversion Tool Between BioC and PubAnnotation.
Proceedings of the Joint International Conference on Biological Ontology and BioCreative, 2016

PubTermVariants: biomedical term variants and their use for PubMed search.
Proceedings of the 15th Workshop on Biomedical Natural Language Processing, 2016

Networks and models for the integrated analysis of multi omics data.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016

Extracting drug-drug interactions from literature using a rich feature-based linear kernel approach.
J. Biomed. Informatics, 2015

Weighted divisor sums and Bessel function series, V.
J. Approx. Theory, 2015

Identifying named entities from PubMed® for enriching semantic categories.
BMC Bioinform., 2015

BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.
Bioinform., 2015

Summarizing Topical Contents from PubMed Documents Using a Thematic Analysis.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Partitions with part difference conditions and Bressoud's conjecture.
J. Comb. Theory A, 2014

Author name disambiguation for PubMed.
J. Assoc. Inf. Sci. Technol., 2014

piClust: A density based piRNA clustering algorithm.
Comput. Biol. Chem., 2014

Retro: concept-based clustering of biomedical topical sets.
Bioinform., 2014

Assisting manual literature curation for protein-protein interactions using BioQRator.
Database J. Biol. Databases Curation, 2014

An algorithm for identifying differentially expressed genes in multiclass RNA-seq samples.
Proceedings of the International Conference on Big Data and Smart Computing, BIGCOMP 2014, 2014

The Rogers-Ramanujan-Gordon identities, the generalized Göllnitz-Gordon identities, and parity questions.
J. Comb. Theory A, 2013

Genome-Wide Analysis and Modeling of DNA methylation susceptibility in 30 Breast Cancer Cell Lines by using CPG Flanking Sequences.
J. Bioinform. Comput. Biol., 2013

Designing discriminative spatial filter vectors in motor imagery brain-computer interface.
Int. J. Imaging Syst. Technol., 2013

Bio and health informatics meets cloud : BioVLab as an example.
Health Inf. Sci. Syst., 2013

Integrated profiling of three dimensional cell culture models and 3D microscopy.
Bioinform., 2013

Towards simultaneous clustering and motif-modeling for a large number of protein family.
Proceedings of the 2013 IEEE International Conference on Bioinformatics and Biomedicine, 2013

J. Bioinform. Comput. Biol., 2012

A novel k-mer mixture logistic regression for methylation susceptibility modeling of CpG dinucleotides in human gene promoters.
BMC Bioinform., 2012

Thematic clustering of text documents using an EM-based approach.
J. Biomed. Semant., 2012

GeneclusterViz: a tool for conserved gene cluster visualization, exploration and analysis.
Bioinform., 2012

PIE the search: searching PubMed literature for protein interaction information.
Bioinform., 2012

Prioritizing PubMed articles for the Comparative Toxicogenomic Database utilizing semantic information.
Database J. Biol. Databases Curation, 2012

Discriminative spatial pattern vectors selection for motor imagery classification.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2012

Classifying Gene Sentences in Biomedical Literature by Combining High-Precision Gene Identifiers.
Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, 2012

Sequence-Based enzyme catalytic Domain Prediction Using Clustering and Aggregated Mutual Information Content.
J. Bioinform. Comput. Biol., 2011

Göllnitz-Gordon identities and parity questions.
Eur. J. Comb., 2011

The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text.
BMC Bioinform., 2011

Classifying protein-protein interaction articles using word and syntactic features.
BMC Bioinform., 2011

Transitioning BioVLab cloud workbench to a science gateway.
Proceedings of the 2011 TeraGrid Conference - Extreme Digital Discovery, 2011

An EM Clustering Algorithm which Produces a Dual Representation.
Proceedings of the 10th International Conference on Machine Learning and Applications and Workshops, 2011

BioVLAB-MMIA: A Reconfigurable Cloud Computing Environment for microRNA and mRNA Integrated Analysis.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2011

Annotation confidence score for genome annotation: a genome comparison approach.
Bioinform., 2010

Data mining for the study of disease genes and proteins.
Artif. Intell. Medicine, 2010

Gene cluster profile vectors: A novel method to infer functional coupling using both gene proximity and co-occurrence profiles.
Proceedings of the 2010 IEEE International Conference on Bioinformatics and Biomedicine, 2010

EGGSlicer: predicting biologically meaningful gene sets from gene clusters using gene ontology information.
Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology, 2010

MicroRNA and mRNA integrated analysis (MMIA): a web tool for examining biological functions of microRNA expression.
Nucleic Acids Res., 2009

Bijective proofs of partition identities arising from modular equations.
J. Comb. Theory A, 2009

Ensembled support vector machines for human papillomavirus risk type prediction from protein secondary structures.
Comput. Biol. Medicine, 2009

Computational analysis of microRNA profiles and their target genes suggests significant involvement in breast cancer antiestrogen resistance.
Bioinform., 2009

Experience report: issues in comparing gene function annotation in text.
Proceedings of the 27th Annual International Conference on Design of Communication, 2009

Evolutionary hypernetwork classifiers for protein-protein interaction sentence filtering.
Proceedings of the Genetic and Evolutionary Computation Conference, 2009

Evolving hypernetwork models of binary time series for forecasting price movements on stock markets.
Proceedings of the IEEE Congress on Evolutionary Computation, 2009

PIE: an online prediction system for protein-protein interactions from text.
Nucleic Acids Res., 2008

A gene pattern mining algorithm using interchangeable gene sets for prokaryotes.
BMC Bioinform., 2008

ComPath: comparative enzyme analysis and annotation in pathway/subsystem contexts.
BMC Bioinform., 2008

Enriched transcription factor binding sites in hypermethylated gene promoters in drug resistant cancer cells.
Bioinform., 2008

A machine-learning approach to combined evidence validation of genome assemblies.
Bioinform., 2008

Predicting DNA Methylation Susceptibility Using CpG Flanking Sequences.
Proceedings of the Biocomputing 2008, 2008

BioVLAB-Microarray: Microarray Data Analysis in Virtual Environment.
Proceedings of the Fourth International Conference on e-Science, 2008

A Study of Residue Correlation within Protein Sequences and Its Application to Sequence Classification.
EURASIP J. Bioinform. Syst. Biol., 2007

dPattern: transcription factor binding site (TFBS) discovery in human genome using a discriminative pattern analysis.
Bioinform., 2007

CLASSEQ: Classification of Sequences via Comparative Analysis of Multiple Genomes.
Proceedings of the Sixth International Conference on Machine Learning and Applications, 2007

Evolving hypernetwork classifiers for microRNA expression profile analysis.
Proceedings of the IEEE Congress on Evolutionary Computation, 2007

EGGS: Extraction of Gene Clusters Using Genome Context Based Sequence Matching Techniques.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2007

Finding Cancer-Related Gene Combinations Using a Molecular Evolutionary Algorithm.
Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, 2007

Bioinformatics: Mining the Massive Data from High Throughput Genomics Experiments.
Proceedings of the Analysis of Biological Data: A Soft Computing Approach, 2007

BAG: a graph theoretic sequence clustering algorithm.
Int. J. Data Min. Bioinform., 2006

Advanced Signal Processing Techniques for Bioinformatics.
EURASIP J. Adv. Signal Process., 2006

REFINEMENT: A search framework for the identification of interferon-responsive elements in DNA sequences - a case study with ISRE and GAS.
Comput. Biol. Chem., 2006

ARCS: an aggregated related column scoring scheme for aligned sequences.
Bioinform., 2006

A mixture model-based discriminate analysis for identifying ordered transcription factor binding site pairs in gene promoters directly regulated by estrogen receptor-alpha.
Bioinform., 2006

COMPAM : visualization of combining pairwise alignments for multiple genomes.
Bioinform., 2006

An Approximate de Bruijn Graph Approach to Multiple Local Alignment and Motif Discovery in Protein Sequences.
Proceedings of the Data Mining and Bioinformatics, First International Workshop, 2006

A Tree Kernel-Based Method for Protein-Protein Interaction Mining from Biomedical Literature.
Proceedings of the Knowledge Discovery in Life Science Literature, 2006

Prediction of the Human Papillomavirus Risk Types Using Gap-Spectrum Kernels.
Proceedings of the Advances in Neural Networks - ISNN 2006, Third International Symposium on Neural Networks, Chengdu, China, May 28, 2006

Genome Data Type: a Vehicle to Deliver a Genome Comparison System on the Web.
Proceedings of the Workshops Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Human Papillomavirus Risk Type Classification from Protein Sequences Using Support Vector Machines.
Proceedings of the Applications of Evolutionary Computing, 2006

Text Classifiers Evolved on a Simulated DNA Computer.
Proceedings of the IEEE International Conference on Evolutionary Computation, 2006

GAME: A simple and efficient whole genome alignment method using maximal exact match filtering.
Comput. Biol. Chem., 2005

PLATCOM: a Platform for Computational Comparative Genomics.
Bioinform., 2005

Motif discovery for proteins using subsequence clustering.
Proceedings of the 5th international workshop on Bioinformatics, 2005

An Application of Support Vector Machines for Customer Churn Analysis: Credit Card Case.
Proceedings of the Advances in Natural Computation, First International Conference, 2005

PLATCOM: Current Status and Plan for the Next Stages.
Proceedings of the Data Integration in the Life Sciences, Second International Workshop, 2005

Cluster Utility: A New Metric for Clustering Biological Sequences.
Proceedings of the Fourth International IEEE Computer Society Computational Systems Bioinformatics Conference Workshops & Poster Abstracts, 2005

Gene Teams with Relaxed Proximity Constraint.
Proceedings of the Fourth International IEEE Computer Society Computational Systems Bioinformatics Conference, 2005

PLATCOM: a Platform for Computational Comparative Genomics on the Web.
Proceedings of the Fourth International IEEE Computer Society Computational Systems Bioinformatics Conference Workshops & Poster Abstracts, 2005

Guiding motif discovery by iterative pattern refinement.
Proceedings of the 2004 ACM Symposium on Applied Computing (SAC), 2004

Multiple Genome Alignment by Clustering Pairwise Matches.
Proceedings of the Comparative Genomics, 2004

Multi-objective Evolutionary Probe Design Based on Thermodynamic Criteria for HPV Detection.
Proceedings of the PRICAI 2004: Trends in Artificial Intelligence, 2004

Genetic Mining of HTML Structures for Effective Web-Document Retrieval.
Appl. Intell., 2003

Motif Discovery from Large Number of Sequences: A Case Study with Disease Resistance Genes in Arabidopsis thaliana.
Proceedings of the International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences, 2003

Graph Theoretic Sequence Clustering Algorithms and Their Applications to Genome Comparison.
Proceedings of the Computational Biology and Genome Informatics, 2003

A probabilistic approach to sequence assembly validation.
Proceedings of the ACM SIGKDD Workshop on Data Mining in Bioinformatics (BIOKDD 2001), 2001

Evolutionary learning of Web-document structure for information retrieval.
Proceedings of the 2001 Congress on Evolutionary Computation, 2001

Web-Document Retrieval by Genetic Learning of Importance Factors for HTML Tags.
Proceedings of the International Workshop on Text and Web Mining, 2000

A New String-Pattern Matching Algorithm Using Partitioning and Hashing Efficiently.
ACM J. Exp. Algorithmics, 1999

AMASS: A Structured Pattern Matching Approach to Shotgun Sequence Assembly.
J. Comput. Biol., 1999

SCAI TREC-8 Experiments.
Proceedings of The Eighth Text REtrieval Conference, 1999

ModGen: Theorem Proving by Model Generation.
Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA, USA, July 31, 1994