Kyuseok Shim

Orcid: 0000-0001-8818-0963

According to our database, Kyuseok Shim authored at least 100 papers between 1993 and 2023.

Awards

ACM Fellow 2013, "For contributions to scalable data mining and query processing."

Bibliography

2023
From Large Language Models to Databases and Back: A Discussion on Research and Education.
SIGMOD Rec., September, 2023

Collecting Geospatial Data Under Local Differential Privacy With Improving Frequency Estimation.
IEEE Trans. Knowl. Data Eng., July, 2023

THUNDER: Named Entity Recognition Using a Teacher-Student Model with Dual Classifiers for Strong and Weak Supervisions.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

2022
Cardinality Estimation of Approximate Substring Queries using Deep Learning.
Proc. VLDB Endow., 2022

T5 Encoder Based Acronym Disambiguation with Weak Supervision.
Proceedings of the Workshop on Scientific Document Understanding co-located with 36th AAAI Conference on Artificial Intelligence, 2022

2021
TIDY: Publishing a Time Interval Dataset With Differential Privacy.
IEEE Trans. Knowl. Data Eng., 2021

Substring Similarity Search with Synonyms.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Collecting Geospatial Data with Local Differential Privacy for Personalized Services.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
TIDY: Publishing a Time Interval Dataset with Differential Privacy (Extended abstract).
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

String Joins with Synonyms.
Proceedings of the Database Systems for Advanced Applications, 2020

Dual Supervision Framework for Relation Extraction with Distant Supervision and Human Annotation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

T-REX: A Topic-Aware Relation Extraction Model.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

2019
Efficient two-dimensional Haar+ synopsis construction for the maximum absolute error measure.
VLDB J., 2019

Efficient Aggregation Processing in the Presence of Duplicately Detected Objects in WSNs.
J. Sensors, 2019

Crowdsourced Truth Discovery in the Presence of Hierarchies for Knowledge Fusion.
Proceedings of the Advances in Database Technology, 2019

2018
Preface to the special issue on advances in Spatio-temporal data analysis and management.
GeoInformatica, 2018

2017
Preface to the special issue on big data search and mining.
World Wide Web, 2017

Efficient Processing of Skyline Queries Using MapReduce.
IEEE Trans. Knowl. Data Eng., 2017

Special Section on the International Conference on Data Engineering 2015.
IEEE Trans. Knowl. Data Eng., 2017

Efficient Haar+ Synopsis Construction for the Maximum Absolute Error Measure.
Proc. VLDB Endow., 2017

Integration of graphs from different data sources using crowdsourcing.
Inf. Sci., 2017

Latent ranking analysis using pairwise comparisons in crowdsourcing platforms.
Inf. Syst., 2017

2016
Parallel computation of k-nearest neighbor joins using MapReduce.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
Processing of Probabilistic Skyline Queries Using MapReduce.
Proc. VLDB Endow., 2015

Aggregate query processing in the presence of duplicates in wireless sensor networks.
Inf. Sci., 2015

Supporting set-valued joins in NoSQL using MapReduce.
Inf. Syst., 2015

2014
TWINS: Efficient time-windowed in-network joins for sensor networks.
Inf. Sci., 2014

DBCURE-MR: An efficient density-based clustering algorithm for large data using MapReduce.
Inf. Syst., 2014

TWILITE: A recommendation system for Twitter using a probabilistic model based on latent Dirichlet allocation.
Inf. Syst., 2014

Latent Ranking Analysis Using Pairwise Comparisons.
Proceedings of the 2014 IEEE International Conference on Data Mining, 2014

2013
Parallel Computation of Skyline and Reverse Skyline Queries Using MapReduce.
Proc. VLDB Endow., 2013

Efficient processing of substring match queries with inverted variable-length gram indexes.
Inf. Sci., 2013

DIGTOBI: a recommendation system for Digg articles using probabilistic modeling.
Proceedings of the 22nd International World Wide Web Conference, 2013

Efficient top-k algorithms for approximate substring matching.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

A remote cardiac monitoring system for preventive care.
Proceedings of the IEEE International Conference on Consumer Electronics, 2013

2012
MapReduce Algorithms for Big Data Analysis.
Proc. VLDB Endow., 2012

Parallel Top-K Similarity Join Algorithms Using MapReduce.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Data Management Challenges and Opportunities in Cloud Computing.
Proceedings of the Database Systems for Advanced Applications, 2012

HotDigg: Finding Recent Hot Topics from Digg.
Proceedings of the Database Systems for Advanced Applications, 2012

A breast tumor classification method based on ultrasound BI-RADS data mining.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
TEXT: Automatic Template Extraction from Heterogeneous Web Pages.
IEEE Trans. Knowl. Data Eng., 2011

Similarity Join Size Estimation using Locality Sensitive Hashing.
Proc. VLDB Endow., 2011

CATCH: A detecting algorithm for coalition attacks of hit inflation in internet advertising.
Inf. Syst., 2011

TWITOBI: A Recommendation System for Twitter Using Probabilistic Modeling.
Proceedings of the 11th IEEE International Conference on Data Mining, 2011

2010
Approximate algorithms with generalizing attribute values for k-anonymity.
Inf. Syst., 2010

Efficient processing of substring match queries with inverted q-gram indexes.
Proceedings of the 26th International Conference on Data Engineering, 2010

2009
Power-Law Based Estimation of Set Similarity Join Size.
Proc. VLDB Endow., 2009

FAST: Flash-aware external sorting for mobile database systems.
J. Syst. Softw., 2009

Approximate substring selectivity estimation.
Proceedings of the EDBT 2009, 2009

2008
Wavelet synopsis for hierarchical range queries with workloads.
VLDB J., 2008

2007
A Note on Linear Time Algorithms for Maximum Error Histograms.
IEEE Trans. Knowl. Data Eng., 2007

SQUIRE: Sequential pattern mining with quantities.
J. Syst. Softw., 2007

Extending Q-Grams to Estimate Selectivity of String Matching with Low Edit Distance.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Approximate algorithms for K-anonymity.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

2006
Approximation and streaming algorithms for histogram construction problems.
ACM Trans. Database Syst., 2006

Erratum to: "An adaptive path index for XML data using the query workload": [Information Systems 30(6) (2005) 467-487].
Inf. Syst., 2006

2005
Storing XML (with XSD) in SQL Databases: Interplay of Logical and Physical Designs.
IEEE Trans. Knowl. Data Eng., 2005

An adaptive path index for XML data using the query workload.
Inf. Syst., 2005

Offline and Data Stream Algorithms for Efficient Computation of Synopsis Structures.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

2004
WALRUS: A Similarity Retrieval Algorithm for Image Databases.
IEEE Trans. Knowl. Data Eng., 2004

Recent Advances in Histogram Construction Algorithms.
Proceedings of the Advances in Web-Age Information Management: 5th International Conference, 2004

REHIST: Relative Error Histogram Construction Algorithms.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

XWAVE: Approximate Extended Wavelets for Streaming Data.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

2003
Mining Optimized Gain Rules for Numeric Attributes.
IEEE Trans. Knowl. Data Eng., 2003

DTD Inference from XML Documents: The XTRACT Approach.
IEEE Data Eng. Bull., 2003

Building Decision Trees with Constraints.
Data Min. Knowl. Discov., 2003

XTRACT: Learning Document Type Descriptors from XML Document Collections.
Data Min. Knowl. Discov., 2003

Storage and Retrieval of XML Data using Relational Databases.
Proceedings of the 19th International Conference on Data Engineering, 2003

Techniques for Clustering Massive Data Sets.
Proceedings of the Clustering and Information Retrieval, 2003

2002
High-Dimensional Similarity Joins.
IEEE Trans. Knowl. Data Eng., 2002

Mining Optimized Association Rules with Categorical and Numeric Attributes.
IEEE Trans. Knowl. Data Eng., 2002

Mining Sequential Patterns with Regular Expression Constraints.
IEEE Trans. Knowl. Data Eng., 2002

Reminiscences on Influential Papers.
SIGMOD Rec., 2002

APEX: an adaptive path index for XML data.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

2001
Approximate query processing using wavelets.
VLDB J., 2001

Mining optimized support rules for numeric attributes.
Inf. Syst., 2001

Cure: An Efficient Clustering Algorithm for Large Databases.
Inf. Syst., 2001

Data-streams and histograms.
Proceedings on 33rd Annual ACM Symposium on Theory of Computing, 2001

2000
Workshop Report: 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.
SIGKDD Explor., 2000

Editorial.
SIGKDD Explor., 2000

ROCK: A Robust Clustering Algorithm for Categorical Attributes.
Inf. Syst., 2000

PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning.
Data Min. Knowl. Discov., 2000

Efficient Algorithms for Mining Outliers from Large Data Sets.
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 2000

XTRACT: A System for Extracting Document Type Descriptors from XML Documents.
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 2000

Efficient algorithms for constructing decision trees with constraints.
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000

1999
Optimization of Queries with User-Defined Predicates.
ACM Trans. Database Syst., 1999

Data Mining and the Web: Past, Present and Future.
Proceedings of the ACM CIKM'99 2nd Workshop on Web Information and Data Management (WIDM'99), 1999

SPIRIT: Sequential Pattern Mining with Regular Expression Constraints.
Proceedings of the VLDB'99, 1999

Of Crawlers, Portals, Mice and Men: Is there more to Mining the Web? (Panel).
Proceedings of the SIGMOD 1999, 1999

Scalable Algorithms for Mining Large Databases.
Proceedings of the Tutorial Notes for ACM SIGKDD 1999 International Conference on Knowledge Discovery and Data Mining, 1999

1998
A Constraint-Based Spatial Extension to SQL.
Proceedings of the ACM-GIS '98, 1998

1997
Parametric Query Optimization.
VLDB J., 1997

1996
Developing Tightly-Coupled Data Mining Applications on a Relational Database System.
Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 1996

Optimizing Queries with Aggregate Views.
Proceedings of the Advances in Database Technology, 1996

1995
An Overview of Cost-based Optimization of Queries with Aggregates.
IEEE Data Eng. Bull., 1995

Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases.
Proceedings of the VLDB'95, 1995

Optimizing Queries with Materialized Views.
Proceedings of the Eleventh International Conference on Data Engineering, 1995

1994
Improvements on a Heuristic Algorithm for Multiple-Query Optimization.
Data Knowl. Eng., 1994

Including Group-By in Query Optimization.
Proceedings of the VLDB'94, 1994

1993
Query Optimization in the Presence of Foreign Functions.
Proceedings of the 19th International Conference on Very Large Data Bases, 1993

