Xin Dong

ORCID: 0009-0001-2049-2458

Affiliations:
  • Amazon
  • Google


According to our database, Xin Dong authored at least 139 papers between 2003 and 2024.

Awards

ACM Fellow

ACM Fellow 2023, "For contributions to knowledge graph construction and data integration".

Bibliography

2024
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM.
CoRR, 2024

Large Language Models as Zero-shot Dialogue State Tracker through Function Calling.
CoRR, 2024

Lumos: Empowering Multimodal LLMs with Scene Text Recognition.
CoRR, 2024

The Journey to A Knowledgeable Assistant with Retrieval-Augmented Generation (RAG).
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024

2023
Editorial: Special Issue for Selected Papers of VLDB 2021.
VLDB J., November, 2023

Generations of Knowledge Graphs: The Crazy Ideas and the Business Impact.
Proc. VLDB Endow., 2023

Head-to-Tail: How Knowledgeable are Large Language Models (LLM)? A.K.A. Will LLMs Replace Knowledge Graphs?
CoRR, 2023


Personal Data for Personal Use: Vision or Reality?
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

Towards Next-Generation Intelligent Assistants Leveraging LLM Techniques.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Next-Generation Intelligent Assistants for AR/VR Devices.
Proceedings of the Extraction et Gestion des Connaissances, 2023

Tab-Cleaner: Weakly Supervised Tabular Data Cleaning via Pre-training for E-commerce Catalog.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: Industry Track, 2023

2022
ACM WSDM 2022 report.
SIGWEB Newsl., 2022

PGE: Robust Product Graph Embedding Learning for Error Detection.
Proc. VLDB Endow., 2022

Knowledge Graphs: Introduction, History, and Perspectives.
AI Mag., 2022

OA-Mine: Open-World Attribute Mining for E-Commerce Products with Weak Supervision.
Proceedings of the WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25, 2022

Publication Culture and Review Processes in the Data Management Community: An Open Discussion.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

2021
VLDB 2021: Designing a Hybrid Conference.
SIGMOD Rec., 2021

Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation.
Proc. VLDB Endow., 2021

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases.
Found. Trends Databases, 2021

Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks.
Proceedings of the WWW '21: The Web Conference 2021, 2021

TCN: Table Convolutional Network for Web Table Interpretation.
Proceedings of the WWW '21: The Web Conference 2021, 2021

The 1st International Workshop on Machine Reasoning: International Machine Reasoning Conference (MRC 2021).
Proceedings of the WSDM '21, 2021

EX3: Explainable Attribute-aware Item-set Recommendations.
Proceedings of the RecSys '21: Fifteenth ACM Conference on Recommender Systems, Amsterdam, The Netherlands, 27 September 2021

All You Need to Know to Build a Product Knowledge Graph.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

PAM: Understanding Product Images in Cross Product Category Attribute Extraction.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

End-to-End Conversational Search for Online Shopping with Utterance Transfer.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

AdaTag: Multi-Attribute Value Extraction from Product Profiles with Adaptive Decoding.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

CoRI: Collective Relation Integration with Data Augmentation for Open Information Extraction.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Front Matter.
Proc. VLDB Endow., 2020

Winds from Seattle: Database Research Directions.
Proc. VLDB Endow., 2020

Collective Multi-type Entity Alignment Between Knowledge Graphs.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

AutoBlock: A Hands-off Blocking Framework for Entity Matching.
Proceedings of the WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining, 2020

Web-scale Knowledge Collection.
Proceedings of the WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining, 2020

Automatic Validation of Textual Attribute Values in E-commerce Catalog by Learning with Limited Labeled Data.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

MultiImport: Inferring Node Importance in a Knowledge Graph from Multiple Input Signals.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Octet: Online Catalog Taxonomy Enrichment with Self-Supervision.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

CorDEL: A Contrastive Deep Learning Approach for Entity Linkage.
Proceedings of the 20th IEEE International Conference on Data Mining, 2020

J-Recs: Principled and Scalable Recommendation Justification.
Proceedings of the 20th IEEE International Conference on Data Mining, 2020

P-Companion: A Principled Framework for Diversified Complementary Product Recommendation.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Ceres: Harvesting Knowledge from the Semi-structured Web.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Contrastive Entity Linkage: Mining Variational Attributes from Large Catalogs for Entity Linkage.
Proceedings of the Conference on Automated Knowledge Base Construction, 2020

ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
The Seattle Report on Database Research.
SIGMOD Rec., 2019

Efficient Knowledge Graph Accuracy Evaluation.
Proc. VLDB Endow., 2019

OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

OpenCeres: When Open Information Extraction Meets the Semi-Structured Web.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Estimating Node Importance in Knowledge Graphs Using Graph Neural Networks.
Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019

MIDAS: Finding the Right Web Sources to Fill Knowledge Gaps.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Building a Broad Knowledge Graph for Products.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

2018
XML Indexing.
Encyclopedia of Database Systems, Second Edition, 2018

Entity Resolution.
Encyclopedia of Database Systems, Second Edition, 2018

Data Fusion.
Encyclopedia of Database Systems, Second Edition, 2018

Managing Data Integration Uncertainty.
Encyclopedia of Database Systems, Second Edition, 2018

Mining Summaries for Knowledge Graph Search.
IEEE Trans. Knowl. Data Eng., 2018

CERES: Distantly Supervised Relation Extraction from the Semi-Structured Web.
Proc. VLDB Endow., 2018

Data Integration and Machine Learning: A Natural Synergy.
Proc. VLDB Endow., 2018

Big Data Integration for Product Specifications.
IEEE Data Eng. Bull., 2018

Lessons Learned and Research Agenda for Big Data Integration of Product Specifications.
Proceedings of the 26th Italian Symposium on Advanced Database Systems, 2018

OpenTag: Open Attribute Value Extraction from Product Profiles.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Challenges and Innovations in Building a Product Knowledge Graph.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Challenges and innovations in building a product knowledge graph: extended abstract.
Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), 2018

LinkNBed: Multi-Graph Representation Learning with Entity Linkage.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Data Quality: The Role of Empiricism.
SIGMOD Rec., 2017

Knowledge Verification for Long-Tail Verticals.
Proc. VLDB Endow., 2017

Discovering Multiple Truths with a Hybrid Model.
CoRR, 2017

2016
A Time Machine for Information: Looking Back to Look Forward.
SIGMOD Rec., 2016

Leave No Valuable Data Behind: The Crazy Ideas and the Business.
Proc. VLDB Endow., 2016

Editorial: Special Issue on Web Data Quality.
ACM J. Data Inf. Qual., 2016

Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources.
IEEE Data Eng. Bull., 2016

SourceSight: Enabling Effective Source Selection.
Proceedings of the 2016 International Conference on Management of Data, 2016

Mining Summaries for Knowledge Graph Search.
Proceedings of the IEEE 16th International Conference on Data Mining, 2016

2015
Big Data Integration.
Synthesis Lectures on Data Management, Morgan & Claypool Publishers, ISBN: 978-3-031-01853-4, 2015

Error Diagnosis and Data Profiling with Data X-Ray.
Proc. VLDB Endow., 2015

DEXTER: Large-Scale Discovery and Extraction of Product Specifications on the Web.
Proc. VLDB Endow., 2015

Keys for Graphs.
Proc. VLDB Endow., 2015

A Time Machine for Information: Looking Back to Look Forward.
Proc. VLDB Endow., 2015

Robust Group Linkage.
Proceedings of the 24th International Conference on World Wide Web, 2015

The elephant in the room: getting value from Big Data.
Proceedings of the 18th International Workshop on Web and Databases, 2015

Data X-Ray: A Diagnostic Tool for Data Errors.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Knowledge Curation and Knowledge Fusion: Challenges, Models and Applications.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

TimeMachine: Timeline Generation for Knowledge-Base Entities.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Scaling up copy detection.
Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration.
Proceedings of the Seventh Biennial Conference on Innovative Data Systems Research, 2015

2014
Incremental Record Linkage.
Proc. VLDB Endow., 2014

From Data Fusion to Knowledge Fusion.
Proc. VLDB Endow., 2014

Characterizing and selecting fresh data sources.
Proceedings of the International Conference on Management of Data, 2014

Fusing data with correlations.
Proceedings of the International Conference on Management of Data, 2014

Knowledge vault: a web-scale approach to probabilistic knowledge fusion.
Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014

2013
Online Ordering of Overlapping Data Sources.
Proc. VLDB Endow., 2013

Big Data Integration.
Proc. VLDB Endow., 2013

Compact explanation of data fusion decisions.
Proceedings of the 22nd International World Wide Web Conference, 2013

Data Fusion: Resolving Conflicts from Multiple Sources.
Proceedings of the Web-Age Information Management - 14th International Conference, 2013

SIGMOD 2013 new researcher symposium.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Data Fusion: Resolving Conflicts from Multiple Sources.
Handbook of Data Quality, Research and Practice, 2013

2012
10th international workshop on quality in databases: QDB 2012.
SIGMOD Rec., 2012

Chronos: Facilitating History Discovery by Linking Temporal Records.
Proc. VLDB Endow., 2012

Truth Finding on the Deep Web: Is the Problem Solved?
Proc. VLDB Endow., 2012

Less is More: Selecting Sources Wisely for Integration.
Proc. VLDB Endow., 2012

Detecting Clones, Copying and Reuse on the Web.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Detecting Clones, Copying and Reuse on the Web (DASFAA 2012 Tutorial).
Proceedings of the Database Systems for Advanced Applications, 2012

2011
Online Data Fusion.
Proc. VLDB Endow., 2011

Linking Temporal Records.
Proc. VLDB Endow., 2011

Letter from the Special Issue Editors.
IEEE Data Eng. Bull., 2011

Large-scale copy detection.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

We challenge you to certify your updates.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Data integration with dependent sources.
Proceedings of the EDBT 2011, 2011

Solomon: seeking the truth via copying detection.
Proceedings of the 2nd International Workshop on Business intelligencE and the WEB, 2011

Uncertainty in Data Integration and Dataspace Support Platforms.
Schema Matching and Mapping, 2011

2010
13th international workshop on the web and databases: WebDB 2010.
SIGMOD Rec., 2010

Record Linkage with Uniqueness Constraints and Erroneous Values.
Proc. VLDB Endow., 2010

SOLOMON: Seeking the Truth Via Copying Detection.
Proc. VLDB Endow., 2010

Global Detection of Complex Copying Relationships Between Sources.
Proc. VLDB Endow., 2010

2009
XML Indexing.
Encyclopedia of Database Systems, 2009

Data integration with uncertainty.
VLDB J., 2009

Data fusion - Resolving Data Conflicts for Integration.
Proc. VLDB Endow., 2009

Truth Discovery and Copying Detection in a Dynamic World.
Proc. VLDB Endow., 2009

Integrating Conflicting Data: The Role of Source Dependence.
Proc. VLDB Endow., 2009

Functional Dependency Generation and Applications in Pay-As-You-Go Data Integration Systems.
Proceedings of the 12th International Workshop on the Web and Databases, 2009

Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence.
Proceedings of the Fourth Biennial Conference on Innovative Data Systems Research, 2009

Data Modeling in Dataspace Support Platforms.
Conceptual Modeling: Foundations and Applications, 2009

2008
Bootstrapping pay-as-you-go data integration systems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

2007
Visualization of Heterogeneous Data.
IEEE Trans. Vis. Comput. Graph., 2007

Indexing dataspaces.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Web-Scale Data Integration: You can afford to Pay as You Go.
Proceedings of the Third Biennial Conference on Innovative Data Systems Research, 2007

2006
Structured Data Meets the Web: A Few Observations.
IEEE Data Eng. Bull., 2006

Answering Structured Queries on Unstructured Data.
Proceedings of the Ninth International Workshop on the Web and Databases, 2006

2005
Malleable Schemas: A Preliminary Report.
Proceedings of the Eight International Workshop on the Web & Databases (WebDB 2005), 2005

Reference Reconciliation in Complex Information Spaces.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Personal information management with SEMEX.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

A Platform for Personal Information Management and Integration.
Proceedings of the Second Biennial Conference on Innovative Data Systems Research, 2005

2004
Mining structures for semantics.
SIGKDD Explor., 2004

Containment of Nested XML Queries.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

Similarity Search for Web Services.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

2003
The Piazza peer data management project.
SIGMOD Rec., 2003

ROLEX: Relational On-Line Exchange with XML.
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003

