Paolo Papotti

ORCID: 0000-0003-0651-4128

Affiliations:
  • EURECOM, Campus SophiaTech, France
  • Arizona State University, USA (former)
  • Roma Tre University, Rome, Italy (former)


According to our database, Paolo Papotti authored at least 143 papers between 2004 and 2024.

Bibliography

2024
Retrieve, Merge, Predict: Augmenting Tables with Data Lakes.
CoRR, 2024

Similarity Measures For Incomplete Database Instances.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

Relational Data Imputation with Graph Neural Networks.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

Querying Large Language Models with SQL.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

2023
Generation of Training Examples for Tabular Natural Language Inference.
Proc. ACM Manag. Data, December, 2023

Introduction to the Special Issue on Truth and Trust Online.
ACM J. Data Inf. Qual., March, 2023

Exploratory Training: When Annotators Learn About Data.
Proc. ACM Manag. Data, 2023

Variable Selection in Maximum Mean Discrepancy for Interpretable Distribution Comparison.
CoRR, 2023

The Community Notes Observatory: Can Crowdsourced Fact-Checking be Trusted in Practice?
Companion Proceedings of the ACM Web Conference 2023, 2023

Analyzing COVID-Related Social Discourse on Twitter using Emotion, Sentiment, Political Bias, Stance, Veracity and Conspiracy Theories.
Companion Proceedings of the ACM Web Conference 2023, 2023

Integrity 2023: Integrity in Social Networks and Media.
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023

Models and Practice of Neural Table Representations.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

Attribute Ambiguity Discovery: A Deep Learning Approach via Unsupervised Learning.
Proceedings of the 31st Symposium of Advanced Database Systems, 2023

QATCH: Benchmarking SQL-centric tasks with Table Representation Learning Models on Your Data.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Tactics, Threats & Targets: Modeling Disinformation and its Mitigation.
Proceedings of the 30th Annual Network and Distributed System Security Symposium, 2023

Maximizing Neutrality in News Ordering.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Data Ambiguity Profiling for the Generation of Training Examples.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Definitions Matter: Guiding GPT for Multi-label Classification.
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

2022
Technical Perspective of TURL: Table Understanding through Representation Learning.
SIGMOD Rec., 2022

Transformers for Tabular Data Representation: A Tutorial on Models and Applications.
Proc. VLDB Endow., 2022

Editorial: Special Issue on Deep Learning for Data Quality.
ACM J. Data Inf. Qual., 2022

Integrity 2022: Integrity in Social Networks and Media.
CoRR, 2022

Pythia: Unsupervised Generation of Ambiguous Textual Claims from Relational Data.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Exploratory training: when trainers learn.
HILDA@SIGMOD 2022: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2022

Ambiguity Detection and Textual Claims Generation from Relational Data.
Proceedings of the 30th Italian Symposium on Advanced Database Systems, 2022

Detection of COVID-19-Related Conspiracy Theories in Tweets using Transformer-Based Models and Node Embedding Techniques.
Working Notes Proceedings of the MediaEval 2022 Workshop, 2022

Unsupervised Matching of Data and Text.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

You Are My Type! Type Embeddings for Pre-trained Language Models.
Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Building A Knowledge Graph for Audit Information.
Proceedings of the Workshops of the EDBT/ICDT 2022 Joint Conference, 2022

Crowdsourced Fact-Checking at Twitter: How Does the Crowd Compare With Experts?
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Fact-Checking Statistical Claims with Tables.
IEEE Data Eng. Bull., 2021

Computational Fact Checking is Real, but will it Stop Misinformation? (Extended Abstract).
Proceedings of the Workshop on Misinformation Integrity in Social Networks 2021 co-located with 30th The Web Conference (TheWebConf 2021), 2021

Few-Shot Knowledge Validation using Rules.
Proceedings of the WWW '21: The Web Conference 2021, 2021

Wikidata Logical Rules and Where to Find Them.
Proceedings of the Companion of The Web Conference 2021, 2021

EmbDI: Generating Embeddings for Relational Data Integration (Discussion Paper).
Proceedings of the 29th Italian Symposium on Advanced Database Systems, 2021

Detecting COVID-19-Related Conspiracy Theories in Tweets.
Working Notes Proceedings of the MediaEval 2021 Workshop, 2021

Automatic Verification of Data Summaries.
Proceedings of the 14th International Conference on Natural Language Generation, 2021

Automated Fact-Checking for Assisting Human Fact-Checkers.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

RuleBERT: Teaching Soft Rules to Pre-Trained Language Models.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Data Challenges in Disinformation Diffusion Analysis.
Proceedings of the Workshops of the EDBT/ICDT 2021 Joint Conference, 2021

2020
Cleaning data with Llunatic.
VLDB J., 2020

Making AI Machines Work for Humans in FoW.
SIGMOD Rec., 2020

Scrutinizer: Fact Checking Statistical Claims.
Proc. VLDB Endow., 2020

Scrutinizer: A Mixed-Initiative Approach to Large-Scale, Data-Driven Claim Verification.
Proc. VLDB Endow., 2020

RuleHub: A Public Corpus of Rules for Knowledge Graphs.
ACM J. Data Inf. Qual., 2020

Mining Expressive Rules in Knowledge Graphs.
ACM J. Data Inf. Qual., 2020

Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks.
Proceedings of the 2020 International Conference on Management of Data, 2020

User-driven Error Detection for Time Series with Events.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

LIBRE: Learning Interpretable Boolean Rule Ensembles.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Data Integration.
Encyclopedia of Big Data Technologies, 2019

Schema Mapping.
Encyclopedia of Big Data Technologies, 2019

Buckle: Evaluating Fact Checking Algorithms Built on Knowledge Bases.
Proc. VLDB Endow., 2019

Meta-Mappings for Schema Mapping Reuse.
Proc. VLDB Endow., 2019

Local Embeddings for Relational Data Integration.
CoRR, 2019

Explainable Fact Checking with Probabilistic Answer Set Programming.
Proceedings of the 2019 Truth and Trust Online Conference (TTO 2019), 2019

GAIA: A Framework for Schema Mapping Reuse (extended abstract).
Proceedings of the 27th Italian Symposium on Advanced Database Systems, 2019

A Benchmark for Fact Checking Algorithms Built on Knowledge Bases.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

2018
RuDiK: Rule Discovery in Knowledge Bases.
Proc. VLDB Endow., 2018

RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! -.
Proc. VLDB Endow., 2018

Towards a Benchmark for Fact Checking with Knowledge Bases.
Companion of The Web Conference 2018, 2018

Let's Make It Dirty with BART!
Proceedings of the 26th Italian Symposium on Advanced Database Systems, 2018

Robust Discovery of Positive and Negative Rules in Knowledge Bases.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

Schema Mappings: From Data Translation to Data Cleaning.
A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years, 2018

2017
Fast and scalable inequality joins.
VLDB J., 2017

Synthesizing Entity Matching Rules by Examples.
Proc. VLDB Endow., 2017

Errata for "Lightning Fast and Space Efficient Inequality Joins" (PVLDB 8(13): 2074-2085).
Proc. VLDB Endow., 2017

Query-limited Black-box Attacks to Classifiers.
CoRR, 2017

Generating Concise Entity Matching Rules.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Interactive Data Repairing: the FALCON Dive.
Proceedings of the 25th Italian Symposium on Advanced Database Systems, 2017

Benchmarking the Chase.
Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2017

A Framework for Interactive Geospatial Map Cleaning using GPS Trajectories.
Proceedings of the 10th ACM SIGSPATIAL Workshop on Computational Transportation Science, 2017

2016
Detecting Data Errors: Where are we and what needs to be done?
Proc. VLDB Endow., 2016

Benchmarking Data Curation Systems.
IEEE Data Eng. Bull., 2016

BART in Action: Error Generation and Empirical Evaluations of Data-Cleaning Systems.
Proceedings of the 2016 International Conference on Management of Data, 2016

Interactive and Deterministic Data Cleaning.
Proceedings of the 2016 International Conference on Management of Data, 2016


Towards User-Aware Rule Discovery.
Proceedings of the Information Search, Integration, and Personalization, 2016

Big data quality - whose problem is it?
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Data quality between promises and results.
Proceedings of the 32nd IEEE International Conference on Data Engineering Workshops, 2016

DataXFormer: A robust transformation discovery system.
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Road to Freedom in Big Data Analytics.
Proceedings of the 19th International Conference on Extending Database Technology, 2016

2015
Lightning Fast and Space Efficient Inequality Joins.
Proc. VLDB Endow., 2015

KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing.
Proc. VLDB Endow., 2015

Messing Up with BART: Error Generation for Evaluating Data-Cleaning Algorithms.
Proc. VLDB Endow., 2015

Temporal Rules Discovery for Web Data Cleaning.
Proc. VLDB Endow., 2015

Editorial.
ACM J. Data Inf. Qual., 2015

DataXFormer: An Interactive Data Transformation Tool.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

BigDansing: A System for Big Data Cleansing.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Beyond Declarative Data Cleaning.
Proceedings of the 23rd Italian Symposium on Advanced Database Systems, 2015

Estimating Data Integration and Cleaning Effort.
Proceedings of the 18th International Conference on Extending Database Technology, 2015

Dataxformer: Leveraging the Web for Semantic Transformations.
Proceedings of the Seventh Biennial Conference on Innovative Data Systems Research, 2015

2014
That's All Folks! LLUNATIC Goes Open Source.
Proc. VLDB Endow., 2014

Descriptive and prescriptive data cleaning.
Proceedings of the International Conference on Management of Data, 2014

An Overview of the Llunatic System.
Proceedings of the 22nd Italian Symposium on Advanced Database Systems, 2014

IQ-METER - An evaluation tool for data-transformation systems.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

Mapping and cleaning.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

RuleMiner: Data quality rules discovery.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

2013
The LLUNATIC Data-Cleaning Framework.
Proc. VLDB Endow., 2013

Discovering Denial Constraints.
Proc. VLDB Endow., 2013

Extraction and Integration of Partially Overlapping Web Sources.
Proc. VLDB Endow., 2013

Introduction to the special issue on data quality.
Inf. Syst., 2013

On the Quality and Effectiveness of Data Transformation Systems.
Proceedings of the 21st Italian Symposium on Advanced Database Systems, 2013

Future Locations Prediction with Uncertain Data.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

Holistic data cleaning: Putting violations into context.
Proceedings of the 29th IEEE International Conference on Data Engineering, 2013

2012
The data analytics group at the qatar computing research institute.
SIGMOD Rec., 2012

Schema Mapping and Data Exchange Tools: Time for the Golden Age.
it Inf. Technol., 2012

Core schema mappings: Scalable core computations in data exchange.
Inf. Syst., 2012

Web Data Reconciliation: Models and Experiences.
Proceedings of the Search Computing - Broadening Web Search, 2012

A Short History of Schema Mapping Systems.
Proceedings of the Twentieth Italian Symposium on Advanced Database Systems, 2012

What is the IQ of your data transformation system?
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Flint: From Web Pages to Probabilistic Semantic Data.
Proceedings of the Semantic Search over the Web, 2012

2011
Repeatability and workability evaluation of SIGMOD 2011.
SIGMOD Rec., 2011

++Spicy: an OpenSource Tool for Second-Generation Schema Mapping and Data Exchange.
Proc. VLDB Endow., 2011

Characterizing the uncertainty of web data: models and experiences.
Proceedings of the 2011 Joint WICOW/AIRWeb Workshop on Web Quality, 2011

Automatically building probabilistic databases from the web.
Proceedings of the 20th International Conference on World Wide Web, 2011

Wrapper Generation for Overlapping Web Sources.
Proceedings of the 2011 IEEE/WIC/ACM International Conference on Web Intelligence, 2011

Contextual Data Extraction and Instance-Based Integration.
Proceedings of the First International Workshop on Searching and Integrating New Web Data Sources, 2011

Emerging Applications for Schema Mappings (Extended Abstract).
Proceedings of the Sistemi Evoluti per Basi di Dati, 2011

Discovery and Correctness of Schema Mapping Transformations.
Proceedings of the Schema Matching and Mapping, 2011

2010
Scalable Data Exchange with Functional Dependencies.
Proc. VLDB Endow., 2010

Exploiting information redundancy to wring out structured data from the web.
Proceedings of the 19th International Conference on World Wide Web, 2010

Redundancy-Driven Web Data Extraction and Integration.
Proceedings of the 13th International Workshop on the Web and Databases 2010, 2010

Probabilistic Reconciliation of Records from Inaccurate Web Sources (Extended Abstract).
Proceedings of the Eighteenth Italian Symposium on Advanced Database Systems, 2010

Probabilistic Models to Reconcile Complex Data from Inaccurate Data Sources.
Proceedings of the Advanced Information Systems Engineering, 22nd International Conference, 2010

2009
Concise and Expressive Mappings with +Spicy.
Proc. VLDB Endow., 2009

Schema exchange: Generic mappings for transforming data and metadata.
Data Knowl. Eng., 2009

Core schema mappings.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2009

Data Extraction and Integration from Imprecise Web Sources.
Proceedings of the Seventeenth Italian Symposium on Advanced Database Systems, 2009

2008
Data exchange with data-metadata translations.
Proc. VLDB Endow., 2008

Supporting the automatic construction of entity aware search engines.
Proceedings of the 10th ACM International Workshop on Web Information and Data Management (WIDM 2008), 2008

Clip: a tool for mapping hierarchical schemas.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Searching Entities on the Web by Sample.
Proceedings of the Sixteenth Italian Symposium on Advanced Database Systems, 2008

Clip: a Visual Language for Explicit Schema Mappings.
Proceedings of the 24th International Conference on Data Engineering, 2008

Flint: Google-basing the Web.
Proceedings of the EDBT 2008, 2008

2007
On the Schema Exchange Problem.
Proceedings of the Fifteenth Italian Symposium on Advanced Database Systems, 2007

Creating Nested Mappings with Clio.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Schema Exchange: A Template-Based Approach to Data and Metadata Translation.
Proceedings of the Conceptual Modeling, 2007

Automatic Generation of Model Translations.
Proceedings of the Advanced Information Systems Engineering, 19th International Conference, 2007

2006
Nested Mappings: Schema Mapping Reloaded.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

2005
Heterogeneous Data Translation through XML Conversion.
J. Web Eng., 2005

Translation of Web Data with Chameleon.
Proceedings of the Thirteenth Italian Symposium on Advanced Database Systems, 2005

2004
An Approach to Heterogeneous Data Translation based on XML Conversion.
Proceedings of the CAiSE'04 Workshops in connection with The 16th Conference on Advanced Information Systems Engineering, 2004

