# Christopher Ré

According to our database, Christopher Ré authored at least 168 papers between 2002 and 2019.

## Bibliography

2019
Osprey: Weak Supervision of Imbalanced Extraction Problems without Code.
Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, 2019

Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale.
Proceedings of the 2019 International Conference on Management of Data, 2019

Automating the generation of hardware component knowledge bases.
Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

Learning Dependency Structures for Weak Supervision Models.
Proceedings of the 36th International Conference on Machine Learning, 2019

A Kernel Theory of Modern Data Augmentation.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Mixed-Curvature Representations in Product Spaces.
Proceedings of the 7th International Conference on Learning Representations, 2019

A Formal Framework for Probabilistic Unclean Databases.
Proceedings of the 22nd International Conference on Database Theory, 2019

The Role of Massively Multi-Task and Weak Supervision in Software 2.0.
Proceedings of the CIDR 2019, 2019

Low-Precision Random Fourier Features for Memory-constrained Kernel Approximation.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Training Complex Models with Multi-Task Weak Supervision.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Knowledge Base Construction in the Machine-learning Era.
ACM Queue, 2018

Snuba: Automating Weak Supervision to Label Training Data.
PVLDB, 2018

It's All a Matter of Degree - Using Degree Information to Optimize Multiway Joins.
Theory Comput. Syst., 2018

Worst-case Optimal Join Algorithms.
J. ACM, 2018

Research for practice: knowledge base construction in the machine-learning era.
Commun. ACM, 2018

A Two-pronged Progress in Structured Dense Matrix Vector Multiplication.
Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018

Exploring the Utility of Developer Exhaust.
Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

Snorkel MeTaL: Weak Supervision for Multi-Task Learning.
Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

Fonduer: Knowledge Base Construction from Richly Formatted Data.
Proceedings of the 2018 International Conference on Management of Data, 2018

Machine learning and deep analytics for biocomputing: Call for better explainability.
Biocomputing 2018: Proceedings of the Pacific Symposium, 2018

Learning Compressed Transforms with Low Displacement Rank.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Software 2.0 and Snorkel: Beyond Hand-Labeled Data.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Invariance with Compact Transforms.
Proceedings of the 6th International Conference on Learning Representations, 2018

Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

Unraveling the Molecular Basis of Lung Adenocarcinoma Dedifferentiation and Prognosis by Integrating Omics and Histopathology.
Proceedings of the AMIA 2018, 2018

Accelerated Stochastic Power Iteration.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Training Classifiers with Natural Language Explanations.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Incremental knowledge base construction using DeepDive.
VLDB J., 2017

EmptyHeaded: A Relational Engine for Graph Processing.
ACM Trans. Database Syst., 2017

Report from the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16).
SIGMOD Record, 2017

HoloClean: Holistic Data Repairs with Probabilistic Inference.
PVLDB, 2017

Snorkel: Rapid Training Data Creation with Weak Supervision.
PVLDB, 2017

PVLDB, 2017

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.
J. Mach. Learn. Res., 2017

DeepDive: declarative knowledge base construction.
Commun. ACM, 2017

Snorkel: Beyond Hand-labeled Data.
Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Flipper: A Systematic Approach to Debugging Training Sets.
Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, 2017

SLiMFast: Guaranteed Results for Data Fusion and Source Reliability.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Snorkel: Fast Training Set Generation for Information Extraction.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

A Relational Framework for Classifier Engineering.
Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2017

Inferring Generative Model Structure with Static Analysis.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning to Compose Domain-Specific Transformations for Data Augmentation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

ShortFuse: Biomedical Time Series Representations in the Presence of Structured Information.
Proceedings of the Machine Learning for Health Care Conference, 2017

Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Learning the Structure of Generative Models without Labeled Data.
Proceedings of the 34th International Conference on Machine Learning, 2017

GYM: A Multiround Distributed Join Algorithm.
Proceedings of the 20th International Conference on Database Theory, 2017

Snorkel: A System for Lightweight Extraction.
Proceedings of the CIDR 2017, 2017

Predicting Non-Small Cell Lung Cancer Diagnosis and Prognosis by Fully Automated Microscopic Pathology Image Features.
Proceedings of the AMIA 2017, 2017

2016
DeepDive: Declarative Knowledge Base Construction.
SIGMOD Record, 2016

The Beckman report on database research.
Commun. ACM, 2016

Large-scale extraction of gene interactions from full-text literature using DeepDive.
Bioinformatics, 2016

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.
Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, 2016

Extracting Databases from Dark Data with DeepDive.
Proceedings of the 2016 International Conference on Management of Data, 2016

Data programming with DDLite: putting humans in a different part of the loop.
Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2016

EmptyHeaded: A Relational Engine for Graph Processing.
Proceedings of the 2016 International Conference on Management of Data, 2016

AJAR: Aggregations and Joins over Annotated Relations.
Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2016

Sub-sampled Newton Methods with Non-uniform Sampling.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Data Programming: Creating Large Training Sets, Quickly.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

High Performance Parallel Stochastic Gradient Descent in Shared Memory.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Wikipedia Knowledge Graph with DeepDive.
Proceedings of the Wiki, 2016

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling.
Proceedings of the 33rd International Conference on Machine Learning, 2016

It's All a Matter of Degree: Using Degree Information to Optimize Multiway Joins.
Proceedings of the 19th International Conference on Database Theory, 2016

Dark Data: Are we solving the right problems?
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Old techniques for new join algorithms: A case study in RDF processing.
Proceedings of the 32nd IEEE International Conference on Data Engineering Workshops, 2016

Asynchrony begets momentum, with an application to deep learning.
Proceedings of the 54th Annual Allerton Conference on Communication, 2016

2015
Incremental Knowledge Base Construction Using DeepDive.
PVLDB, 2015

Mindtagger: A Demonstration of Data Labeling in Knowledge Base Construction.
PVLDB, 2015

The mobilize center: an NIH big data to knowledge center to advance human movement research and improve mobility.
JAMIA, 2015

Energy-Efficient Abundant-Data Computing: The N3XT 1,000x.
IEEE Computer, 2015

DunceCap: Query Plans Using Generalized Hypertree Decompositions.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Machine Learning and Databases: The Sound of Things to Come or a Cacophony of Hype?
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

DunceCap: Compiling Worst-Case Optimal Query Plans.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Join Processing for Graph Patterns: An Old Dog with New Tricks.
Proceedings of the Third International Workshop on Graph Data Management Experiences and Systems, 2015

Exploiting Correlations for Expensive Predicate Evaluation.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Caffe con Troll: Shallow Ideas to Speed Up Deep Learning.
Proceedings of the Fourth Workshop on Data analytics in the Cloud, 2015

Joins via Geometric Resolutions: Worst-case and Beyond.
Proceedings of the 34th ACM Symposium on Principles of Database Systems, 2015

Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Jedi: A Storage Manager for SIMD-aware, Worst-case Optimal Join Processing.
Proceedings of the Workshops of the EDBT/ICDT 2015 Joint Conference (EDBT/ICDT), 2015

A Database Framework for Classifier Engineering.
Proceedings of the 9th Alberto Mendelzon International Workshop on Foundations of Data Management, Lima, Peru, May 6, 2015

2014
The Beckman Report on Database Research.
SIGMOD Record, 2014

DimmWitted: A Study of Main-Memory Statistical Analytics.
PVLDB, 2014

Approximation trade-offs in a Markovian stream warehouse: An empirical study.
Inf. Syst., 2014

Feature Engineering for Knowledge Base Construction.
IEEE Data Eng. Bull., 2014

Tradeoffs in Main-Memory Statistical Analytics from Impala to DimmWitted.
Proceedings of the 2nd International Workshop on In Memory Data Management and Analytics, 2014

Materialization optimizations for feature selection workloads.
Proceedings of the International Conference on Management of Data, 2014

Beyond worst-case analysis for joins with minesweeper.
Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2014

Parallel Feature Selection Inspired by Group Testing.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Effectively Creating Weakly Labeled Training Examples via Approximate Domain Knowledge.
Proceedings of the Inductive Logic Programming - 24th International Conference, 2014

An Asynchronous Parallel Stochastic Coordinate Descent Algorithm.
Proceedings of the 31st International Conference on Machine Learning, 2014

The Theory of Zeta Graphs with an Application to Random Networks.
Proceedings of the 17th International Conference on Database Theory (ICDT), 2014

Links between Join Processing and Convex Geometry.
Proceedings of the 17th International Conference on Database Theory (ICDT), 2014

2013
Probabilistic Web Data Management.
World Wide Web, 2013

Skew strikes back: new developments in the theory of join algorithms.
SIGMOD Record, 2013

Hazy: Making it Easier to Build and Maintain Big-data Analytics.
ACM Queue, 2013

Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System.
PVLDB, 2013

Ringtail: A Generalized Nowcasting System.
PVLDB, 2013

Parallel stochastic gradient algorithms for large-scale matrix completion.
Math. Program. Comput., 2013

Hazy: making it easier to build and maintain big-data analytics.
Commun. ACM, 2013

Ringtail: Feature Selection For Easier Nowcasting.
Proceedings of the 16th International Workshop on the Web and Databases 2013, 2013

Bootstrapping Knowledge Base Acceleration.
Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Evaluating Stream Filtering for Entity Profile Updates for TREC 2013.
Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Towards high-throughput gibbs sampling at scale: a study across storage managers.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

GeoDeepDive: statistical inference using familiar data-processing languages.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

An Approximate, Efficient LP Solver for LP Rounding.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Brainwash: A Data System for Feature Engineering.
Proceedings of the CIDR 2013, 2013

A Tutorial on Trained Systems: A New Generation of Data Management Systems?
Proceedings of the Big Data - 29th British National Conference on Databases, 2013

Understanding Tables in Context Using Standard NLP Toolkits.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

Using Commonsense Knowledge to Automatically Create (Noisy) Training Examples from Text.
Proceedings of the Statistical Relational Artificial Intelligence, 2013

2012
PVLDB, 2012

Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences.
Proceedings of the COLT 2012, 2012

Elementary: Large-Scale Knowledge-Base Construction via Machine Learning and Statistical Inference.
Int. J. Semantic Web Inf. Syst., 2012

DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference.
Proceedings of the Second International Workshop on Searching and Integrating New Web Data Sources, 2012

Building an Entity-Centric Stream Filtering Test Collection for TREC 2012.
Proceedings of The Twenty-First Text REtrieval Conference, 2012

Towards a unified architecture for in-RDBMS analytics.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Worst-case optimal join algorithms: [extended abstract].
Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2012

Factoring nonnegative matrices with linear programs.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Scaling Inference for Markov Logic via Dual Decomposition.
Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Optimizing Statistical Information Extraction Programs over Evolving Text.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Big Data versus the Crowd: Looking for Relationships in All the Right Places.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Probabilistic Databases
Synthesis Lectures on Data Management, Morgan & Claypool Publishers, 2011

Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS.
PVLDB, 2011

Probabilistic Management of OCR Data using an RDBMS.
PVLDB, 2011

Incrementally maintaining classification using an RDBMS.
PVLDB, 2011

Automatic Optimization for MapReduce Programs.
PVLDB, 2011

Queries and materialized views on probabilistic databases.
J. Comput. Syst. Sci., 2011

Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010
Manimal: Relational Optimization for Data-Intensive Programs.
Proceedings of the 13th International Workshop on the Web and Databases 2010, 2010

Understanding cardinality estimation using entropy maximization.
Proceedings of the Twenty-Ninth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2010

Transducing Markov sequences.
Proceedings of the Twenty-Ninth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2010

Approximation trade-offs in Markovian stream processing: An empirical study.
Proceedings of the 26th International Conference on Data Engineering, 2010

2009
The trichotomy of HAVING queries on a probabilistic database.
VLDB J., 2009

Repeatability & workability evaluation of SIGMOD 2009.
SIGMOD Record, 2009

Lahar Demonstration: Warehousing Markovian Streams.
PVLDB, 2009

Probabilistic databases: diamonds in the dirt.
Commun. ACM, 2009

Query Containment of Tier-2 Queries over a Probabilistic Database.
Proceedings of the Third VLDB workshop on Management of Uncertain Data (MUD2009) in conjunction with VLDB 2009, 2009

Access Methods for Markovian Streams.
Proceedings of the 25th International Conference on Data Engineering, 2009

Large-Scale Deduplication with Constraints Using Dedupalog.
Proceedings of the 25th International Conference on Data Engineering, 2009

General Database Statistics Using Entropy Maximization.
Proceedings of the Database Programming Languages, 2009

2008
Approximate lineage for probabilistic databases.
PVLDB, 2008

Systems aspects of probabilistic data management.
PVLDB, 2008

Challenges for Event Queries over Markovian Streams.
IEEE Internet Computing, 2008

Managing Probabilistic Data with MystiQ: The Can-Do, the Could-Do, and the Can't-Do.
Proceedings of the Scalable Uncertainty Management, Second International Conference, 2008

Event queries on correlated probabilistic streams.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

A demonstration of Cascadia through a digital diary application.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Implementing NOT EXISTS Predicates over a Probabilistic Database.
Proceedings of the International Workshop on Quality in Databases and Management of Uncertain Data, 2008

08421 Working Group: Report of the Probabilistic Databases Benchmarking.
Proceedings of the Uncertainty Management in Information Systems, 12.10. - 17.10.2008, 2008

2007
Managing Uncertainty in Social Networks.
IEEE Data Eng. Bull., 2007

Materialized Views in Probabilistic Databases for Information Exchange and Query Optimization.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Efficient Top-k Query Evaluation on Probabilistic Data.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Efficient Evaluation of.
Proceedings of the Database Programming Languages, 11th International Symposium, 2007

Management of data with uncertainties.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

Structured Querying of Web Text Data: A Technical Challenge.
Proceedings of the CIDR 2007, 2007

2006
Query Evaluation on Probabilistic Databases.
IEEE Data Eng. Bull., 2006

A Complete and Efficient Algebraic Compiler for XQuery.
Proceedings of the 22nd International Conference on Data Engineering, 2006

XQuery!: An XML Query Language with Side Effects.
Proceedings of the Current Trends in Database Technology - EDBT 2006, 2006

2005
A Framework for XML-Based Integration of Data, Visualization and Analysis in a Biomedical Domain.
Proceedings of the Database and XML Technologies, 2005

MYSTIQ: a system for finding more answers by using probabilities.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Supporting workflow in a course management system.
Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education, 2005

2003
WS-Membership - Failure Management in a Web-Services World.
Proceedings of the Twelfth International World Wide Web Conference, 2003

2002
A Collaborative Infrastructure for Scalable and Robust News Delivery.
Proceedings of the 22nd International Conference on Distributed Computing Systems, 2002