Sebastian Schelter

Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

ReCANet: A Repeat Consumption-Aware Neural Network for Next Basket Recommendation in Grocery Shopping.

Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

GitSchemas: A Dataset for Automating Relational Data Preparation Tasks.

Proceedings of the 38th IEEE International Conference on Data Engineering Workshops, 2022

Towards data-centric what-if analysis for native machine learning pipelines.

Stefan Grafberger

Paul Groth

Proceedings of DEEM '22: Sixth Workshop on Data Management for End-To-End Machine Learning, Philadelphia, 2022

Screening Native Machine Learning Pipelines with ArgusEyes.

Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

2021

Letter from the Special Issue Editor.

IEEE Data Eng. Bull., 2021

Parameter Efficient Deep Probabilistic Forecasting.

Olivier Sprangers

Maarten de Rijke

CoRR, 2021

HedgeCut: Maintaining Randomised Trees for Low-Latency Machine Unlearning.

Stefan Grafberger

Ted Dunning

Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

MLINSPECT: A Data Distribution Debugger for Machine Learning Pipelines.

Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression.

Olivier Sprangers

Maarten de Rijke

Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Learnings from a Retail Recommendation System on Billions of Interactions at bol.com.

Barrie Kersbergen

Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

JENGA - A Framework to Study the Impact of Data Errors on the Predictions of Machine Learning Models.

Tammo Rukat

Felix Biessmann

Proceedings of the 24th International Conference on Extending Database Technology, 2021

Automating Data Quality Validation for Dynamic Data Ingestion.

Proceedings of the 24th International Conference on Extending Database Technology, 2021

Understanding Multi-channel Customer Behavior in Retail.

Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Lightweight Inspection of Data Preprocessing in Native Machine Learning Pipelines.

Stefan Grafberger

Julia Stoyanovich

Proceedings of the 11th Conference on Innovative Data Systems Research, 2021

2020

Technical Perspective: Query Optimization for Faster Deep CNN Explanations.

SIGMOD Rec., 2020

Apache Mahout: Machine Learning on Distributed Dataflow Systems.

J. Mach. Learn. Res., 2020

Taming Technical Bias in Machine Learning Pipelines.

Julia Stoyanovich

IEEE Data Eng. Bull., 2020

Analyzing and Predicting Purchase Intent in E-commerce: Anonymous vs. Identified Customers.

CoRR, 2020

A Comparison of Supervised Learning to Match Methods for Product Search.

CoRR, 2020

HDDse: Enabling High-Dimensional Disk State Embedding for Generic Failure Detection System of Heterogeneous Disks in Large Data Centers.

Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Learning to Validate the Predictions of Black Box Classifiers on Unseen Data.

Tammo Rukat

Felix Bießmann

Proceedings of the 2020 International Conference on Management of Data, 2020

Elastic Machine Learning Algorithms in Amazon SageMaker.

Proceedings of the 2020 International Conference on Management of Data, 2020

Three Challenges in Building Industrial-Scale Recommender Systems.

Proceedings of the 3rd Workshop on Online Recommender Systems and User Modeling co-located with the 14th ACM Conference on Recommender Systems (RecSys 2020), 2020

Demand Forecasting in the Presence of Privileged Information.

Mozhdeh Ariannezhad

Maarten de Rijke

Proceedings of the Advanced Analytics and Learning on Temporal Data, 2020

FairPrep: Promoting Data to a First-Class Citizen in Studies on Fairness-Enhancing Interventions.

Proceedings of the 23rd International Conference on Extending Database Technology, 2020

Towards Unsupervised Data Quality Validation on Dynamic Data.

Sergey Redyuk

Proceedings of the Workshops of the EDBT/ICDT 2020 Joint Conference, 2020

Zooming Out on an Evolving Graph.

Proceedings of the 23rd International Conference on Extending Database Technology, 2020

Tier-Scrubbing: An Adaptive and Tiered Disk Scrubbing Scheme with Improved MTTD and Reduced Cost.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

"Amnesia" - Machine Learning Models That Can Forget User Data Very Fast.

Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

2019

An Intermediate Representation for Optimizing Machine Learning Pipelines.

Andreas Kunft

Proc. VLDB Endow., 2019

DataWig: Missing Value Imputation for Tables.

J. Mach. Learn. Res., 2019

ADABench - Towards an Industry Standard Benchmark for Advanced Analytics.

Rodrigo Escobar Palacios

Proceedings of the Performance Evaluation and Benchmarking for the Era of Cloud(s), 2019

Efficient Incremental Cooccurrence Analysis for Item-Based Collaborative Filtering.

Ufuk Celebi

Ted Dunning

Proceedings of the 31st International Conference on Scientific and Statistical Database Management, 2019

DEEM 2019: Workshop on Data Management for End-to-End Machine Learning.

Proceedings of the 2019 International Conference on Management of Data, 2019

Unit Testing Data with Deequ.

Proceedings of the 2019 International Conference on Management of Data, 2019

Learning to Validate the Predictions of Black Box Machine Learning Models on Unseen Data.

Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2019

Differential Data Quality Verification on Partitioned Data.

Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

2018

Automating Large-Scale Data Quality Verification.

Proc. VLDB Endow., 2018

On the Ubiquity of Web Tracking: Insights from a Billion-Page Web Crawl.

J. Web Sci., 2018

On Challenges in Machine Learning Model Management.

IEEE Data Eng. Bull., 2018

Benchmarking Distributed Data Processing Systems for Machine Learning Workloads.

Proceedings of the Performance Evaluation and Benchmarking for the Era of Artificial Intelligence, 2018

"Deep" Learning for Missing Value Imputation in Tables with Non-Numerical Data.

Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

2017

BlockJoin: Efficient Matrix Partitioning Through Joins.

Andreas Kunft

Tilmann Rabl

Proc. VLDB Endow., 2017

Probabilistic Demand Forecasting at Scale.

Proc. VLDB Endow., 2017

'Dark Germany': Temporal Characteristics and Connectivity Patterns in Online Far-Right Protests Against Refugee Housing.

Proceedings of the 2017 ACM on Web Science Conference, 2017

'Dark Germany': Hidden Patterns of Participation in Online Far-Right Protests Against Refugee Housing.

Proceedings of the Social Informatics, 2017

Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Systems.

Proceedings of the Datenbanksysteme für Business, 2017

2016

Scaling data mining in massively parallel dataflow systems.

PhD thesis, 2016

Doubly stochastic large scale kernel learning with the empirical kernel map.

Nikolaas Steenbergen

Felix Bießmann

CoRR, 2016

Tracking the Trackers: A Large-Scale Analysis of Embedded Web Trackers.

Proceedings of the Tenth International Conference on Web and Social Media, 2016

Structural Patterns in the Rise of Germany's New Right on Facebook.

Proceedings of the IEEE International Conference on Data Mining Workshops, 2016

Apache Flink: Stream Analytics at Scale.

Proceedings of the 2016 IEEE International Conference on Cloud Engineering Workshop, 2016

2015

Optimistic Recovery for Iterative Dataflows in Action.

Sergey Dudoladov

Chen Xu

Stephan Ewen

Kostas Tzoumas

Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Efficient sample generation for scalable meta learning.

Alexandre V. Evfimievski

Proceedings of the 31st IEEE International Conference on Data Engineering, 2015

2014

The Stratosphere platform for big data analytics.

Alexander Alexandrov

Rico Bergmann

Stephan Ewen

Johann-Christoph Freytag

VLDB J., 2014

Factorbird - a Parameter Server Approach to Distributed Matrix Factorization.

Venu Satuluri

Reza Zadeh

CoRR, 2014

Scaling data mining in massively parallel dataflow systems.

Proceedings of the International Conference on Management of Data, 2014

2013

Iterative parallel data processing with stratosphere: an inside look.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Distributed matrix factorization with mapreduce using a series of broadcast-joins.

Proceedings of the Seventh ACM Conference on Recommender Systems, 2013

"All roads lead to Rome": optimistic recovery for distributed iterative data processing.

Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012

Scalable similarity-based neighborhood methods with MapReduce.

Christoph Boden