Learning to Validate the Predictions of Black Box Machine Learning Models on Unseen Data

Published in the 4th Workshop on Human-In-the-Loop Data Analytics (HILDA), 2019

When end users apply a machine learning (ML) model to new unlabeled data, it is difficult for them to decide whether they can trust its predictions. Errors or shifts in the target data can lead to hard-to-detect drops in the predictive quality of the model. We therefore propose an approach to assist non-ML experts working with pretrained ML models. Our approach estimates the change in prediction performance of a model on unseen target data. It does not require explicit distributional assumptions about the dataset shift between the training and target data. Instead, a domain expert can declaratively specify typical cases of dataset shift that she expects to observe in real-world data. Based on this information, we learn a performance predictor for pretrained black box models, which can be combined with the model and automatically warns end users of unexpected performance drops. We demonstrate the effectiveness of our approach on two models, logistic regression and a neural network, applied to several real-world datasets.
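A minimal sketch of this idea, not the paper's actual implementation: the shift generators, the confidence-based features, and the random-forest performance predictor below are all hypothetical choices made for illustration. Declared shifts are applied to held-out labeled data, the black box model's accuracy under each shift is recorded, and a regressor learns to predict that accuracy from features of the model's own prediction scores — which are available even when the target data is unlabeled.

```python
# Sketch (assumed details, not the authors' method): 1) declaratively specify
# expected shifts, 2) apply them to held-out data, 3) featurize the black-box
# model's prediction scores, 4) learn to predict accuracy from those features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=3000, n_features=10, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_held, X_target, y_held, y_target = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# The pretrained "black box" model under validation.
black_box = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Hypothetical declarative shift specifications a domain expert might provide.
def gaussian_noise(X, scale):
    return X + rng.normal(0.0, scale, X.shape)

def missing_features(X, frac):
    Xc = X.copy()
    Xc[rng.random(Xc.shape) < frac] = 0.0  # crude imputation of missing cells
    return Xc

shift_specs = [(gaussian_noise, s) for s in (0.0, 0.5, 1.0, 2.0)] + \
              [(missing_features, f) for f in (0.1, 0.3, 0.5)]

def featurize(model, X):
    # Features derived only from the model's prediction scores (no labels needed).
    proba = model.predict_proba(X)
    conf = proba.max(axis=1)
    return [conf.mean(), conf.std(), proba[:, 1].mean()]

feats, perf = [], []
for shift_fn, param in shift_specs:
    for _ in range(20):  # several random corruptions per declared shift
        idx = rng.choice(len(X_held), size=500, replace=False)
        X_shifted = shift_fn(X_held[idx], param)
        feats.append(featurize(black_box, X_shifted))
        perf.append(black_box.score(X_shifted, y_held[idx]))

perf_predictor = RandomForestRegressor(random_state=0).fit(feats, perf)

# At deployment time: estimate accuracy on unlabeled target data and compare
# against the true (normally unknown) accuracy.
est = perf_predictor.predict([featurize(black_box, X_target)])[0]
true = black_box.score(X_target, y_target)
```

A warning to the end user would then simply be triggered whenever `est` falls below an application-specific threshold.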

Recommended citation: S. Redyuk, S. Schelter, T. Rukat, V. Markl, F. Biessmann (2019). Learning to Validate the Predictions of Black Box Machine Learning Models on Unseen Data. HILDA’19, Amsterdam, Netherlands

Automated Documentation of End-to-End Experiments in Data Science

Published in the 35th IEEE International Conference on Data Engineering (ICDE), 2019

Reproducibility plays a crucial role in experimentation. However, the modern research ecosystem and its underlying frameworks are constantly evolving, making it extremely difficult to reliably reproduce scientific artifacts such as data, algorithms, trained models, and visualizations. We therefore aim to design a novel system that assists data scientists with rigorous end-to-end documentation of data-oriented experiments. Capturing data lineage, metadata, and other artifacts helps reproduce and share experimental results. We summarize this challenge as the automated documentation of data science experiments. We aim to reduce the manual overhead for experimenting researchers and intend to create a novel approach to dataflow and metadata tracking based on the analysis of the experiment source code. The envisioned system will accelerate the research process in general and enable capturing fine-grained meta information by deriving a declarative representation of data science experiments.
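To make the vision concrete, here is a hypothetical illustration (not the envisioned system itself) of automatic metadata capture: experiment steps are wrapped so that their names, parameters, and artifact fingerprints are recorded as a machine-readable trace, with no manual documentation effort. The `tracked` decorator and the fingerprinting scheme are invented for this sketch.

```python
# Hypothetical sketch: record step names, parameters, and input/output artifact
# fingerprints into a declarative, machine-readable experiment log.
import functools
import hashlib
import json

experiment_log = []  # the automatically captured experiment documentation

def fingerprint(obj):
    # Simplified artifact fingerprint: hash of the textual representation.
    return hashlib.sha256(repr(obj).encode()).hexdigest()[:12]

def tracked(step):
    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        result = step(*args, **kwargs)
        experiment_log.append({
            "step": step.__name__,
            "params": {k: repr(v) for k, v in kwargs.items()},
            "inputs": [fingerprint(a) for a in args],
            "output": fingerprint(result),
        })
        return result
    return wrapper

@tracked
def load_data():
    return [1, 2, 3, 4]

@tracked
def normalize(data, scale=2):
    return [x / scale for x in data]

features = normalize(load_data(), scale=2)
print(json.dumps(experiment_log, indent=2))
```

In a full system, the same information would be derived by analyzing the experiment source code rather than by decorating functions by hand, but the resulting trace — steps, parameters, and lineage between artifacts — is of the same kind.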

Recommended citation: S. Redyuk (2019). Automated Documentation of End-to-End Experiments in Data Science. In Ph.D. Symposium track, IEEE 35th International Conference on Data Engineering (ICDE’19), Macau, China