Red Onions, Soft Cheese and Data: From Food Safety to Data Traceability for Responsible AI

Software systems that learn from data with AI and machine learning (ML) are becoming ubiquitous and are increasingly used to automate impactful decisions. The risks arising from this widespread use of AI/ML are garnering attention from policy makers, …

mlwhatif: What If You Could Stop Re-Implementing Your Machine Learning Pipeline Analyses Over and Over?

Software systems that learn from data with machine learning (ML) are used in critical decision-making processes. Unfortunately, real-world experience shows that the pipelines for data preparation, feature encoding and model training in ML systems are …

Provenance Tracking for End-to-End Machine Learning Pipelines

Software systems that learn from data are being deployed in increasing numbers in real-world application scenarios. It is a difficult and tedious task to ensure at development time that the end-to-end ML pipelines for such applications adhere to …

Automating and Optimizing Data-Centric What-If Analyses on Native Machine Learning Pipelines

Software systems that learn from data with machine learning (ML) are used in critical decision-making processes. Unfortunately, real-world experience shows that the pipelines for data preparation, feature encoding and model training in ML systems are …

Proactively Screening Machine Learning Pipelines with ArgusEyes

Software systems that learn from data with machine learning (ML) are ubiquitous. ML pipelines in these applications often suffer from a variety of data-related issues, such as data leakage, label errors or fairness violations, which require reasoning …

Towards Data-Centric What-If Analysis for Native Machine Learning Pipelines

An important task of data scientists is to understand the sensitivity of their models to changes in the data that the models are trained and tested upon. Currently, conducting such data-centric what-if analyses requires significant and costly manual …

Screening Native Machine Learning Pipelines with ArgusEyes

Software systems that learn from data are being deployed in increasing numbers in real-world application scenarios. It is a difficult and tedious task to ensure at development time that the end-to-end ML pipelines for such applications adhere to …

HedgeCut: Maintaining Randomized Trees for Low-Latency Machine Unlearning

Software systems that learn from user data with machine learning (ML) have become ubiquitous in recent years. Recent laws such as the General Data Protection Regulation (GDPR) require organisations that process personal data to delete user data …

mlinspect: a Data Distribution Debugger for Machine Learning Pipelines

Machine Learning (ML) is increasingly used to automate impactful decisions, and the risks arising from this widespread use are garnering attention from policymakers, scientists, and the media. ML applications are often very brittle with respect to …

Lightweight Inspection of Data Preprocessing in Native Machine Learning Pipelines

Machine Learning (ML) is increasingly used to automate impactful decisions, and the risks arising from this widespread use are garnering attention from policy makers, scientists, and the media. ML applications are often very brittle with respect to …