ML Data Drift Detection with Evidently AI & Valohai | MLOps Monitoring

Automated data drift detection for machine learning models using Evidently AI and Valohai — with conditional retraining when drift is detected.

What This Project Does

This repository shows how to detect data drift in ML pipelines with Evidently AI on Valohai. It includes:

Data preprocessing and model training (scikit-learn)
Drift monitoring with Evidently AI reports (JSON, HTML)
Conditional retraining when drift is detected (with optional human approval)

Use it as a reference for ML monitoring, model reliability, and production ML pipelines.

What is Data Drift?

Data drift is a change over time in the distribution of a model's input data. A related change in the input–output relationship is often called concept drift. Either can degrade model accuracy in production, so monitoring for drift helps you decide when to retrain and keep models reliable.
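
To make the idea concrete, here is a minimal sketch that compares a reference sample of one feature with a current sample using the two-sample Kolmogorov-Smirnov statistic. It is only an illustration of a distribution comparison: Evidently selects and runs its own statistical tests, and the function name `ks_statistic`, the sample values, and the 0.3 cut-off are assumptions made for this example, not part of the repository.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Return the largest gap between the two empirical CDFs (0.0 to 1.0)."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The gap between two step-shaped CDFs can only change at observed values,
    # so checking every observed value finds the maximum.
    for x in ref + cur:
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(f_ref - f_cur))
    return gap


# Hypothetical feature values: a training-time sample and two production samples.
reference = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
similar = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
shifted = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]

THRESHOLD = 0.3  # assumed cut-off for this example only

for name, sample in [("similar", similar), ("shifted", shifted)]:
    stat = ks_statistic(reference, sample)
    print(f"{name}: KS={stat:.3f} drift={stat > THRESHOLD}")
```

A statistic near 0 means the two samples look alike; a value near 1 means they barely overlap. In practice the threshold and the choice of test depend on sample size and feature type.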

Features

Evidently AI integration for data drift reports
Valohai pipelines: training and inference + drift detection
Conditional retraining triggered by drift (with approval step)
California Housing dataset example; works with your own data
Reports in JSON and HTML for analysis and dashboards

Tech Stack

| Component | Technology |
| --- | --- |
| Drift detection | Evidently AI |
| Orchestration | Valohai |
| ML framework | scikit-learn |
| Data | pandas |
| Language | Python 3.9 |

Training Pipeline

Preprocesses data and trains the model.

Steps

Data preprocessing
- Load dataset from Valohai inputs or fetch California Housing if not provided.
- Preprocess and save with a Valohai alias.
Model training
- Load preprocessed data, train with scikit-learn, save the model with a Valohai alias.
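
The input fallback and the alias step above could look roughly like the sketch below. It assumes Valohai's conventional `/valohai/inputs` and `/valohai/outputs` directories and the `<file>.metadata.json` sidecar with a `valohai.alias` key. The helper names, the `dataset` input name, and the alias value are hypothetical and are not taken from `preprocess.py`.

```python
import json
import os
from pathlib import Path
from typing import Optional

# Valohai's conventional directories. The environment variables are read as
# overrides so the sketch can also run outside Valohai (an assumption here).
INPUTS_DIR = Path(os.environ.get("VH_INPUTS_DIR", "/valohai/inputs"))
OUTPUTS_DIR = Path(os.environ.get("VH_OUTPUTS_DIR", "/valohai/outputs"))


def find_input_file(input_name: str, inputs_dir: Path = INPUTS_DIR) -> Optional[Path]:
    """Return the first file provided for an input, or None if there is none."""
    input_dir = inputs_dir / input_name
    if not input_dir.is_dir():
        return None
    files = sorted(p for p in input_dir.iterdir() if p.is_file())
    return files[0] if files else None


def save_with_alias(filename: str, content: str, alias: str,
                    outputs_dir: Path = OUTPUTS_DIR) -> Path:
    """Write an output file plus a metadata sidecar that requests an alias."""
    outputs_dir.mkdir(parents=True, exist_ok=True)
    out_path = outputs_dir / filename
    out_path.write_text(content)
    # The sidecar sits next to the output and carries the alias request.
    sidecar = outputs_dir / f"{filename}.metadata.json"
    sidecar.write_text(json.dumps({"valohai.alias": alias}))
    return out_path
```

A preprocessing step would call `find_input_file("dataset")` first and fall back to fetching California Housing (for example with `sklearn.datasets.fetch_california_housing`) only when it returns `None`.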

Pipeline in Valohai

Drift Detection Pipeline

Runs inference and drift analysis with Evidently AI.

Steps

Inference and drift detection
- Load reference data, current data, and trained model.
- Run inference on current data.
- Generate Evidently AI drift reports (e.g. Data Drift preset).
- Save reports (JSON, HTML).
Conditional retraining
- Evaluate drift from reports.
- If drift is detected: update status and trigger retraining (with approval).
- If no drift: stop the pipeline.
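
The conditional retraining decision above can be sketched as a small pure function. This is an illustration, not the logic in `call-retrain.py`: it assumes the per-column drift flags have already been extracted from the Evidently report (how to do that depends on the Evidently version and on `report.py`), and the function names, the 0.5 share threshold, and the returned action labels are all assumptions made for this example.

```python
from typing import Dict


def share_of_drifted_columns(column_drift: Dict[str, bool]) -> float:
    """Fraction of columns flagged as drifted (0.0 when there are no columns)."""
    if not column_drift:
        return 0.0
    return sum(column_drift.values()) / len(column_drift)


def next_action(column_drift: Dict[str, bool],
                drift_share: float = 0.5,
                approved: bool = False) -> str:
    """Map drift flags to a pipeline action: stop, await_approval, or retrain."""
    if share_of_drifted_columns(column_drift) < drift_share:
        return "stop"  # no dataset-level drift: end the pipeline
    # Drift detected: retrain only once a human has approved it.
    return "retrain" if approved else "await_approval"


# Hypothetical flags for four California Housing columns.
flags = {"MedInc": True, "HouseAge": False, "AveRooms": True, "Latitude": False}
print(next_action(flags))
print(next_action(flags, approved=True))
```

Keeping the decision separate from report parsing makes the threshold easy to tune and the approval gate easy to test without running Evidently.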

Pipeline in Valohai

Project Flow

Preprocess and store data.
Train and evaluate the model.
Run inference on new data and detect drift with Evidently.
If drift is detected → trigger retraining (with human approval).
If no drift → stop the pipeline.

Flow overview

Getting Started

Clone the repo and follow Running on Valohai.
Make sure you have a Valohai account. Evidently is installed as part of the steps defined in valohai.yaml.
For secrets (e.g. Valohai API token), see Secrets & Environment Variables.

Running on Valohai

1. Configure the repository

Install Valohai CLI:
```
pip install valohai-cli
```
Log in:
```
vh login
```

Create and enter a project directory, then create a Valohai project:

```
mkdir valohai-evidently-example
cd valohai-evidently-example
vh project create
```

Clone this repository into that directory:

```
git clone https://github.com/KuchikiRenji/evidently-drift-detection.git .
```

2. Run executions (single steps)

```
vh execution run <step-name> --adhoc
```

Example (preprocess):

```
vh execution run preprocess --adhoc
```

3. Run pipelines (full flow)

```
vh pipeline run <pipeline-name> --adhoc
```

Example (drift detection pipeline):

```
vh pipeline run inference-drift-detection-pipeline --adhoc
```

Secrets & Environment Variables

The step call-retrain.py uses the Valohai API and needs a private token. Do not commit the token; use Valohai secrets instead.

In Valohai you can:

Set environment variables when creating an execution (Create Execution → Environment Variables). They apply only to that execution.
Set project environment variables (Project Settings → Environment Variables, mark as Secret). They apply to all executions in the project.
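
Inside a step, a secret set this way is read like any other environment variable. The sketch below shows one way to do that without hard-coding the token. The variable name `VALOHAI_API_TOKEN` and the helper `build_auth_headers` are hypothetical (use whatever name your project defines), and the `Token` authorization scheme is stated here as an assumption about the Valohai REST API rather than something confirmed by `call-retrain.py`.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable name; match it to the secret defined in your project.
TOKEN_ENV_VAR = "VALOHAI_API_TOKEN"


def build_auth_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build request headers from the token stored in the environment."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(
            f"{TOKEN_ENV_VAR} is not set; add it as a secret environment "
            "variable in Valohai."
        )
    return {"Authorization": f"Token {token}"}
```

Passing the environment in as a parameter keeps the function easy to test and makes it obvious that the token never appears in the source code.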

Author & Contact


Author	KuchikiRenji
Email	KuchikiRenji@outlook.com
GitHub	github.com/KuchikiRenji
Discord	`kuchiki_renji`

This project demonstrates ML data drift detection and MLOps monitoring with Evidently AI and Valohai.
