EZR(1) - Explainable Multi-Objective Optimization

NAME

ezr — explainable multi-objective optimization via decision trees, clustering, Naive Bayes, and active learning

SYNOPSIS

ezr [--key=val ...] CMD [args]
ezr --list
ezr --help

DESCRIPTION

ezr is a lightweight toolkit for multi-objective optimization and explainable AI. It summarizes CSV data into Num/Sym columns, builds decision trees that minimize distance to ideal outcomes, clusters rows via k-means or recursive halving, and supports active learning with Naive Bayes or centroid-based acquisition.
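The column summaries mentioned above can be pictured with a minimal sketch. The class names `NumSummary` and `SymSummary` are hypothetical stand-ins, not the Num and Sym classes in ezr.py, and the choice of Welford's running update and Shannon entropy is an assumption about how such summaries are typically kept.

```python
import math
from collections import Counter

class NumSummary:
    """Hypothetical numeric summary: running mean and standard deviation."""
    def __init__(self):
        self.n, self.mu, self.m2 = 0, 0.0, 0.0

    def add(self, x):
        if x == "?":                      # "?" marks a missing value; skip it
            return
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n         # Welford's incremental mean
        self.m2 += delta * (x - self.mu)  # running sum of squared deviations

    def mid(self):
        return self.mu

    def spread(self):
        # Sample standard deviation; 0 when fewer than two values were seen
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

class SymSummary:
    """Hypothetical symbolic summary: value counts, mode, and entropy."""
    def __init__(self):
        self.n, self.has = 0, Counter()

    def add(self, x):
        if x == "?":
            return
        self.n += 1
        self.has[x] += 1

    def mid(self):
        return self.has.most_common(1)[0][0]   # most frequent value

    def spread(self):
        # Shannon entropy (in bits) of the observed value frequencies
        return -sum(c / self.n * math.log2(c / self.n)
                    for c in self.has.values())
```

Both summaries update one value at a time, so a CSV file can be read in a single pass without holding every column in memory.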

ezr is an experiment in "how low can you go?", that is, how little data is needed for effective AI. The code uses active learning to label a small number (say, 50) of informative examples. These examples build a regression tree, which then sorts the unlabelled test data. Repeated studies show that after labelling just the first ~5 rows of that sorted data, the selected row optimizes as well as or better than state-of-the-art optimizers like SMAC (which runs two orders of magnitude slower).

Input is CSV. The header row defines column roles:

[A-Z]*    Numeric (e.g. "Age")
[a-z]*    Symbolic (e.g. "job")
[A-Z]*+   Maximize goal (e.g. "Pay+")
[A-Z]*-   Minimize goal (e.g. "Cost-")
[a-z]*!   Class label (e.g. "sick!")
*X        Ignored (e.g. "idX")
?         Missing value (in data rows, not header)
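The header rules above can be read as a small classifier over column names. The sketch below is only an illustration: the function `role` and the keys of the dictionary it returns are hypothetical, and ezr.py may organize this logic differently.

```python
def role(name: str) -> dict:
    """Classify one CSV header name using its first and last characters."""
    last = name[-1]
    return {
        "name": name,
        "numeric": name[0].isupper(),   # upper-case first letter: numeric
        "ignored": last == "X",         # trailing X: skip this column
        "goal": "max" if last == "+" else "min" if last == "-" else None,
        "klass": last == "!",           # trailing !: class label
    }

if __name__ == "__main__":
    for header in ["Age", "job", "Pay+", "Cost-", "sick!", "idX"]:
        print(role(header))
```

Because the roles live in the header row, no separate schema file is needed to say which columns are goals.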

LAYOUT

Two files. No package structure, no test scaffolding.

ezr.py    Library. Section banners for each app.
cli.py    CLI dispatch. `eg_<app>` demos + `eg_test_<app>` tests.

ezr.py sections: Types, Col (Num, Sym), Data, Distance, Bayes, Comparison (pick, picks, extrapolate), Format, Stats (same, bestRanks, confused), Tree, Cluster, Classify, Search (sa, ls, de, oneplus1), Acquire, Textmine (tokenize, stem, tfidf, cnb).

cli.py exposes everything in ezr.py as eg_<name> commands. Tests are eg_test_<name> and run as plain function calls — no pytest dependency.

INSTALLATION

git clone http://github.com/timm/ezr
cd ezr
pip install -e .

This creates the global ezr command. Because the install is editable (-e), edits to ezr.py or cli.py take effect immediately. Requires Python 3.12+. Zero runtime dependencies.

To uninstall:

pip uninstall ezr

Run without installing

git clone http://github.com/timm/ezr
cd ezr
python3 cli.py --list

Sample data

mkdir -p $HOME/gits
git clone http://github.com/timm/moot $HOME/gits/moot

COMMANDS

List everything:

ezr --list

Common commands:

ezr classify FILE       Incremental Naive Bayes; print confusion
ezr tree FILE           Grow regression tree; show structure
ezr cluster FILE        kmeans++ + kmeans; one row per cluster
ezr search sa FILE      Simulated annealing
ezr search ls FILE      Local search
ezr search de FILE      Differential evolution
ezr acquire FILE        Active learning; print best labeled rows
ezr textmine FILE       CNB text classification
ezr stats               Demo of same/bestRanks/confused

Tests (assertions over real data files):

ezr test_core
ezr test_tree
ezr test_cluster
ezr test_search
ezr test_acquire
ezr test_classify
ezr test_textmine
ezr test_stats
ezr test_all            Run every test, report pass/fail count

OPTIONS

Flags update the global config namespace, `the`. Use --key=value. Nested keys use dots (e.g. --learn.leaf=3).

Learning & Trees

--learn.leaf=3      Minimum examples per leaf
--learn.budget=50   Number of rows to evaluate
--learn.check=5     Number of guesses to check
--learn.start=4     Initial number of labels

Distance & Bayes

--p=2               Distance metric (1=Manhattan, 2=Euclidean)
--bayes.m=2         m-estimate for Naive Bayes
--bayes.k=1         k-estimate (Laplace smoothing)
--few=128           Max unlabelled rows in active learning
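To make the roles of --bayes.m and --bayes.k concrete, here is one common textbook formulation of the two smoothing estimates. Whether ezr.py uses exactly these formulas is an assumption; the function names `smoothed_prior` and `smoothed_likelihood` are hypothetical.

```python
def smoothed_prior(n_class, n_all, n_classes, k=1):
    """k-estimate (Laplace-style) prior for one class.

    n_class:   rows seen so far in this class
    n_all:     rows seen so far in all classes
    n_classes: number of distinct classes
    """
    return (n_class + k) / (n_all + k * n_classes)

def smoothed_likelihood(count, n_class, prior, m=2):
    """m-estimate of P(value | class) for a symbolic column.

    count: times this value appeared in rows of this class
    With m > 0, an unseen value (count == 0) still gets a nonzero likelihood.
    """
    return (count + m * prior) / (n_class + m)

if __name__ == "__main__":
    p = smoothed_prior(n_class=3, n_all=10, n_classes=2, k=1)
    print(p, smoothed_likelihood(count=0, n_class=3, prior=p, m=2))
```

Larger m and k pull estimates toward the prior, which matters most when only a handful of rows have been labelled.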

Statistics

--stats.cliffs=0.195  Cliff's Delta threshold
--stats.conf=1.36     KS test confidence coefficient
--stats.eps=0.35      Margin of error multiplier
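For readers unfamiliar with the first of these thresholds, the sketch below shows the standard Cliff's Delta effect size and how a cutoff such as 0.195 could be applied to it. The helper names are hypothetical, and how ezr.py combines this test with the KS test and the eps margin is not shown here.

```python
def cliffs_delta(xs, ys):
    """Cliff's Delta: (#pairs with x > y minus #pairs with x < y) / #pairs."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))

def negligible_effect(xs, ys, threshold=0.195):
    """True when the effect size is at or below the threshold."""
    return abs(cliffs_delta(xs, ys)) <= threshold

if __name__ == "__main__":
    print(cliffs_delta([1, 2, 3], [4, 5, 6]))       # fully separated samples
    print(negligible_effect([1, 2, 3], [1, 2, 3]))  # identical samples
```

This all-pairs version is quadratic in the sample sizes, which is acceptable for the small samples that a low labelling budget produces.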

Textmine

--textmine.top=100    Top TF-IDF features kept
--textmine.yes=20     Positive warm-start samples
--textmine.no=20      Negative warm-start samples
--textmine.valid=20   Repeats for stats testing

Display

--seed=1            Random number seed
--show.show=30      Tree display width
--show.decimals=2   Decimal places for floats

Flags and commands can be interleaved. Each flag applies to all subsequent commands in the same invocation:

ezr --seed=42 --learn.budget=30 acquire auto93.csv
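The dotted --key=value convention can be sketched as follows. This is a minimal illustration, not the parser in cli.py: the helpers `coerce`, `set_flag`, and `parse`, and the use of `SimpleNamespace` for the config, are all assumptions.

```python
from types import SimpleNamespace

def coerce(text):
    """Turn a flag value into int or float when possible, else keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text

def set_flag(config, flag):
    """Apply one '--a.b=val' flag to a nested namespace."""
    key, value = flag.lstrip("-").split("=", 1)
    *path, last = key.split(".")
    node = config
    for part in path:                      # walk (or create) nested namespaces
        if not hasattr(node, part):
            setattr(node, part, SimpleNamespace())
        node = getattr(node, part)
    setattr(node, last, coerce(value))

def parse(argv, config):
    """Apply flags in order; return the remaining words (command and args)."""
    rest = []
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            set_flag(config, arg)
        else:
            rest.append(arg)
    return rest

if __name__ == "__main__":
    cfg = SimpleNamespace(seed=1, learn=SimpleNamespace(budget=50))
    words = parse(["--seed=42", "--learn.budget=30", "acquire", "auto93.csv"], cfg)
    print(cfg.seed, cfg.learn.budget, words)
```

Processing arguments strictly left to right is what lets a flag affect only the commands that come after it.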

LIBRARY USAGE

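from ezr import *

d = Data(csv("auto93.csv"))   # load the CSV into a Data object
win = wins(d)                 # wins(d) returns a function that scores a row
t = treeGrow(d, d.rows)       # grow a tree from all rows
treeShow(t)                   # print the tree

# Rank rows by disty(d, r), smallest first, and show the top five
for r in sorted(d.rows, key=lambda r: disty(d, r))[:5]:
    print(win(r), r)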

Sample tree output. D is the distance to heaven, i.e. to the ideal goal values (lower is better); N is the number of examples in the branch; Goals shows the branch centroid over the goal columns:

$ ezr tree ~/gits/moot/optimize/misc/auto93.csv
                               D       N     Goals
                               ====  =====   =====
                              ,0.66 ,( 50), {Acc+=15.51, Lbs-=2888.64, Mpg+=24.60}
Clndrs <= 5                   ,0.61 ,( 26), {Acc+=16.43, Lbs-=2204.46, Mpg+=30.38}
|   Volume <= 98              ,0.59 ,( 14), {Acc+=17.15, Lbs-=2024.64, Mpg+=33.57}
|   |   Volume <= 91          ,0.59 ,(  9), {Acc+=17.09, Lbs-=1927.67, Mpg+=35.56}
|   |   |   origin != 3       ,0.58 ,(  4), {Acc+=17.35, Lbs-=1908.00, Mpg+=37.50}
|   |   |   origin == 3       ,0.59 ,(  5), {Acc+=16.88, Lbs-=1943.40, Mpg+=34.00}
|   |   Volume > 91           ,0.60 ,(  5), {Acc+=17.26, Lbs-=2199.20, Mpg+=30.00}
|   Volume > 98               ,0.64 ,( 12), {Acc+=15.58, Lbs-=2414.25, Mpg+=26.67}
Clndrs > 5                    ,0.72 ,( 24), {Acc+=14.52, Lbs-=3629.83, Mpg+=18.33}
|   origin != 1               ,0.63 ,(  3), {Acc+=14.93, Lbs-=3000.00, Mpg+=26.67}
|   origin == 1               ,0.73 ,( 21), {Acc+=14.46, Lbs-=3719.81, Mpg+=17.14}
...
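The D column can be understood with a small sketch of "distance to heaven". This is an assumed formulation, not the code in ezr.py: each goal is normalized to 0..1, heaven is 1 for "+" goals and 0 for "-" goals, and the gaps are combined with a Minkowski-style mean using exponent p. The function name `dist_to_heaven` and the lo/hi ranges in the example are made up for illustration.

```python
def dist_to_heaven(row, goals, p=2):
    """Distance from a row's goal values to the ideal point.

    row:   dict of column name -> numeric value
    goals: dict of goal column name -> (lo, hi) range seen in the data
    """
    gaps = []
    for name, (lo, hi) in goals.items():
        norm = (row[name] - lo) / (hi - lo + 1e-32)    # scale to 0..1
        heaven = 1.0 if name.endswith("+") else 0.0    # maximize vs minimize
        gaps.append(abs(norm - heaven) ** p)
    return (sum(gaps) / len(gaps)) ** (1 / p)

if __name__ == "__main__":
    # Hypothetical ranges, chosen only to make the arithmetic easy to follow
    goals = {"Mpg+": (10, 50), "Lbs-": (1500, 5500)}
    print(dist_to_heaven({"Mpg+": 50, "Lbs-": 1500}, goals))   # ideal row
    print(dist_to_heaven({"Mpg+": 30, "Lbs-": 3500}, goals))   # middling row
```

Collapsing several goals into one number in 0..1 is what lets a single regression tree rank rows on all objectives at once.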

Key exports (all from ezr.py):

Data: Data, Num, Sym, Col, Cols, adds, add, sub, clone, mid, spread, mode, entropy, norm
Distance: distx, disty, nearest, minkowski, aha, wins
Bayes: like, likes
Comparison: pick, picks, extrapolate
Format / IO: csv, o, table, nest, thing, the
Stats: same, bestRanks, confused
Tree: Tree, treeGrow, treeCuts, treeSplit, treeLeaf, treeNodes, treeShow, treePlan
Cluster: kmeans, kpp, half, rhalf, neighbors
Classify: classify
Search: oneplus1, sa, ls, de, oracleNearest, last
Acquire: acquire, warm_start, rebalance, acquireWithBayes, acquireWithCentroid
Textmine: tmPrepare, tmTokenize, tmNostop, tmStem, tmTfidf, tmData, cnb, cnbLike, cnbLikes, tmRandom, tmActive

FILES

ezr/
  ezr.py          Library (all algorithms, section-banner organized)
  cli.py          Dispatcher + eg_* demos + eg_test_* tests
  pyproject.toml  Package config (ezr binary, version, deps)
  README.md       This file
  CHANGELOG.md    Release notes
  LICENSE.md      MIT
  resources/      Text-mining stop-words + suffix lists
  etc/            Build helpers, docs scaffolding (non-runtime)

AUTHOR

Tim Menzies <timm@ieee.org>, 2026. MIT License.
