Vendor demos under demos/ and link from README for landing-page visibility
@@ -0,0 +1,128 @@
# sem_cython12 - sample projects

Three short, runnable Python projects that demonstrate the `sem_cython12`
library on small but realistic problems. Each demo is a single file,
self-contained, and produces a clear printable result.

The demos use **only** `sem_cython12.wrapper`, `numpy`, and (for the
Iris and anomaly demos) `scikit-learn`.

## What each demo shows

| File | Domain | "Wow" |
|---|---|---|
| [`01_iris_boundary.py`](./01_iris_boundary.py) | The 1936 Iris dataset | Rediscovers the famous versicolor/virginica boundary specimens **without training a classifier** and without setting any threshold. |
| [`02_anomaly_detection.py`](./02_anomaly_detection.py) | Synthetic 5-D anomalies | Detects 10/10 injected anomalies with **a single function call** and matches sklearn's IsolationForest on ROC AUC (both 1.0000 in the sample run). |
| [`03_multicriteria_selection.py`](./03_multicriteria_selection.py) | Multi-criteria candidate ranking | Identifies the **hidden all-rounders** that naive sum-of-scores ranking misses entirely. |

## Install

```bash
# Get the library (private repo)
git clone https://git.sevana.biz/vvs/sem_cython12.git ../sem_cython12
export PYTHONPATH="$(pwd)/../sem_cython12:$PYTHONPATH"

# Demo dependencies
pip install -r requirements.txt
```

The pre-built Linux x86_64 / CPython 3.12 binary ships with the
library; no compilation step is required.
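
Because the shipped binary targets one platform, it can help to confirm the interpreter matches before running the demos. The following is a minimal sketch using only the standard library. The helper name `check_environment` is hypothetical and not part of `sem_cython12`; it only checks that the package can be located on the import path, not that the binary loads.

```python
import importlib.util
import platform
import sys


def check_environment(package="sem_cython12"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # The README states the pre-built binary targets CPython 3.12.
    if platform.python_implementation() != "CPython" or sys.version_info[:2] != (3, 12):
        problems.append(
            f"expected CPython 3.12, found {platform.python_implementation()} "
            f"{sys.version_info.major}.{sys.version_info.minor}"
        )
    # ... on Linux x86_64.
    if platform.system() != "Linux" or platform.machine() != "x86_64":
        problems.append(
            f"expected Linux x86_64, found {platform.system()} {platform.machine()}"
        )
    # find_spec returns None when a top-level package is not on the import path.
    if importlib.util.find_spec(package) is None:
        problems.append(f"package '{package}' not found; check PYTHONPATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result does not guarantee the demos will run, but a non-empty one usually explains an `ImportError` faster than the traceback does.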

## Run

```bash
python 01_iris_boundary.py
python 02_anomaly_detection.py
python 03_multicriteria_selection.py
```

Each demo finishes in well under a second on a laptop.

## What you'll see

### 01_iris_boundary.py

```
Auto-derived kernel scale lam = 3.4762

Top 10 most ambiguous specimens (highest cross-species score):

rank  idx  species     sim->setosa  sim->versic  sim->virgin  cross
   1  138  virginica        0.2330       0.9096       1.0000  0.9096
   2   70  versicolor       0.2396       1.0000       0.9096  0.9096
   3  127  virginica        0.2222       0.8806       1.0000  0.8806
   4   83  versicolor       0.2084       1.0000       0.8689  0.8689
   5  133  virginica        0.2062       0.8689       1.0000  0.8689
...

Top 10 distribution by species:
setosa : 0 of 10
versicolor : 3 of 10
virginica : 7 of 10

*** Confirmed: zero setosa specimens; the top-10 boundary cases ***
*** all come from the famous versicolor/virginica overlap zone. ***
```
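
In the rows printed above, the `cross` column equals the highest similarity to any species other than the specimen's own. The sketch below reproduces that ranking in plain Python from the five printed rows. It is a reading of the sample output, not the demo's actual code, and the function name `cross_species_score` is hypothetical.

```python
SPECIES = ("setosa", "versicolor", "virginica")

# (idx, own species, similarities to setosa / versicolor / virginica),
# copied from the five rows of the sample output.
ROWS = [
    (138, "virginica", (0.2330, 0.9096, 1.0000)),
    (70, "versicolor", (0.2396, 1.0000, 0.9096)),
    (127, "virginica", (0.2222, 0.8806, 1.0000)),
    (83, "versicolor", (0.2084, 1.0000, 0.8689)),
    (133, "virginica", (0.2062, 0.8689, 1.0000)),
]


def cross_species_score(own, sims):
    """Highest similarity to any species other than the specimen's own."""
    return max(s for name, s in zip(SPECIES, sims) if name != own)


# Rank specimens from most to least ambiguous; ties keep their input order.
ranked = sorted(ROWS, key=lambda r: cross_species_score(r[1], r[2]), reverse=True)
for rank, (idx, own, sims) in enumerate(ranked, start=1):
    print(rank, idx, own, f"{cross_species_score(own, sims):.4f}")
```

A setosa specimen would need a high similarity to versicolor or virginica to rank here, which is consistent with none appearing in the top 10.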

### 02_anomaly_detection.py

```
SEM (sem_cython12 - one batch_max_similarity call)
Top-10 retrieved as anomalous: precision = 10/10
ROC AUC = 1.0000

Baseline: sklearn IsolationForest (default settings)
Top-10 retrieved as anomalous: precision = 10/10
ROC AUC = 1.0000

SEM matches IsolationForest within noise (+0.0000 AUC),
with one function call and zero tuning.
```
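
For readers unfamiliar with the two metrics reported above, here is a dependency-free sketch of top-k precision and ROC AUC. It assumes a higher score means "more anomalous" and uses made-up toy data. It is not the demo's code, which may compute these values differently, for example with scikit-learn.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scoring points that are true anomalies."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in top) / k


def roc_auc(scores, labels):
    """Probability that a random anomaly outscores a random normal point.

    Ties count as half a win (the pairwise form of ROC AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration: 1 = anomaly, 0 = normal.
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]

print("precision@2 =", precision_at_k(scores, labels, k=2))
print("ROC AUC     =", roc_auc(scores, labels))
```

An AUC of 1.0000, as in the sample run, means every injected anomaly scored above every normal point.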

### 03_multicriteria_selection.py

```
Best-tradeoff frontier size : 35
Cross-criterion winners (NRW) : 31
Hidden all-rounders we injected : 5 (indices 0-4)

NRW recovered hidden all-rounders : 5/5 [0, 1, 2, 3, 4]
Naive top-10 found hidden all-rounders: 3/5 [1, 2, 3]

*** SEM's NRW filter recovered 5/5 hidden all-rounders. ***
*** Naive sum-of-scores top-10 found only 3/5. ***
*** SEM surfaces 2 candidates the naive ranking misses ***
*** because they don't peak on any single criterion. ***
```
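
To illustrate why a sum-of-scores ranking can miss balanced candidates, the sketch below compares a naive top-k with a generic non-dominated (Pareto) filter on a tiny made-up matrix. This is **not** the library's NRW operator, whose definition is not given here. It is only one common reading of a "best-tradeoff frontier", and all names in it are hypothetical.

```python
def naive_top_k(matrix, k):
    """Indices of the k candidates with the largest sum of scores."""
    sums = [sum(row) for row in matrix]
    return sorted(range(len(matrix)), key=lambda i: sums[i], reverse=True)[:k]


def dominates(a, b):
    """True if a is at least as good as b on every criterion and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(matrix):
    """Indices of candidates that no other candidate dominates."""
    return [
        i for i, row in enumerate(matrix)
        if not any(dominates(other, row) for other in matrix)
    ]


# Made-up scores on three criteria (higher is better).
candidates = [
    (10, 1, 1),  # 0: specialist, sum 12
    (1, 10, 1),  # 1: specialist, sum 12
    (1, 1, 10),  # 2: specialist, sum 12
    (4, 4, 3),   # 3: balanced, sum 11, beaten by nobody on all criteria
    (3, 3, 3),   # 4: dominated by candidate 3
]

print("naive top-3    :", naive_top_k(candidates, 3))
print("Pareto frontier:", pareto_frontier(candidates))
```

Candidate 3 falls outside the naive top-3 because each specialist has a larger total, yet it stays on the frontier because no other candidate is at least as good on all three criteria.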

## What to try next

- Replace the synthetic data in `02_*` with your own observations and
  see what gets flagged.
- Replace the synthetic candidate matrix in `03_*` with your
  real-world multi-criteria evaluation (job applicants, vendor
  proposals, product features, drug screens).
- Extend `01_*` to your own classification problems: any time you
  have multiple classes with overlapping members, the NRW operator
  surfaces the structurally informative boundary cases.

The library has more capabilities than these three demos exercise.
See the `sem_cython12.wrapper` API for the full operator set
(pairwise distances, multi-class similarity matrix, incremental
aggregation, etc.).

## Licence

The demos and the underlying `sem_cython12` library are licensed
under the terms in the [LICENSE](./LICENSE) file:

- Research and non-commercial use: free under the conditions
  stated in the licence.
- Commercial use: requires a separate written commercial licence.
  Contact `sales@sevana.biz`.
- The Software is provided strictly "AS IS", without warranty of
  any kind.

Please read the LICENSE file in full before using the demos or the
underlying library.