diff --git a/README.md b/README.md
index b3db6fe..abeab93 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,44 @@ OpenMP-parallel numerical kernel library for Python. Pre-built Linux and Windows binaries included; no compilation required at install time.
+## What is this for?
+
+`sem_cython12` is a small, focused toolbox of fast C-level routines
+exposed through a thin numpy wrapper. It is not a general-purpose
+numerical library; it accelerates three specific jobs that are
+awkward or slow to do in pure numpy once `N` reaches the thousands:
+
+1. **Similarity / distance over batches of vectors.** Full
+   pairwise distance matrices, nearest-neighbour distances, and
+   kernel-based `[0, 1]` similarity scores of a query set against
+   one or many reference sets. Useful for nearest-neighbour
+   search, kernel-density-style scoring, and "how close is each
+   query to this concept?" lookups.
+2. **Multi-objective ("best-tradeoff") filtering of score matrices.**
+   Given a matrix of `N` candidates × `k` criteria, select the
+   rows on the Pareto frontier, isolate rows that only spike on a
+   single criterion, and recover the rows that contribute
+   meaningfully across several criteria - candidates a naive
+   sum-of-scores ranker would miss.
+3. **An incremental aggregation primitive** for streaming
+   clustering / frontier-expansion algorithms: a fused bulk update
+   that, given `F` running summaries (centre + radius) and `A`
+   new contributions, produces all `F·A` updated summaries in one
+   parallel pass.
+
+The kernels release the GIL, scale near-linearly to ~8 OpenMP
+threads on commodity x86, and operate on shared-memory numpy
+arrays with no inter-process serialisation. The Python wrapper
+handles contiguous-float64 casting and degrades loudly (via
+`available()` / `backend()` plus `RuntimeError`) when the compiled
+extension cannot load on the host - there is no slow pure-Python
+fallback path.
+
+The [`demos/`](./demos/) directory contains three runnable
+end-to-end examples (Iris boundary discovery, parameter-free
+anomaly detection, multi-criteria candidate selection) that
+exercise these three jobs against well-known baselines.
+
 ## Contents
 
 - `sem_cython12/sem_core12.cpython-312-x86_64-linux-gnu.so` -