From a98c55cea79b5a4ddfdef3b7c6ccd4bd216e5f26 Mon Sep 17 00:00:00 2001
From: Dmytro Bogovych
Date: Sun, 10 May 2026 10:42:07 +0300
Subject: [PATCH] README: replace 'What is this for?' prose with pointers to docs/

Co-Authored-By: Claude Opus 4.7
---
 README.md | 50 +++++++-------------------------------------------
 1 file changed, 7 insertions(+), 43 deletions(-)

diff --git a/README.md b/README.md
index bdeb839..acd75a8 100644
--- a/README.md
+++ b/README.md
@@ -6,41 +6,14 @@ install time.
 
 ## What is this for?
 
-`sem_cython12` is a small, focused toolbox of fast C-level routines
-exposed through a thin numpy wrapper. It is not a general-purpose
-numerical library; it accelerates three specific jobs that are
-awkward or slow to do in pure numpy once `N` reaches the thousands:
+For an introduction to SEM (Similarity Energy Model) and how
+`sem_cython12` fits in, see:
 
-1. **Similarity / distance over batches of vectors.** Full
-   pairwise distance matrices, nearest-neighbour distances, and
-   kernel-based `[0, 1]` similarity scores of a query set against
-   one or many reference sets. Useful for nearest-neighbour
-   search, kernel-density-style scoring, and "how close is each
-   query to this concept?" lookups.
-2. **Multi-objective ("best-tradeoff") filtering of score matrices.**
-   Given a matrix of `N` candidates × `k` criteria, select the
-   rows on the Pareto frontier, isolate rows that only spike on a
-   single criterion, and recover the rows that contribute
-   meaningfully across several criteria - candidates a naive
-   sum-of-scores ranker would miss.
-3. **An incremental aggregation primitive** for streaming
-   clustering / frontier-expansion algorithms: a fused bulk update
-   that, given `F` running summaries (centre + radius) and `A`
-   new contributions, produces all `F·A` updated summaries in one
-   parallel pass.
-
-The kernels release the GIL, scale near-linearly to ~8 OpenMP
-threads on commodity x86, and operate on shared-memory numpy
-arrays with no inter-process serialisation. The Python wrapper
-handles contiguous-float64 casting and degrades loudly (via
-`available()` / `backend()` plus `RuntimeError`) when the compiled
-extension cannot load on the host - there is no slow pure-Python
-fallback path.
-
-The [`demos/`](./demos/) directory contains three runnable
-end-to-end examples (Iris boundary discovery, parameter-free
-anomaly detection, multi-criteria candidate selection) that
-exercise these three jobs against well-known baselines.
+- [`docs/SEM_Overview.md`](./docs/SEM_Overview.md) — non-internal
+  introduction to SEM, what it does, and how this library fits in.
+- [`docs/SEM_Mathematical_Apparatus.md`](./docs/SEM_Mathematical_Apparatus.md)
+  — capabilities-level description of the operators and engines
+  exposed by the library.
 
 ## Contents
 
@@ -147,15 +120,6 @@ internally cast to contiguous `float64`. Outputs are numpy arrays.
 
 See the wrapper docstrings for exact semantics of each function.
 
-## Documentation
-
-- [`docs/SEM_Overview.md`](./docs/SEM_Overview.md) — non-internal
-  introduction to SEM (Similarity Energy Model), what it does, and
-  how the `sem_cython12` library fits in.
-- [`docs/SEM_Mathematical_Apparatus.md`](./docs/SEM_Mathematical_Apparatus.md)
-  — capabilities-level description of the operators and engines
-  exposed by the library.
-
 ## Demos
 
 Three runnable demos live in [`demos/`](./demos/):