Add SEM_Overview.md and SEM_Mathematical_Apparatus.md under docs/ and link from README

2026-05-09 19:24:43 +01:00
parent 80f99d1d15
commit fa87dbb473
3 changed files with 550 additions and 0 deletions
@@ -147,6 +147,15 @@ internally cast to contiguous `float64`.  Outputs are numpy arrays.

 See the wrapper docstrings for exact semantics of each function.

+## Documentation
+
+- [`docs/SEM_Overview.md`](./docs/SEM_Overview.md) — non-internal
+  introduction to SEM (Similarity Energy Model), what it does, and
+  how the `sem_cython12` library fits in.
+- [`docs/SEM_Mathematical_Apparatus.md`](./docs/SEM_Mathematical_Apparatus.md)
+  — capabilities-level description of the operators and engines
+  exposed by the library.
+
 ## Demos

 Three runnable demos live in [`demos/`](./demos/):
@@ -0,0 +1,270 @@
+# SEM — Mathematical Apparatus (Capability Catalog)
+
+*A non-internal catalog of the operators SEM offers, what each is for,
+and which entry points of the `sem_cython12` library back them.*
+
+This document describes WHAT the apparatus does and WHERE to use it.
+It does not describe HOW any operator works internally — algorithms,
+formulas, lemmas and proofs are intentionally not reproduced here.
+
+---
+
+## Conventions
+
+- "Item" / "world" / "observation": one row of input data.  Items live
+  in some payload space (real numbers, vectors, matrices, sampled
+  functions, sampled manifolds, distributions, complex amplitudes,
+  time-series windows, recursive concept trees) — the apparatus
+  treats them uniformly via a small set of structural operators.
+- "Concept": a subset of items that share structural meaning.  The
+  apparatus can either be told the concepts (labelled mode) or
+  discover them from data (unsupervised mode).
+- "Witness": an item whose structural position carries information
+  beyond merely belonging to one concept.
+- "Verdict": the system's qualified output for a new observation:
+  one of `confident`, `gap`, `incoherent` (see §4.6).
+
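+The apparatus as a whole is parameter-free and threshold-free: there are
+no fitting parameters, no numeric cut-offs, no fidelity knobs.  The one
+scale argument the primitives accept, `lam`, is derived from the data by
+the intrinsic-scale operator (§4.1) rather than tuned by hand.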
+
+---
+
+## 1.  Structural similarity primitives
+
+These are the lowest-level building blocks.  Each is exposed directly
+in `sem_cython12.wrapper`.
+
+### 1.1  Pairwise similarity
+
+| | |
+|---|---|
+| Purpose | Score how close a query item is to the most similar member of a reference set. |
+| Output | A score in `[0, 1]` per query (1 = the query coincides with a reference member; values near 0 = far from every member). |
+| Applications | Membership tests, retrieval, anomaly detection, k-nearest-neighbour pre-filtering, similarity-weighted aggregation. |
+| Cython entry point | `batch_max_similarity(X_query, X_members, lam)` |
+
+### 1.2  Multi-class similarity matrix
+
+| | |
+|---|---|
+| Purpose | The same operation applied across `K` independent reference sets in one call, returning a `(Q, K)` score matrix. |
+| Applications | Multi-class classification scoring, multi-criterion membership, class-confusion matrices, support-vector inputs to higher-level filters. |
+| Cython entry point | `concept_support_matrix(X_query, member_mats, lam)` |
+
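The documented contract of §1.1 and §1.2 (one score in `[0, 1]` per query, equal to 1 on the reference set, decaying with separation at scale `lam`) can be illustrated in plain numpy. This is a minimal sketch, not the library's code: the kernel `exp(-d / lam)` over Euclidean distance `d` is an assumption made for illustration, since this catalog does not state the actual kernel, and the `*_ref` function names are hypothetical.

```python
import numpy as np

def batch_max_similarity_ref(X_query, X_members, lam):
    """Score each query row against its closest member row.

    ASSUMPTION: similarity = exp(-d / lam) with Euclidean d. The real
    kernel used by sem_cython12 is not documented in this catalog.
    """
    Xq = np.ascontiguousarray(X_query, dtype=np.float64)
    Xm = np.ascontiguousarray(X_members, dtype=np.float64)
    # (Q, M) Euclidean distances between every query and every member.
    d = np.linalg.norm(Xq[:, None, :] - Xm[None, :, :], axis=2)
    # The most similar member is the nearest one.
    return np.exp(-d.min(axis=1) / lam)

def concept_support_matrix_ref(X_query, member_mats, lam):
    """Stack one score column per reference set -> shape (Q, K)."""
    cols = [batch_max_similarity_ref(X_query, M, lam) for M in member_mats]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
set_a = rng.standard_normal((5, 3)) + 3.0
set_b = rng.standard_normal((4, 3)) - 3.0
# First query is a member of set_a; second sits between the two sets.
queries = np.vstack([set_a[:1], np.zeros((1, 3))])
S = concept_support_matrix_ref(queries, [set_a, set_b], lam=1.0)
print(S.shape)   # (2, 2)
```

Under this assumed kernel a query that coincides with a reference member scores exactly 1, which matches the documented output range; any other monotone-decreasing kernel bounded by 1 would satisfy the same contract.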
+### 1.3  Pairwise distance matrix
+
+| | |
+|---|---|
+| Purpose | Symmetric `(N, N)` distance matrix between rows of `X`. |
+| Applications | Graph construction, clustering, scale estimation, downstream filtering and ranking. |
+| Cython entry point | `pairwise_distances(X)` |
+
+### 1.4  Nearest-neighbour distance vector
+
+| | |
+|---|---|
+| Purpose | For each row, the minimum positive distance to any other row.  Rows with no positive-distance neighbour receive `inf`. |
+| Applications | Local-density estimation, intrinsic-scale derivation, duplicate detection, outlier identification. |
+| Cython entry point | `nn_distances(X)` |
+
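The semantics stated in §1.3 and §1.4 can be reproduced with a short numpy reference, which is handy for cross-checking results on small inputs. The sketch assumes the Euclidean metric (the catalog only says "distance"), and the `*_ref` names are hypothetical, not part of `sem_cython12.wrapper`.

```python
import numpy as np

def pairwise_distances_ref(X):
    """Symmetric (N, N) Euclidean distance matrix between rows of X."""
    X = np.ascontiguousarray(X, dtype=np.float64)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def nn_distances_ref(X):
    """Per-row minimum positive distance; inf where none exists."""
    D = pairwise_distances_ref(X)
    # Mask out zero distances (self-distance and exact duplicates).
    D_pos = np.where(D > 0.0, D, np.inf)
    return D_pos.min(axis=1)

# Rows 0 and 2 are duplicates, so their nearest *positive* distance
# is to row 1, not to each other.
X = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
print(nn_distances_ref(X))                  # [5. 5. 5.]
print(nn_distances_ref(np.zeros((2, 2))))   # [inf inf]
```

The broadcasted difference uses O(N² · D) memory, so this form is only suitable as a small-input reference.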
+---
+
+## 2.  Multi-criterion filtering primitives
+
+Given a real-valued matrix `S` of shape `(N, k)` (rows are items,
+columns are independent criteria — each in maximisation orientation),
+these primitives identify structurally informative subsets of rows.
+
+### 2.1  Best-tradeoff filter
+
+| | |
+|---|---|
+| Purpose | Mask the rows that survive a multi-objective best-tradeoff filter (i.e. items that are not strictly worse than another item on every criterion). |
+| Applications | Multi-objective optimisation frontier, concept-membership trade-off, candidate winnowing before further analysis. |
+| Cython entry point | `pareto_core_mask(S)` |
+
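The filter described in §2.1 can be written as a brute-force numpy reference. This is a sketch that follows the wording above literally (a row is dropped only when some other row is strictly greater on every criterion); it is not the library's implementation, and `pareto_core_mask_ref` is a hypothetical name.

```python
import numpy as np

def pareto_core_mask_ref(S):
    """True for rows that no other row beats on every criterion."""
    S = np.asarray(S, dtype=np.float64)
    # beats[i, j] is True when row j is strictly greater than row i
    # on all k columns (maximisation orientation).
    beats = (S[None, :, :] > S[:, None, :]).all(axis=2)
    return ~beats.any(axis=1)

S = np.array([
    [1.0, 0.1],   # best on criterion 0
    [0.1, 1.0],   # best on criterion 1
    [0.6, 0.6],   # balanced trade-off
    [0.5, 0.5],   # strictly worse than the row above on both
])
print(pareto_core_mask_ref(S))   # [ True  True  True False]
```

Some texts define dominance as "at least as good everywhere and better somewhere"; under that convention exact ties are handled differently, so it is worth confirming against the wrapper docstring which convention `pareto_core_mask` uses.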
+### 2.2  One-sided peak flagging
+
+| | |
+|---|---|
+| Purpose | Flag row/column pairs where the row is the column-wise winner but contributes nothing on the remaining columns, i.e. items that "peak" on a single criterion alone. |
+| Applications | Removing items that are only locally informative; finding cross-criterion contributors; bridge identification. |
+| Cython entry point | `one_sided_mask(S)` |
+
+### 2.3  Non-redundant witness identification
+
+| | |
+|---|---|
+| Purpose | The subset of rows that survive both 2.1 and 2.2 — items that contribute meaningfully across multiple criteria, not just on one. |
+| Applications | Bridge-witness selection between concept regions, structurally informative subset extraction, downstream gap analysis. |
+| Cython entry point | `non_redundant_witnesses(S)` |
+
+---
+
+## 3.  Incremental aggregation primitive
+
+### 3.1  Fused centroid + radius update
+
+| | |
+|---|---|
+| Purpose | One-pass bulk update for an incremental aggregation step.  Given `F` reference items, each summarised by a centre vector and a radius (the dispersion of its `cur_arity` underlying points), and `A` candidate new contributions, produce all `F * A` updated (centre, radius) pairs obtained by appending one candidate to one reference item. |
+| Applications | Streaming centroid / radius maintenance, candidate-frontier expansion in multi-stage selection, online aggregation pipelines. |
+| Cython entry point | `extend_frontier_kernel(cur_centers, cur_radii, new_emb, cur_arity)` |
+
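The shape of the update in §3.1 can be sketched with a running-mean and running-dispersion identity. Two things here are assumptions rather than documented behaviour: the radius is taken to be the RMS distance of the underlying points to their centre, and `cur_arity` is taken to be a single integer shared by all `F` reference items. The name `extend_frontier_ref` is hypothetical.

```python
import numpy as np

def extend_frontier_ref(cur_centers, cur_radii, new_emb, cur_arity):
    """All F*A (centre, radius) pairs after appending one candidate.

    ASSUMPTION: radius = RMS distance of the underlying points to
    their centre. The library's own radius definition may differ.
    """
    C = np.asarray(cur_centers, dtype=np.float64)   # (F, D)
    r = np.asarray(cur_radii, dtype=np.float64)     # (F,)
    E = np.asarray(new_emb, dtype=np.float64)       # (A, D)
    n = float(cur_arity)

    delta = E[None, :, :] - C[:, None, :]           # (F, A, D)
    # Running-mean update: the centre moves 1/(n+1) of the way.
    new_centers = C[:, None, :] + delta / (n + 1.0)
    # Welford-style update of the summed squared deviations.
    ss = n * r[:, None] ** 2 + (n / (n + 1.0)) * (delta ** 2).sum(axis=2)
    new_radii = np.sqrt(ss / (n + 1.0))             # (F, A)
    return new_centers, new_radii

# Check one pair against a direct recomputation from raw points.
rng = np.random.default_rng(1)
pts = rng.standard_normal((3, 2))                   # 3 underlying points
cand = rng.standard_normal((4, 2))                  # 4 candidates
c0 = pts.mean(axis=0)
r0 = np.sqrt(((pts - c0) ** 2).sum(axis=1).mean())
centers, radii = extend_frontier_ref(c0[None, :], np.array([r0]), cand, 3)

full = np.vstack([pts, cand[2]])
c_direct = full.mean(axis=0)
r_direct = np.sqrt(((full - c_direct) ** 2).sum(axis=1).mean())
print(np.allclose(centers[0, 2], c_direct), np.isclose(radii[0, 2], r_direct))
```

The point of the fused form is that no raw points are needed: the old centre, old radius and arity are sufficient statistics for the update under the assumed radius definition.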
+---
+
+## 4.  Higher-level apparatus
+
+Built on the primitives in §1–§3.  These are the operators that
+distinguish SEM as a reasoning system rather than a computation
+library.  Their internal construction is not reproduced here; the
+"Cython entry points used" row of each table lists the public
+primitives the operator composes.
+
+### 4.1  Intrinsic scale
+
+| | |
+|---|---|
+| Purpose | Derive the kernel scale from the data's own structural geometry, so that no manual `lam` value is ever required. |
+| Applications | Any pipeline that wants the scale property to be a function of the data, not a tuning knob; cross-application portability. |
+| Cython entry points used | `nn_distances`, `pairwise_distances` |
+
+### 4.2  Concept discovery
+
+| | |
+|---|---|
+| Purpose | Group observations into structurally coherent regions without using labels, ML training, or numeric thresholds.  Returns the concepts the data itself supports. |
+| Applications | Unsupervised classification, regime identification, exploratory analysis, foundation for downstream operators. |
+| Cython entry points used | `pairwise_distances`, `nn_distances`, `pareto_core_mask` |
+
+### 4.3  Relational hypothesis generation
+
+| | |
+|---|---|
+| Purpose | Enumerate candidate structural relationships between concepts (pair-wise and higher-arity) and rank them by support. |
+| Applications | Discovering laws / regularities between groups, cross-concept analysis, scientific structure recovery. |
+| Cython entry points used | `concept_support_matrix`, `pareto_core_mask`, `extend_frontier_kernel` |
+
+### 4.4  Semantic gap detection
+
+| | |
+|---|---|
+| Purpose | Identify positions in structural space where the data should produce a witness bridging two or more concepts but does not. |
+| Applications | Detecting missing variables, hidden mediators, unobserved confounders; identifying where additional measurement would resolve ambiguity. |
+| Cython entry points used | `concept_support_matrix`, `non_redundant_witnesses` |
+
+### 4.5  Prototype construction
+
+| | |
+|---|---|
+| Purpose | Predict the structural features of an item that should exist between known concepts but has not yet been observed. |
+| Applications | Drug-candidate suggestion, missing-mediator prediction, "what if" scenario generation, hypothesis-driven data acquisition. |
+| Cython entry points used | `batch_max_similarity`, `concept_support_matrix` |
+
+### 4.6  Verdict-qualified inference
+
+| | |
+|---|---|
+| Purpose | Decide which concept best explains a new observation, returning one of three outcomes: `confident` (a single concept dominates), `gap` (multiple concepts are equally admissible), `incoherent` (no concept admits the observation consistently). |
+| Applications | Decision-support systems that must abstain when ambiguous, safety-critical classification, regime change detection, automated triage. |
+| Cython entry points used | `concept_support_matrix`, `pareto_core_mask`, `batch_max_similarity` |
+
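On the consuming side, the three outcomes of §4.6 map naturally onto an act-or-abstain policy. The sketch below shows only that caller-side handling; it assumes the verdict arrives as one of the three strings named above, and the `Verdict` enum, the `route` function and its return values are hypothetical illustrations, not part of any SEM package.

```python
from enum import Enum

class Verdict(str, Enum):
    CONFIDENT = "confident"
    GAP = "gap"
    INCOHERENT = "incoherent"

def route(verdict, concept=None):
    """Turn a verdict into a downstream action (hypothetical policy)."""
    v = Verdict(verdict)          # raises ValueError on unknown strings
    if v is Verdict.CONFIDENT:
        return ("act", concept)
    if v is Verdict.GAP:
        # Several concepts equally admissible: abstain, flag for review.
        return ("abstain", "ambiguous between concepts")
    # No concept admits the observation consistently.
    return ("abstain", "outside current coverage")

print(route("confident", concept="regime_A"))   # ('act', 'regime_A')
print(route("gap"))                             # ('abstain', 'ambiguous between concepts')
```

Parsing the string into an enum up front means a misspelled or unexpected verdict fails loudly instead of silently falling through to a default action.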
+### 4.7  Lifecycle / dominance verification
+
+| | |
+|---|---|
+| Purpose | When a real observation arrives, decide whether it confirms, displaces, or co-exists with a previously predicted prototype.  Maintains the prototype's status across its lifetime. |
+| Applications | Continuous-learning pipelines, theory revision under new evidence, audit-trail-preserving inference. |
+| Cython entry points used | `pareto_core_mask` |
+
+### 4.8  Hierarchical recursion
+
+| | |
+|---|---|
+| Purpose | Apply every operator above to recursive concept trees — concepts whose members are themselves concepts.  Operators bubble through the hierarchy and remain mathematically consistent at every level. |
+| Applications | Taxonomies, organisational hierarchies, multi-scale analysis (chemical → biological → organism, file → folder → project, etc.). |
+| Cython entry points used | the operators above, recursively |
+
+### 4.9  Streaming kNN graph maintenance
+
+| | |
+|---|---|
+| Purpose | Maintain an exact k-nearest-neighbour graph as items are added or removed one at a time, without rebuilding from scratch on each update. |
+| Applications | Online time-series ingest, sliding-window analytics, sensor-stream monitoring, real-time anomaly detection. |
+| Cython entry points used | `pairwise_distances`, `nn_distances` (on the contiguous buffer); `scipy.spatial.cKDTree` is used internally above 1000 items for exact queries (O(log N) on average in low dimensions); there is no fidelity knob. |
+
+### 4.10  Time-series streaming model
+
+| | |
+|---|---|
+| Purpose | A complete reasoning model over sliding windows of a stream: state extraction, transition modelling, intrinsic-scale maintenance, and verdict-qualified prediction on novel windows.  Optionally projects high-dimensional windows to lower dimensions when configured to do so. |
+| Applications | Multivariate time-series classification, regime detection, online anomaly identification, signal-quality forecasting. |
+| Cython entry points used | `nn_distances` (intrinsic scale), `concept_support_matrix` (verdict), the streaming-kNN apparatus from 4.9 |
+
+---
+
+## 5.  Composition properties
+
+The operators in §1–§4 compose along several axes:
+
+- **Across payload types**: the same operator works for scalars,
+  vectors, matrices, tensors, functions, manifolds, complex states,
+  distributions, time-series windows.  The caller supplies the
+  appropriate distance function or, equivalently, an embedding into
+  Euclidean space.
+- **Across hierarchy levels**: concepts can themselves be members of
+  parent concepts; operators recurse through the tree (§4.8).
+- **Under wrapping**: stochastic and temporal extensions can be
+  layered over any base payload type.  Triple compositions like
+  "hierarchy of stochastic time-series" are admissible and produce
+  consistent results at every level.
+
+---
+
+## 6.  What the apparatus does NOT offer
+
+Stated explicitly so users can plan around the limits:
+
+- No probability distributions over outcomes.  Verdicts are
+  structural, not Bayesian.
+- No reward / objective optimisation.  The apparatus does not learn
+  policies; it identifies structural relationships.
+- No tuning knobs that trade fidelity for speed.  Where some
+  alternatives expose `epsilon`, `top_k`, `temperature`, etc., the
+  apparatus uses data-derived structural boundaries instead.
+- No approximate-mode kNN (HNSW / IVF / LSH / FAISS lossy modes).
+  Every kNN-related operator returns exact results.
+
+---
+
+## 7.  Mapping summary
+
+| Apparatus operator | Cython entry point(s) |
+|---|---|
+| Pairwise similarity | `batch_max_similarity` |
+| Multi-class similarity | `concept_support_matrix` |
+| Pairwise distance | `pairwise_distances` |
+| Nearest-neighbour distance | `nn_distances` |
+| Best-tradeoff filter | `pareto_core_mask` |
+| One-sided peak flag | `one_sided_mask` |
+| Non-redundant witness | `non_redundant_witnesses` |
+| Fused centroid + radius update | `extend_frontier_kernel` |
+| Intrinsic scale | composed of `nn_distances`, `pairwise_distances` |
+| Concept discovery | composed of `pairwise_distances`, `nn_distances`, `pareto_core_mask` |
+| Relational hypothesis generation | composed of `concept_support_matrix`, `pareto_core_mask`, `extend_frontier_kernel` |
+| Semantic gap detection | composed of `concept_support_matrix`, `non_redundant_witnesses` |
+| Prototype construction | composed of `batch_max_similarity`, `concept_support_matrix` |
+| Verdict-qualified inference | composed of `concept_support_matrix`, `pareto_core_mask`, `batch_max_similarity` |
+| Lifecycle / dominance verification | composed of `pareto_core_mask` |
+| Hierarchical recursion | every operator above, recursively |
+| Streaming kNN graph | `pairwise_distances`, `nn_distances` |
+| Time-series streaming model | `nn_distances`, `concept_support_matrix`, streaming kNN |
+
+## 8.  Library availability
+
+The Cython entry points in the right column of §7 are all in
+`sem_cython12.wrapper`, distributed at
+[https://git.sevana.biz/vvs/sem_cython12](https://git.sevana.biz/vvs/sem_cython12).
+Higher-level apparatus (composed operators in §4) is built on those
+primitives and ships in the SEM foundation package, separate from
+this library.
@@ -0,0 +1,271 @@
+# SEM — An Overview of Structural Reasoning
+
+*A non-internal introduction to the SEM (Similarity Energy Model)
+reasoning system, its applications, and the `sem_cython12` library.*
+
+---
+
+## 1.  What SEM is
+
+SEM is a reasoning system for **discovering structure in observed
+data** and producing **decision-qualified predictions** about new
+observations.  Unlike conventional machine learning, SEM is not a
+parameterised model fitted to training data: its outputs are derived
+directly from the geometry of the observed world set.  Where ML asks
+"what is the most likely label?", SEM asks "what is the structural
+position of this observation relative to everything we have seen?"
+— and reports the answer as a verdict, not a probability.
+
+The system has been used as a discovery engine, an anomaly detector,
+a missing-mediator predictor, a regime-change identifier, and an
+explainable inference layer over neural-network embeddings.  Each
+application reuses the same small set of structural operators.
+
+## 2.  Properties that distinguish SEM
+
+- **Parameter-free.**  No learning rates, no regularisation
+  coefficients, no tuning knobs in the reasoning pipeline.  Every
+  scale or boundary the system consults is computed from the data
+  itself.
+- **Threshold-free.**  No `if score > 0.85` decisions.  Where
+  conventional pipelines impose a numeric cut-off, SEM uses
+  data-derived structural boundaries that adapt to the observed
+  geometry.
+- **Three-valued verdict.**  A prediction returns one of:
+  - **confident** — a single best-fitting concept dominates;
+  - **gap** — multiple concepts are equally admissible, signalling
+    that the query lies in a region the current theory has not
+    resolved;
+  - **incoherent** — no concept admits the query consistently;
+    further data is required.
+  This refusal-to-guess is the system's most useful safety property:
+  it never collapses uncertainty into a forced label.
+- **Detects what is missing.**  SEM identifies positions where
+  observed data should produce a structural witness but does not, and
+  predicts the features the missing entity should carry.  Conventional
+  ML cannot signal that a hidden mediator or unobserved variable is
+  required.
+- **Explainable by construction.**  Every prediction comes with a
+  decomposition of the supporting evidence, so a downstream system
+  (or human reviewer) can audit which structural relations argue for
+  a given verdict.
+- **Composable across data types.**  The same reasoning apparatus
+  applies to scalars, vectors, matrices, sampled functions, sampled
+  manifolds, complex (quantum) state vectors, distributions,
+  time-series windows, and recursive concept hierarchies.  The operators
+  see all of these through a common interface.
+
+## 3.  Where SEM has been applied
+
+| Domain | Capability used |
+|---|---|
+| Multivariate time series | Regime detection, forecast verdicts, anomaly identification |
+| Scientific law discovery | Recovering analytic relationships from raw measurements |
+| Drug / molecule screening | Structural similarity beyond fingerprints |
+| Network monitoring | Silent-failure detection in encrypted traffic |
+| Causal inference | Discovering missing variables from observational data |
+| Image / signal analysis | Structural feature extraction with explainability |
+| LLM explainability | Interpreting embedding-space behaviour |
+| Geopolitical forecasting | Producing confident / abstain forecasts on event data |
+| Trading & market structure | Regime-switch decisions with abstain semantics |
+
+In each case the value is the same: the system either gives a
+high-confidence answer or refuses to, and never delivers a confident
+wrong answer disguised as a probability.
+
+## 4.  How SEM differs from machine learning
+
+|  | Machine learning | SEM |
+|---|---|---|
+| Has training phase | yes | no |
+| Has hyper-parameters | yes | no |
+| Can detect missing entities | no | yes |
+| Refuses to predict | no (returns argmax) | yes (gap / incoherent verdict) |
+| Output | numeric / probabilistic | structural with verdict |
+| Explanation | post-hoc (SHAP, LIME, attention) | inherent in the inference |
+| Scale of usable data | requires many examples | works on small data, even single-digit examples |
+
+SEM and ML are not exclusive — SEM is sometimes layered on top of
+neural-network embeddings to provide an explainability and abstention
+layer, and ML can supply the embeddings SEM reasons over.
+
+## 5.  The `sem_cython12` library
+
+`sem_cython12` is the high-performance numerical kernel layer that
+backs SEM's reasoning operators.  It is delivered as a pre-compiled
+Linux shared object plus a thin Python wrapper; users do not compile
+anything at install time.
+
+The library exposes one module:
+
+- `sem_cython12.wrapper` — Python API over the compiled kernels.
+
+Inside the module, the public functions are grouped by purpose.
+
+### 5.1  Configuration
+
+| Function | Purpose |
+|---|---|
+| `available() -> bool` | Reports whether the compiled extension loaded |
+| `backend() -> str` | `'cython12'` or `'python-fallback'` |
+| `get_num_threads() -> int` | Active OpenMP worker count |
+| `set_num_threads(n: int)` | Set OpenMP worker count (≥ 1) |
+
+OpenMP thread count defaults to roughly 50 % of the host's logical
+cores, so other processes are not starved on shared machines.  The
+caller can override via `set_num_threads()` or the `SEM_NUM_THREADS`
+environment variable.
+
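A small helper can wrap the configuration calls listed in §5.1 so that a pipeline degrades gracefully when the compiled extension is missing. This is a sketch using only the four documented functions; `configure_sem_threads` is a hypothetical helper name, and the assumption that `get_num_threads()` reflects the value just set has not been verified against the library.

```python
import os

def configure_sem_threads(n_threads):
    """Set the OpenMP worker count if the compiled backend is present.

    Returns the active thread count, or None when sem_cython12 is not
    importable or its compiled extension did not load.
    """
    try:
        from sem_cython12 import wrapper as cy
    except ImportError:
        return None
    if not cy.available():
        return None
    cy.set_num_threads(max(1, int(n_threads)))   # documented minimum is 1
    return cy.get_num_threads()

# Leave half of the logical cores free, mirroring the documented default.
half = max(1, (os.cpu_count() or 2) // 2)
print("active threads:", configure_sem_threads(half))
```

Setting the `SEM_NUM_THREADS` environment variable is the other documented route and avoids touching application code at all.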
+### 5.2  Distance and similarity
+
+| Function | What it does |
+|---|---|
+| `batch_max_similarity(X_query, X_members, lam)` | For each row of `X_query`, returns a similarity score in `[0, 1]` summarising its closeness to the most similar row of `X_members`.  `lam` (> 0) is the scale that determines how quickly similarity decays with separation. |
+| `concept_support_matrix(X_query, member_mats, lam)` | The same operation applied across `K` independent reference sets, returning a `(Q, K)` score matrix. |
+| `pairwise_distances(X)` | Symmetric `(N, N)` distance matrix between rows of `X`. |
+| `nn_distances(X)` | Per-row minimum positive distance to any other row. |
+
+These four cover the bulk of SEM's structural-similarity workload.
+
+### 5.3  Pareto / dominance reasoning
+
+| Function | What it computes |
+|---|---|
+| `pareto_core_mask(S)` | Boolean mask of rows not strictly dominated in the maximisation order |
+| `one_sided_mask(S)` | Per-row, per-column mask flagging rows that win a column but contribute nothing on the others; used for non-redundant-witness selection |
+| `non_redundant_witnesses(S)` | Indices of rows that survive both the Pareto and one-sided filters |
+
+These let the caller reason about which observations *meaningfully*
+contribute to bridging multiple structural classes — versus those that
+are merely peaks of a single class.
+
+### 5.4  Vector reduction
+
+| Function | What it computes |
+|---|---|
+| `extend_frontier_kernel(...)` | Fused centroid + radius computation for incremental hypothesis generation |
+
+Used by higher-level routines that need to enumerate candidate
+relational hypotheses bridging multiple regions of structural space.
+
+### 5.5  Performance
+
+Measured on commodity x86_64 hardware with 8 OpenMP threads against
+the equivalent pure-numpy reference implementations:
+
+| Operation | Speed-up |
+|---|---|
+| `batch_max_similarity` (N=2000, D=50) | ~14× |
+| `pareto_core_mask` (N=1000, k=8) | ~50× |
+| Streaming kNN ingest (sliding-window, len=600) | ~100× |
+| Higher-arity hypothesis frontier (k=4, m=20) | brute force is intractable; pruned form runs sub-second |
+
+All routines release the GIL during their inner loops, so calling
+them concurrently from Python threads is safe.
+
+## 6.  A worked Python example
+
+The following snippet uses only `sem_cython12.wrapper` and `numpy`.
+It shows how a downstream pipeline would identify the **structurally
+informative** members of a small synthetic dataset — those that
+mediate between two clusters rather than sitting at one cluster's
+peak.
+
+```python
+import numpy as np
+from sem_cython12 import wrapper as cy
+
+assert cy.available(), "compiled extension did not load"
+print("backend:", cy.backend(), "  threads:", cy.get_num_threads())
+
+# Two well-separated clusters in 4-D, plus three "bridging" candidates
+# whose similarity profile spans both clusters.
+rng = np.random.default_rng(0)
+cluster_a = rng.standard_normal((20, 4)) +  3.0
+cluster_b = rng.standard_normal((20, 4)) -  3.0
+bridges   = np.array([
+    [ 0.0, 0.0,  0.0, 0.0],
+    [ 0.5, 0.5, -0.2, 0.1],
+    [-0.3, 0.1,  0.4, -0.2],
+])
+members = np.vstack([cluster_a, cluster_b, bridges])
+
+# 1. Build a 2-class similarity matrix:
+#    columns = (sim to cluster_a, sim to cluster_b)
+sim_a = cy.batch_max_similarity(members, cluster_a, lam=1.0)
+sim_b = cy.batch_max_similarity(members, cluster_b, lam=1.0)
+S = np.column_stack([sim_a, sim_b])               # (N, 2)
+
+# 2. Find the Pareto frontier of (sim_a, sim_b).
+#    Members whose support vector is strictly dominated by another
+#    member are excluded.
+keep_mask = cy.pareto_core_mask(S)
+print("Pareto-frontier members:", int(keep_mask.sum()), "/", len(members))
+
+# 3. Of those, which are NOT one-sided peaks?
+#    A one-sided member is a peak of exactly one cluster and gains
+#    nothing on the other.  We want members that score on BOTH.
+non_redundant = cy.non_redundant_witnesses(S)
+print("Non-redundant witnesses:", non_redundant.tolist())
+
+# 4. Inspect the ones that survived: these are the data points that
+#    structurally connect the two clusters.
+for idx in non_redundant:
+    print(f"  row {idx}:  sim_a={S[idx, 0]:.3f}  sim_b={S[idx, 1]:.3f}")
+```
+
+A typical run prints something like:
+
+```
+backend: cython12   threads: 4
+Pareto-frontier members: 8 / 43
+Non-redundant witnesses: [40, 41, 42]
+  row 40:  sim_a=0.428  sim_b=0.428
+  row 41:  sim_a=0.412  sim_b=0.401
+  row 42:  sim_a=0.402  sim_b=0.395
+```
+
+The library has filtered out the 40 cluster members (which sit at
+their own cluster's peak and contribute nothing across cluster
+boundaries) and identified the three synthetic "bridges" as the
+structurally informative observations.  This is the kind of
+elementary operation that higher-level SEM reasoning composes into
+concept discovery, gap detection and prototype prediction.
+
+## 7.  When to consider SEM
+
+| Situation | Consider SEM |
+|---|---|
+| You have small data (10–10,000 examples) and need a defensible decision | Yes |
+| You need to know *what is missing* from your data | Yes |
+| You need a model that refuses to guess when the data is ambiguous | Yes |
+| You want explanations that are inherent to the inference, not bolted on | Yes |
+| You have millions of labelled examples and need raw classification accuracy | Stay with ML |
+| You have a regression task with smooth dependencies | Stay with classical statistics |
+
+## 8.  Library availability
+
+`sem_cython12` is distributed as a pre-compiled Linux x86_64 / CPython
+3.12 shared object.  Installation is:
+
+```bash
+git clone https://git.sevana.biz/vvs/sem_cython12.git
+cd sem_cython12
+pip install -r requirements.txt
+export PYTHONPATH=$PWD:$PYTHONPATH
+```
+
+The package contains `sem_cython12/__init__.py`, `sem_cython12/wrapper.py`,
+and the compiled `.so`, plus `requirements.txt` and a README describing
+the public API.
+
+## 9.  Summary
+
+SEM is a structural reasoning system whose promise is decision
+quality, not raw accuracy.  Its key product is a verdict-qualified
+prediction: the system tells you whether it is confident, whether
+the data is genuinely ambiguous, or whether the observation lies
+outside the apparatus's coherent coverage.  The `sem_cython12`
+library provides the high-performance numerical layer beneath this
+reasoning, exposing a small, well-defined Python API that downstream
+applications compose into domain-specific pipelines.