Compare commits
2 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 1f9cbe4a48 | | |
| | a98c55cea7 | | |
@@ -6,41 +6,14 @@ install time.

## What is this for?

`sem_cython12` is a small, focused toolbox of fast C-level routines
exposed through a thin numpy wrapper. It is not a general-purpose
numerical library; it accelerates three specific jobs that are
awkward or slow to do in pure numpy once `N` reaches the thousands:
For an introduction to SEM (Similarity Energy Model) and how
`sem_cython12` fits in, see:

1. **Similarity / distance over batches of vectors.** Full
   pairwise distance matrices, nearest-neighbour distances, and
   kernel-based `[0, 1]` similarity scores of a query set against
   one or many reference sets. Useful for nearest-neighbour
   search, kernel-density-style scoring, and "how close is each
   query to this concept?" lookups.
2. **Multi-objective ("best-tradeoff") filtering of score matrices.**
   Given a matrix of `N` candidates × `k` criteria, select the
   rows on the Pareto frontier, isolate rows that only spike on a
   single criterion, and recover the rows that contribute
   meaningfully across several criteria - candidates a naive
   sum-of-scores ranker would miss.
3. **An incremental aggregation primitive** for streaming
   clustering / frontier-expansion algorithms: a fused bulk update
   that, given `F` running summaries (centre + radius) and `A`
   new contributions, produces all `F·A` updated summaries in one
   parallel pass.
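
As a rough picture of what the first two jobs compute, here is a plain-numpy reference sketch. It is not the `sem_cython12` API: the function names (`pairwise_sq_dists`, `gaussian_similarity`, `pareto_mask`) are hypothetical, the Gaussian kernel is only one assumed choice of `[0, 1]` similarity, and "higher is better" is an assumed convention for the score matrix. The broadcasting used here allocates `O(N²)` temporaries, which is the kind of cost that becomes awkward once `N` reaches the thousands.

```python
import numpy as np


def pairwise_sq_dists(queries, refs):
    """Squared Euclidean distances, shape (len(queries), len(refs))."""
    q = np.ascontiguousarray(queries, dtype=np.float64)
    r = np.ascontiguousarray(refs, dtype=np.float64)
    diff = q[:, None, :] - r[None, :, :]          # (Q, R, d) temporary
    return np.einsum("ijk,ijk->ij", diff, diff)   # sum of squares over d


def gaussian_similarity(queries, refs, bandwidth=1.0):
    """Assumed kernel: exp(-d^2 / (2 h^2)), which is 1.0 for identical vectors."""
    return np.exp(-pairwise_sq_dists(queries, refs) / (2.0 * bandwidth**2))


def pareto_mask(scores):
    """True for rows that no other row dominates (assumes higher is better)."""
    s = np.asarray(scores, dtype=np.float64)
    # ge[i, j]: row i is at least as good as row j on every criterion
    ge = (s[:, None, :] >= s[None, :, :]).all(axis=2)
    # gt[i, j]: row i is strictly better than row j on some criterion
    gt = (s[:, None, :] > s[None, :, :]).any(axis=2)
    dominated = (ge & gt).any(axis=0)             # some row i dominates row j
    return ~dominated


# Nearest-neighbour distance of each query to a reference set
nn_dist = np.sqrt(pairwise_sq_dists([[0.0, 0.0]], [[3.0, 4.0], [1.0, 0.0]]).min(axis=1))

# Rows of an N x k score matrix that sit on the Pareto frontier
frontier = pareto_mask([[1, 5], [2, 2], [1, 1], [5, 1]])
```

A row such as `[2, 2]` stays on the frontier even though rows that spike on a single criterion have a larger sum, which is the case a sum-of-scores ranker tends to miss.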

The kernels release the GIL, scale near-linearly to ~8 OpenMP
threads on commodity x86, and operate on shared-memory numpy
arrays with no inter-process serialisation. The Python wrapper
handles contiguous-float64 casting and degrades loudly (via
`available()` / `backend()` plus `RuntimeError`) when the compiled
extension cannot load on the host - there is no slow pure-Python
fallback path.
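
Because there is no fallback path, calling code may want to check for the compiled extension once at start-up. The sketch below is an illustration only: it assumes `available()` takes no arguments and returns a truthy value when the extension loaded, and that `backend()` takes no arguments and returns a short description. Neither signature is confirmed by the text above, and `require_compiled` is a hypothetical helper, not part of the library.

```python
import types


def require_compiled(module):
    """Return the backend description, or raise if the extension is missing.

    Assumes `module.available()` and `module.backend()` take no arguments.
    """
    if not module.available():
        raise RuntimeError(
            f"compiled extension unavailable (backend={module.backend()!r})"
        )
    return module.backend()


# Stand-in object with the same two function names, so the sketch runs
# without the real library installed; in practice pass the imported module.
stub = types.SimpleNamespace(available=lambda: True, backend=lambda: "stub")
backend_name = require_compiled(stub)
```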

The [`demos/`](./demos/) directory contains three runnable
end-to-end examples (Iris boundary discovery, parameter-free
anomaly detection, multi-criteria candidate selection) that
exercise these three jobs against well-known baselines.
- [`docs/SEM_Overview.md`](./docs/SEM_Overview.md) — non-internal
introduction to SEM, what it does, and how this library fits in.
- [`docs/SEM_Mathematical_Apparatus.md`](./docs/SEM_Mathematical_Apparatus.md)
— capabilities-level description of the operators and engines
exposed by the library.

## Contents

@@ -157,15 +130,6 @@ internally cast to contiguous `float64`. Outputs are numpy arrays.

See the wrapper docstrings for exact semantics of each function.

## Documentation

- [`docs/SEM_Overview.md`](./docs/SEM_Overview.md) — non-internal
introduction to SEM (Similarity Energy Model), what it does, and
how the `sem_cython12` library fits in.
- [`docs/SEM_Mathematical_Apparatus.md`](./docs/SEM_Mathematical_Apparatus.md)
— capabilities-level description of the operators and engines
exposed by the library.

## Demos

Three runnable demos live in [`demos/`](./demos/):