- Linux x86_64: add cp310, cp311, cp313 (.so), built in conda-forge envs.
- Windows AMD64: add cp310, cp311, cp313 (.pyd), built with MSVC v14.50.
- All eight binaries verified to produce identical numerical output.
- README compatibility table + build provenance updated.
- macOS still deferred.
sem_cython12
OpenMP-parallel numerical kernel library for Python. Pre-built Linux and Windows binaries included; no compilation required at install time.
What is this for?
For an introduction to SEM (Similarity Energy Model) and how
sem_cython12 fits in, see:
- `docs/SEM_Overview.md` - non-internal introduction to SEM, what it does, and how this library fits in.
- `docs/SEM_Mathematical_Apparatus.md` - capabilities-level description of the operators and engines exposed by the library.
Contents
- `sem_cython12/sem_core12.cpython-3{10,11,12,13}-x86_64-linux-gnu.so` - compiled extensions (Linux, x86_64) for CPython 3.10 / 3.11 / 3.12 / 3.13.
- `sem_cython12/sem_core12.cp3{10,11,12,13}-win_amd64.pyd` - compiled extensions (Windows, AMD64) for CPython 3.10 / 3.11 / 3.12 / 3.13.
- `sem_cython12/wrapper.py` - Python API.
- `sem_cython12/__init__.py` - package entry.
Python's import system selects the correct binary for the running
interpreter automatically — install the whole package and the right
.so / .pyd is picked up by ABI tag.
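To see which file name the running interpreter will look for, the standard library exposes the accepted extension suffixes. The sketch below uses only `importlib.machinery` and `sysconfig`; the helper names are illustrative and not part of sem_cython12.

```python
import importlib.machinery
import sysconfig


def extension_suffixes():
    """Suffixes this interpreter accepts for compiled extension modules."""
    return list(importlib.machinery.EXTENSION_SUFFIXES)


def preferred_suffix():
    """The ABI-tagged suffix for this interpreter (empty string if unknown)."""
    return sysconfig.get_config_var("EXT_SUFFIX") or ""


print("accepted suffixes:", extension_suffixes())
print("expected binary:   sem_core12" + preferred_suffix())
```

If the printed binary name is not among the files listed under Contents, the extension cannot load on that interpreter.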
Compatibility
| Platform | Architecture | Python | Runtime requirements |
|---|---|---|---|
| Linux | x86_64 | CPython 3.10/3.11/3.12/3.13 | glibc >= 2.31, libgomp |
| Windows 10/11 | AMD64 | CPython 3.10/3.11/3.12/3.13 | vcomp (ships with Windows) |
| macOS | - | - | not provided (contact sales@sevana.biz) |
Single Python dependency: numpy >= 1.23 (see requirements.txt).
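The table can be turned into a quick preflight check. This is a minimal sketch using only the standard library; `preflight` and its parameters are hypothetical helpers rather than something the package ships, and the check covers only Python version, platform, architecture and glibc (not libgomp or numpy).

```python
import platform
import sys

SUPPORTED_PYTHONS = {(3, 10), (3, 11), (3, 12), (3, 13)}
MIN_GLIBC = (2, 31)


def parse_version(text):
    """Turn '2.35' into (2, 35); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def preflight(py_version=None, system=None, machine=None, libc=None):
    """Return a list of problems; an empty list means the host matches the table."""
    py_version = tuple(py_version or sys.version_info[:2])
    system = system or platform.system()
    machine = machine or platform.machine()
    libc = libc or platform.libc_ver()

    problems = []
    if py_version not in SUPPORTED_PYTHONS:
        problems.append(f"unsupported Python {py_version[0]}.{py_version[1]}")
    if system == "Linux":
        if machine != "x86_64":
            problems.append(f"unsupported architecture {machine}")
        name, version = libc
        if name == "glibc" and parse_version(version) < MIN_GLIBC:
            problems.append(f"glibc {version} is older than 2.31")
    elif system == "Windows":
        if machine != "AMD64":
            problems.append(f"unsupported architecture {machine}")
    else:
        problems.append(f"no binaries provided for {system}")
    return problems


print(preflight() or "host matches the compatibility table")
```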
How the binaries were built
- Linux (`*.so`), cp312: system gcc 13.3 on Ubuntu, OpenMP via `libgomp`, flags `-O3 -ffast-math -march=native -fopenmp`.
- Linux (`*.so`), cp310 / cp311 / cp313: conda-forge gcc inside isolated `python=3.10/3.11/3.13` envs (clean, system-Python-free build), same OpenMP and optimisation flags.
- Windows (`*.pyd`), all four versions: MSVC v14.50 (Visual Studio Build Tools 2026), OpenMP via `vcomp`, flags `/O2 /openmp`. Each built against the matching CPython interpreter installed via `winget`.
All eight binaries pass the same numerical smoke test
(batch_max_similarity over fixed-seed data) and produce identical
output to within float64 round-off.
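A cross-build check of this kind can be sketched as follows. The data sizes, tolerances and helper names here are assumptions for illustration; the actual smoke test used for the release is not reproduced in this README. The kernel is passed in as a callable so the same harness can wrap `batch_max_similarity` from any of the builds.

```python
import numpy as np


def smoke_inputs(seed=0, n_query=64, n_members=256, dim=16):
    """Fixed-seed inputs so every build sees exactly the same data."""
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((n_query, dim))
    M = rng.standard_normal((n_members, dim))
    return Q, M


def matches_reference(kernel, reference, lam=1.0, rtol=1e-12, atol=1e-12):
    """True if kernel(Q, M, lam) agrees with a stored reference within tolerance."""
    Q, M = smoke_inputs()
    out = np.asarray(kernel(Q, M, lam), dtype=np.float64)
    if out.shape != reference.shape:
        return False
    return bool(np.allclose(out, reference, rtol=rtol, atol=atol))
```

In use, one build would save its output with `np.save`, and every other build would load that array and call `matches_reference(lambda Q, M, lam: cy.batch_max_similarity(Q, M, lam=lam), reference)`. A tolerance is used instead of bitwise equality because different compilers and `-ffast-math` may reorder floating-point operations.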
Install
git clone https://git.sevana.biz/vvs/sem_cython12.git
cd sem_cython12
pip install -r requirements.txt
# Make the package importable, either:
pip install -e . # if pyproject.toml/setup.py is added
# or just put the package on PYTHONPATH:
export PYTHONPATH=$PWD:$PYTHONPATH
Quick start
import numpy as np
from sem_cython12 import wrapper as cy
# Sanity check
assert cy.available(), "compiled extension did not load"
print("backend:", cy.backend())
# Thread count (defaults to ~50% of logical cores; set explicitly via
# either the SEM_NUM_THREADS env var or set_num_threads()):
cy.set_num_threads(8)
print("threads:", cy.get_num_threads())
# Example workload
rng = np.random.default_rng(0)
Q = rng.standard_normal((1000, 32)) # 1000 queries
M = rng.standard_normal((5000, 32)) # 5000 reference points
# For each query: max similarity to any reference, with kernel scale lam.
sim = cy.batch_max_similarity(Q, M, lam=1.0)
print(sim.shape, sim.dtype) # (1000,) float64
API reference
All functions accept either Python lists or numpy arrays; inputs are
internally cast to contiguous float64. Outputs are numpy arrays.
Configuration
| Function | Purpose |
|---|---|
| `available() -> bool` | `True` iff the compiled extension loaded |
| `backend() -> str` | `'cython12'` or `'python-fallback'` |
| `get_num_threads() -> int` | Active OpenMP worker count |
| `set_num_threads(n: int)` | Set OpenMP worker count (`n >= 1`) |
Distance / similarity
| Function | Inputs | Output |
|---|---|---|
| `batch_max_similarity(X_query, X_members, lam)` | `(Q, D)`, `(M, D)`, `lam > 0` | `(Q,)` - per-query similarity score in [0, 1] against the closest member |
| `concept_support_matrix(X_query, member_mats, lam)` | `(Q, D)`, list of `(M_k, D)`, `lam > 0` | `(Q, K)` - one similarity column per member set |
| `pairwise_distances(X)` | `(N, D)` | `(N, N)` - symmetric distance matrix between rows |
| `nn_distances(X)` | `(N, D)` | `(N,)` - min positive distance per row; `inf` if none |
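For checking shapes and edge cases, a plain NumPy reference for the two distance routines can be handy. The sketch below assumes Euclidean distance, which this README does not state; the `_ref` functions are hypothetical stand-ins, not the library's implementation.

```python
import numpy as np


def pairwise_distances_ref(X):
    """(N, D) -> (N, N) symmetric matrix of Euclidean distances between rows."""
    X = np.ascontiguousarray(X, dtype=np.float64)
    diff = X[:, None, :] - X[None, :, :]      # (N, N, D) row differences
    return np.sqrt((diff * diff).sum(axis=-1))


def nn_distances_ref(X):
    """(N, D) -> (N,) smallest strictly positive distance per row, inf if none."""
    D = pairwise_distances_ref(X)
    D = np.where(D > 0.0, D, np.inf)          # ignore self and exact duplicates
    return D.min(axis=1)


X = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
print(pairwise_distances_ref(X))
print(nn_distances_ref(X))
```

The reference materialises an `(N, N, D)` array, so it is only suitable for small inputs.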
Best-tradeoff filtering
| Function | Inputs | Output |
|---|---|---|
| `pareto_core_mask(S)` | `(N, k)` | `(N,)` byte mask: rows that survive the multi-objective best-tradeoff filter |
| `one_sided_mask(S)` | `(N, k)` | `(N, k)` byte mask: rows contributing meaningfully on a single column only |
| `non_redundant_witnesses(S)` | `(N, k)` | int32 array of row indices contributing meaningfully across multiple columns |
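As a point of comparison, the textbook non-dominated (Pareto) filter can be written in a few lines of NumPy. This sketch assumes larger scores are better and uses the standard dominance definition; whether `pareto_core_mask` applies exactly this rule is not stated here, so treat `pareto_mask_ref` as a hypothetical reference only.

```python
import numpy as np


def pareto_mask_ref(S):
    """(N, k) scores -> (N,) uint8 mask, 1 where no other row dominates the row.

    Row j dominates row i when it is >= on every column and > on at least one.
    """
    S = np.asarray(S, dtype=np.float64)
    ge = (S[:, None, :] >= S[None, :, :]).all(axis=2)   # ge[j, i]: j >= i everywhere
    gt = (S[:, None, :] > S[None, :, :]).any(axis=2)    # gt[j, i]: j > i somewhere
    dominated = (ge & gt).any(axis=0)                   # some j dominates i
    return (~dominated).astype(np.uint8)


S = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.4, 0.4], [1.0, 0.0]])
print(pareto_mask_ref(S))
```

In this example only the row `(0.4, 0.4)` is removed, because `(0.5, 0.5)` beats it on both columns; the duplicated `(1.0, 0.0)` rows both survive since neither is strictly better than the other.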
Vector reduction
| Function | Inputs | Output |
|---|---|---|
| `extend_frontier_kernel(cur_centers, cur_radii, new_emb, cur_arity)` | `(F, D)`, `(F,)`, `(A, D)`, int | (`flat_centers` `(F*A, D)`, `flat_radii` `(F*A,)`) |
See the wrapper docstrings for exact semantics of each function.
Demos
Three runnable demos live in demos/:
- `01_iris_boundary.py` - rediscovers the famous Iris versicolor/virginica boundary specimens with no training, using only `concept_support_matrix` and `pairwise_distances`.
- `02_anomaly_detection.py` - parameter-free anomaly detection that matches IsolationForest's AUC=1.0 on a synthetic benchmark, using only `batch_max_similarity`.
- `03_multicriteria_selection.py` - recovers 5/5 hidden balanced candidates that naive sum-of-scores ranking misses, using `pareto_core_mask` and `non_redundant_witnesses`.
A standalone copy of the demos repository is also published at https://git.sevana.biz/vvs/sem_cython12-demos.
Performance notes
Threads are configured globally per process; calling
set_num_threads(n) updates the OpenMP team size for all subsequent
calls. The default uses approximately 50% of the host's logical
cores so other processes are not starved on shared machines.
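If an application wants to pick the thread count itself, the policy described above can be mirrored in a small helper. This is a sketch under assumptions: how the library rounds the 50% default or treats an invalid `SEM_NUM_THREADS` value is not documented here, and `resolve_thread_count` is a hypothetical name.

```python
import os


def resolve_thread_count(env=None, logical_cores=None):
    """Honour SEM_NUM_THREADS if it is a valid integer >= 1, else use half the cores."""
    env = os.environ if env is None else env
    cores = logical_cores if logical_cores is not None else (os.cpu_count() or 1)
    raw = env.get("SEM_NUM_THREADS", "").strip()
    if raw:
        try:
            requested = int(raw)
        except ValueError:
            requested = 0              # assumption: ignore unparsable values
        if requested >= 1:
            return requested
    return max(1, cores // 2)


print("threads to request:", resolve_thread_count())
```

The result can then be passed to `cy.set_num_threads(...)` once, early in the process.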
For workloads dominated by pairwise_distances and
pareto_core_mask, near-linear scaling up to ~8 threads is typical
on commodity x86 hardware. batch_max_similarity is BLAS-friendly
and benefits most from larger M (reference set) at fixed D.
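Scaling on a particular machine is easy to measure directly. The harness below is a generic sketch: it takes the thread setter and the routine as callables, so nothing about the library's internals is assumed, and the helper names are illustrative.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Best wall-clock time of fn(*args, **kwargs) over a few repeats, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def scaling_table(set_threads, fn, thread_counts, *args, **kwargs):
    """Return [(threads, seconds, speedup relative to the first entry), ...]."""
    rows = []
    baseline = None
    for n in thread_counts:
        set_threads(n)
        seconds = time_call(fn, *args, **kwargs)
        if baseline is None:
            baseline = seconds
        speedup = baseline / seconds if seconds > 0 else float("inf")
        rows.append((n, seconds, speedup))
    return rows
```

With the package imported as `cy`, a call such as `scaling_table(cy.set_num_threads, cy.pairwise_distances, [1, 2, 4, 8], X)` would give one row per thread count for a chosen array `X`.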
Memory / threading model
- All arrays are processed in shared memory; no inter-process serialisation.
- Each routine releases the GIL during its inner loops, so calling it concurrently from Python threads is safe.
- The compiled extension links against the platform OpenMP runtime (`libgomp` on Linux, `vcomp` on Windows); avoid mixing with conda's `intel-openmp` in the same process if possible.
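Because the GIL is released inside the routines, one way to use them from Python threads is to split the queries into chunks and map a routine over them. The following is a minimal sketch with a hypothetical `map_chunks` helper; the routine is passed as a callable that takes one chunk of rows.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def map_chunks(fn, Q, n_chunks=4, max_workers=4):
    """Apply fn to row-chunks of Q in a thread pool and concatenate the results."""
    chunks = np.array_split(Q, n_chunks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = list(pool.map(fn, chunks))   # map() preserves chunk order
    return np.concatenate(parts)
```

A call could look like `map_chunks(lambda q: cy.batch_max_similarity(q, M, lam=1.0), Q)`. Each concurrent call may start its own OpenMP team, so it is usually worth lowering `set_num_threads` accordingly to avoid oversubscribing the cores.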
Privacy / telemetry
sem_cython12 performs no network I/O, opens no sockets, and
writes no files outside the calling process's working directory.
There is no telemetry, no usage reporting, and no licence-server
check-in. All computation is in-process on local arrays.
Diagnostics
backend() returns 'python-fallback' only when the compiled extension
(.so on Linux, .pyd on Windows) failed to import, for example because
of a wrong architecture, a glibc that is too old, or a missing libgomp.
In that state every numerical function raises RuntimeError; there is no
silent pure-Python fallback. Check available() before each batch to
fail early with a clear message.
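A guard along these lines makes the failure explicit at startup. It relies only on the `available()` and `backend()` functions listed in the API reference; `require_backend` itself is a hypothetical helper, and the wrapper module is passed in as an argument.

```python
import platform
import sys


def require_backend(wrapper):
    """Return the backend name, or raise with host details if the extension is missing."""
    if wrapper.available():
        return wrapper.backend()
    libc = " ".join(platform.libc_ver()).strip() or "n/a"
    raise RuntimeError(
        "sem_cython12 compiled extension did not load "
        f"(backend={wrapper.backend()!r}, "
        f"python={sys.version_info.major}.{sys.version_info.minor}, "
        f"platform={platform.system()} {platform.machine()}, libc={libc})"
    )
```

Typical use would be `from sem_cython12 import wrapper as cy` followed by `require_backend(cy)` before any numerical work.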
Licence
The Software is licensed under the terms contained in the LICENSE file in this repository.
In short:
- Research and non-commercial use: granted free of charge under the conditions in section 2 of the LICENSE.
- Commercial use: requires a separate written commercial licence from the Licensor. Contact sales@sevana.biz.
- No warranty: the Software is provided strictly "AS IS", without warranty of any kind. The Licensor's total aggregate liability is limited to zero.
Please read the LICENSE file in full before using the Software.
Support
Open an issue at https://git.sevana.biz/vvs/sem_cython12.