diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 00000000..488cad5f --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,79 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +A more detailed agent reference (style guide, JSON command list, integration notes, full module class listings) is in `AGENTS.md`. Read that for deeper context; this file only captures what is needed to be productive quickly. + +## What this repo produces + +A single C++20 static library, `librtphone.a`, that implements a SIP/RTP softphone stack (codecs, media transport, SIP user agent, cross-platform audio I/O). It is consumed by other applications via a JSON command interface (`AgentImpl`). There is no executable target in `src/` — only the library. + +## Build and run + +The Python scripts wipe and recreate their build directory each invocation, so they are clean configure+build runs, not incremental. + +```bash +python3 build_linux.py # wipes build_linux/, configures with Ninja, outputs build_linux/librtphone.a +python3 build_android.py # needs ANDROID_NDK_HOME and VCPKG_ROOT; arm64-v8a, API 24 +python3 build_windows.py # needs VCPKG_ROOT (defaults to C:\tools\vcpkg); VS 2022 x64 +./run_ci.sh # Make-based CI build into ./build (used by Jenkins; pings Telegram on failure) +``` + +For incremental work, configure once and rebuild manually instead of re-running the wrapper: + +```bash +mkdir -p build && cd build +cmake ../src -G Ninja -D OPUS_X86_MAY_HAVE_SSE4_1=ON +cmake --build . -j$(nproc) +``` + +CMake options (in `src/CMakeLists.txt`): `USE_AMR_CODEC` (default ON, force-OFF on Android), `USE_EVS_CODEC` (default ON), `USE_MUSL` (default OFF). + +## Tests + +There is no unit test suite for the library itself. The only first-party test is `test/rtp_decode/` — a standalone CMake project that links the library and exercises RTP decoding from a capture file. 
Build and run it via: + +```bash +mkdir -p test/rtp_decode/build && cd test/rtp_decode/build +cmake .. && cmake --build . -j$(nproc) +./rtp_decode +``` + +Note: `test/rtp_decode/CMakeLists.txt` does `add_subdirectory(../../src ...)`, so it rebuilds the whole library inside its own build tree. Don't expect to share artifacts with `build_linux/`. + +## Submodules and external paths + +The `src/libs/` directory mixes vendored sources with three git submodules: `resiprocate` (SIP stack, sevana branch), `libsrtp`, and `libraries` (prebuilt platform binaries — OpenSSL 1.1, Boost headers, etc.). After clone: + +```bash +git submodule update --init --recursive +``` + +## Architecture: how a call flows through the modules + +The five modules under `src/engine/` form a vertical stack. Understanding the flow matters more than the file list (which is in `AGENTS.md`): + +1. **`agent/`** — Public API surface. Host apps create one `AgentImpl`, push JSON command strings into `command()`, and pull JSON event strings out via `waitForData()`/`read()`. This is the only thing external code should touch. +2. **`endpoint/`** — Wraps reSIProcate. `UserAgent` (in `EP_Engine.h`) owns the SIP transports, registrations, and session lifecycle; `EP_Account` and `EP_Session` are the SIP-side objects the agent manipulates. `EP_AudioProvider`/`EP_DataProvider` bridge SIP sessions to media streams. +3. **`media/`** (`MT::` namespace) — Codec registry (`MT_CodecList`), per-call audio pipeline (`MT_AudioStream`, `MT_AudioReceiver`), RTP send (`MT_NativeRtpSender`), SRTP (`MT_SrtpHelper`), DTMF (`MT_Dtmf`). Codec wrappers like `MT_AmrCodec`/`MT_EvsCodec` adapt vendored libraries in `src/libs/` to the `MT::Codec` base. +4. **`audio/`** (`Audio::` namespace) — Cross-platform device I/O. 
`Audio::Interface` is the abstraction; concrete implementations are picked at compile time: `Audio_DirectSound` (Windows), `Audio_CoreAudio` (macOS/iOS), `Audio_AndroidOboe` (Android, via `libs/oboe`), `Audio_Null` (testing). Also hosts mixing, resampling, WAV I/O, AEC integration. +5. **`helper/`** (`HL::` namespace) — Cross-cutting primitives used by every other module: `HL_Sync` (mutex/event), `HL_NetworkSocket`, `HL_Log`, `HL_VariantMap` (used as the runtime config bag passed to `UserAgent`), `HL_Rtp`, `HL_IuUP` (3G Iu-UP framing), `HL_ThreadPool`, `HL_ByteBuffer`. + +ICE/STUN lives separately under `src/libs/ice/` (it is compiled directly into the library, not as a subproject) and is wired into the media path for NAT traversal. + +Compile-time tunables — sample rate (48 kHz), buffer sizes, RTP/codec payload types, media port range — are all in `src/engine/engine_config.h`. Runtime configuration flows through `HL::VariantMap` keyed by the `CONFIG_*` enum in `EP_Engine.h`. + +Per-stream call-quality metrics (RTT, jitter per RFC 3550 §A.8, packet-loss timeline, RFC 2833 DTMF events, network MOS) are collected in `MT::Statistics` / `MT::JitterStatistics` in `src/engine/media/MT_Statistics.{h,cpp}` and surfaced through the agent's JSON event stream. + +## Conventions worth knowing before editing + +- **File prefixes encode the module**: `Agent_*`, `EP_*`, `MT_*`, `Audio_*`, `HL_*`. Match this when adding files. +- **Members use `m` prefix** (`mAgentMutex`, `mSessionMap`); smart-pointer typedefs use `P` prefix (`PSession`, `PVariantMap`). +- **Platform code is gated by `TARGET_WIN` / `TARGET_LINUX` / `TARGET_OSX` / `TARGET_ANDROID` / `TARGET_MUSL`**, set in `src/CMakeLists.txt`. Don't sniff `_WIN32`/`__linux__` directly. +- **Every source file carries the MPL 2.0 header** (see top of any existing `.cpp`). New files need the same block. +- **Thread safety is via `std::recursive_mutex`** (e.g. `mAgentMutex` guards the agent's public surface). 
Memory uses `std::shared_ptr` extensively — prefer the existing `P*` typedefs over raw pointers. +- The codebase recently migrated to **C++20 and `std::chrono`**; avoid reintroducing older idioms or hand-rolled time math. + +## Patent caveat + +AMR-NB/AMR-WB and EVS sources are included but no patent licenses are bundled. If a change touches `MT_AmrCodec`/`MT_EvsCodec` or their build flags, keep in mind users are expected to license these codecs themselves — don't enable them by default in contexts where that hasn't been arranged. diff --git a/src/engine/media/MT_AudioStream.cpp b/src/engine/media/MT_AudioStream.cpp index 96606af5..ffbb7627 100644 --- a/src/engine/media/MT_AudioStream.cpp +++ b/src/engine/media/MT_AudioStream.cpp @@ -375,9 +375,24 @@ void AudioStream::dataArrived(PDatagramSocket s, const void* buffer, int length, // Process incoming data packet rtpStream->process(packet); + // RTT sanity filter: jrtplib's INF_GetRoundtripTime() does the + // RFC 3550 §6.4.1 math but omits clock-skew / outlier guards; + // without these, a skewed or buggy peer can poison mRttDelay + // (and therefore the Id term in MOS). double rtt = mRtpSession.GetCurrentSourceInfo()->INF_GetRoundtripTime().GetDouble(); - if (rtt > 0) + if (rtt > 0 && rtt < 30.0) // reject "RTT not making any sense" (>30s) + { + // Once an average is established, cap a new sample at 3x mean + // so a single outlier can't skew the running RTT. 
+ constexpr double kRttNormalizeFactor = 3.0; + const double meanRtt = mStat.mRttDelay.average(); + if (mStat.mRttDelay.is_initialized() && meanRtt > 0.0 && + rtt > meanRtt * kRttNormalizeFactor) + { + rtt = meanRtt * kRttNormalizeFactor; + } mStat.mRttDelay.process(rtt); + } } hasData = mRtpSession.GotoNextSourceWithData(); } diff --git a/src/engine/media/MT_Statistics.cpp b/src/engine/media/MT_Statistics.cpp index f9f9ee90..67042366 100644 --- a/src/engine/media/MT_Statistics.cpp +++ b/src/engine/media/MT_Statistics.cpp @@ -1,4 +1,6 @@ #include +#include +#include #include #include "MT_Statistics.h" @@ -6,60 +8,154 @@ using namespace MT; +namespace +{ + +// Per-codec impairment parameters (Ie, Bpl) from ITU-T G.113 / G.107. +// clockRate == 0 means "any". +struct MosCodecEntry { const char* mName; unsigned mClockRate; double mIe; double mBpl; }; + +constexpr MosCodecEntry kMosCodecTable[] = { + { "PCMU", 8000, 0.0, 25.0 }, + { "PCMA", 8000, 0.0, 25.0 }, + { "G722", 8000, 13.0, 21.0 }, + { "G7221", 16000, 13.0, 21.0 }, + { "G7221", 32000, 13.0, 21.0 }, + { "G729", 8000, 11.0, 19.0 }, + { "G729A", 8000, 11.0, 19.0 }, + { "G729AB", 8000, 11.0, 19.0 }, + { "G723", 8000, 15.0, 16.0 }, + { "iLBC", 8000, 11.0, 18.0 }, + { "GSM", 8000, 20.0, 10.0 }, + { "AMR", 8000, 5.0, 10.0 }, + { "AMR-WB", 16000, 7.0, 10.0 }, + { "speex", 8000, 15.0, 20.0 }, + { "speex", 16000, 10.0, 20.0 }, + { "speex", 32000, 10.0, 20.0 }, + { "opus", 48000, 5.0, 25.0 }, + + // EVS — no published G.113 value. Using AMR-WB-family Bpl with a + // conservative Ie that matches typical commercial VQM tools for EVS + // Primary ~13.2 kbps WB. 
+ { "EVS", 16000, 5.0, 10.0 }, +}; + +constexpr double kMosDefaultIe = 0.0; +constexpr double kMosDefaultBpl = 25.0; + +bool iequals(const std::string& a, const char* b) +{ + const size_t n = std::strlen(b); + if (a.size() != n) return false; + for (size_t i = 0; i < n; ++i) + if (std::tolower(static_cast(a[i])) != + std::tolower(static_cast(b[i]))) + return false; + return true; +} + +void resolveMosCodecParams(const std::string& codecName, double& ie, double& bpl) +{ + ie = kMosDefaultIe; + bpl = kMosDefaultBpl; + if (codecName.empty()) + return; + + // Map known codec-name aliases before looking up Ie/Bpl entries. + std::string lookup = codecName; + if (iequals(lookup, "GSM-06.10")) + lookup = "GSM"; + + for (const auto& e: kMosCodecTable) + if (iequals(lookup, e.mName)) + { + ie = e.mIe; + bpl = e.mBpl; + return; + } +} + +} // anonymous namespace + void JitterStatistics::process(jrtplib::RTPPacket* packet, int rate) { - // Get current timestamp and receive time - uint32_t timestamp = packet->GetTimestamp(); - jrtplib::RTPTime receiveTime = packet->GetReceiveTime(); + // RFC 3550 §A.8 jitter. Two guards: + // + // 1. Update only when the new packet is exactly one sequence number + // after the previous in-sequence packet. Skipping this check across + // packet-loss gaps inflates jitter; skipping out-of-order packets + // entirely (the previous behaviour) under-reports it. + // 2. Ignore the first few in-sequence samples while transit time + // settles after call setup. + constexpr uint32_t kIgnoreFirstPackets = 5; + const uint32_t timestamp = packet->GetTimestamp(); + const uint32_t extSeqno = packet->GetExtendedSequenceNumber(); + const jrtplib::RTPTime receiveTime = packet->GetReceiveTime(); + + // First packet: just stash state. 
if (!mLastJitter) { - // First packet - mReceiveTime = receiveTime; + mReceiveTime = receiveTime; mReceiveTimestamp = timestamp; - mLastJitter = 0.0; + mLastExtSeqno = extSeqno; + mLastJitter = 0.0; + mPacketsProcessed = 1; + return; } - else + + // RFC 3550 §A.8: only adjacent packets contribute to jitter. + // Out-of-order, duplicate, and post-loss packets are skipped silently — + // but state must still advance so the *next* in-sequence pair works. + const bool adjacent = mLastExtSeqno && (extSeqno == mLastExtSeqno.value() + 1); + + if (!adjacent) { - // It is in units - int64_t receiveDelta = int64_t(receiveTime.GetDouble() * rate) - int64_t(mReceiveTime.GetDouble() * rate); - - // Check if packets are ordered ok - if (timestamp <= mReceiveTimestamp) - return; - - // Find differences in timestamp - int64_t timestampDelta = timestamp - mReceiveTimestamp; - - if (!timestampDelta) - // Skip current packet silently. Most probably it is error in RTP stream like duplicated packet. - return; - - // Find delta in units - int64_t delta = receiveDelta - timestampDelta; - - // Update max delta in milliseconds - float delta_in_seconds = float(fabs(double(delta) / rate)); - if (delta_in_seconds > mMaxDelta) - mMaxDelta = delta_in_seconds; - - // Update jitter value in units - mLastJitter = mLastJitter.value() + (fabs(double(delta)) - mLastJitter.value()) / 16.0; - /*printf("PacketNo: %d, current delta in ms: %f, jitter in ms: %f\n", - (int)packet->GetSequenceNumber(), - delta_in_ms, - float(mLastJitter.value() / (rate / 1000)));*/ - - // Save last values - mReceiveTime = receiveTime; - mReceiveTimestamp = timestamp; - - // And mJitter are in milliseconds again - float jitter_s = mLastJitter.value() / (float(rate)); - // std::cout << "Jitter (in seconds): " << std::dec << jitter_s << std::endl; - - mJitter.process(jitter_s); + // Reset the transit reference if a discontinuity (loss / reorder) + // happened, restarting from the latest known good packet. 
+ if (mLastExtSeqno && extSeqno > mLastExtSeqno.value()) + { + mReceiveTime = receiveTime; + mReceiveTimestamp = timestamp; + mLastExtSeqno = extSeqno; + } + return; } + + // RTP FAQ: also skip when timestamp is unchanged (multi-packet frame, dup). + if (timestamp == mReceiveTimestamp) + { + mLastExtSeqno = extSeqno; + return; + } + + // Wrap-safe signed delta on the 32-bit RTP timestamp: + // transit = arrival - rtp_ts; d = transit - prev_transit (signed 32-bit). + const int32_t timestampDelta = static_cast(timestamp - mReceiveTimestamp); + const int64_t receiveDelta = + static_cast(receiveTime.GetDouble() * rate) - + static_cast(mReceiveTime.GetDouble() * rate); + const int64_t delta = receiveDelta - timestampDelta; + + // Save state for the next pair regardless of warmup. + mReceiveTime = receiveTime; + mReceiveTimestamp = timestamp; + mLastExtSeqno = extSeqno; + ++mPacketsProcessed; + + // Skip the first N in-sequence samples while transit time settles. + if (mPacketsProcessed <= kIgnoreFirstPackets) + return; + + const float deltaSec = static_cast(std::fabs(static_cast(delta) / rate)); + if (deltaSec > mMaxDelta) + mMaxDelta = deltaSec; + + // J = J + (|D| - J) / 16 + mLastJitter = mLastJitter.value() + + (std::fabs(static_cast(delta)) - mLastJitter.value()) / 16.0; + + mJitter.process(mLastJitter.value() / static_cast(rate)); } @@ -94,11 +190,9 @@ void Statistics::calculateBurstr(double* burstr, double* lossr) const if (mReceivedRtp > 0 && bursts > 0) { - *burstr = (double)((double)lost / (double)bursts) / (double)(1.0 / (1.0 - (double)lost / (double)mReceivedRtp)); - if (*burstr < 0) - *burstr = -*burstr; - else if (*burstr < 1) - *burstr = 1; + *burstr = ((double)lost / (double)bursts) * (1.0 - (double)lost / (double)mReceivedRtp); + if (*burstr < 1.0) + *burstr = 1.0; } else *burstr = 0; @@ -111,34 +205,56 @@ void Statistics::calculateBurstr(double* burstr, double* lossr) const double Statistics::calculateMos(double maximalMos) const { - // calculate 
lossrate and burst rate - double burstr = 0, lossr = 0; - calculateBurstr(&burstr, &lossr); - - double r = 0.0; - double bpl = 8.47627; //mos = -4.23836 + 0.29873 * r - 0.00416744 * r * r + 0.0000209855 * r * r * r; - double mos = 0.0; + // Network MOS via the simplified ITU-T G.107 E-Model: + // + // d_oneway = rtt/2 + jitter + jb_delay (ms) + // Id = 0.024*d + 0.11*max(0, d - 177.3) + // Ie_eff = Ie + (95 - Ie) * Ppl / (Ppl + Bpl) (BurstR=1) + // R = 93.2 - Id - Ie_eff (clamped to [0,100]) + // MOS = 1 + 0.035*R + 7e-6*R*(R-60)*(100-R) (clamped to [1, maximalMos]) + // + // Ie/Bpl are looked up from a per-codec table; safe defaults are used + // when the codec is unknown. if (mReceivedRtp < 10) return 0.0; - if (lossr == 0.0 || burstr == 0.0) - { - return maximalMos; - } + // Loss percent is computed as lost / (lost + received). + const uint64_t expected = static_cast<uint64_t>(mReceivedRtp) + + static_cast<uint64_t>(mPacketLoss); + const double Ppl = expected > 0 + ? static_cast<double>(mPacketLoss) * 100.0 / static_cast<double>(expected) + : 0.0; - if (lossr > 0.5) - return 1; + double Ie = kMosDefaultIe, Bpl = kMosDefaultBpl; + resolveMosCodecParams(mCodecName, Ie, Bpl); + if (Bpl <= 0.0) + Bpl = 1.0; - bpl = 17.2647; - r = 93.2062077233 - 95.0 * (lossr * 100 / (lossr * 100 / burstr + bpl)); - mos = 2.06405 + 0.031738 * r - 0.000356641 * r * r + 2.93143 * pow(10, -6) * r * r * r; - if (mos < 1) - return 1; + // mRttDelay and mJitter are stored in seconds. jb_delay is unknown at + // this layer, so it is treated as zero. 
+ const double rttMs = static_cast<double>(mRttDelay.average()) * 1000.0; + const double jitterMs = static_cast<double>(mJitter) * 1000.0; + const double d = rttMs / 2.0 + jitterMs; - if (mos > maximalMos) - return maximalMos; + double Id = 0.024 * d; + if (d > 177.3) + Id += 0.11 * (d - 177.3); + const double Ie_eff = Ie + (95.0 - Ie) * Ppl / (Ppl + Bpl); + + double R = 93.2 - Id - Ie_eff; + if (R < 0.0) R = 0.0; + if (R > 100.0) R = 100.0; + + double mos; + if (R == 0.0) + mos = 1.0; + else + mos = 1.0 + 0.035 * R + 7e-6 * R * (R - 60.0) * (100.0 - R); + + if (mos < 1.0) mos = 1.0; + if (mos > maximalMos) mos = maximalMos; return mos; } diff --git a/src/engine/media/MT_Statistics.h b/src/engine/media/MT_Statistics.h index 727ac117..9ab6a52b 100644 --- a/src/engine/media/MT_Statistics.h +++ b/src/engine/media/MT_Statistics.h @@ -39,6 +39,14 @@ protected: // Last timestamp from packet in units uint32_t mReceiveTimestamp = 0; + // Last extended sequence number, used to apply RFC 3550 §A.8 rule + // ("update jitter only when packets are truly adjacent in sequence"). + std::optional<uint32_t> mLastExtSeqno; + + // Number of in-sequence packets seen so far. Used to skip the first few + // packets while transit-time settles after call setup. + uint32_t mPacketsProcessed = 0; + // It is classic jitter value in units std::optional mLastJitter;