Delta Analysis
Every comparison produces a set of canonical deltas. Each delta fires only when the difference between runs is statistically meaningful, which filters out noise and false signals without manual threshold tuning.
Five canonical delta types
Deltas are ordered by causal salience, not mathematical complexity:
| Delta | Full Name | Category | What It Measures | Fires When |
|---|---|---|---|---|
| ΔF | Failure Rate | Event | Anomaly frequency | Failure frequency or kind differs between runs |
| ΔTc | Convergence Time | Timing | Steps to reach stable latency | Steady-state reached at different steps (3+ step separation) |
| ΔTd | Total Duration | Timing | Wall-clock time / structural emergence | Dominance onset differs (suppressed in TFRT preset) |
| ΔĀ | Average Latency | Behavior | Mean metric value | Mean differs meaningfully (suppressed in TFRT preset) |
| ΔO | Output Variability | Behavior | Oscillation / runtime instability | Area-above-threshold score differs beyond noise floor |
Each delta has three possible statuses:
- Present — the difference is real and meaningful
- Suppressed — the difference is below threshold or irrelevant for the active preset
- Indeterminate — cannot determine (insufficient data)
How deltas are computed
Each delta type has its own detector with configurable thresholds. Every delta includes:
- Confidence score (0.0 to 1.0) — how certain the difference is meaningful
- Anchors — specific data points and view targets that triggered the delta
- Trigger type — what kind of signal caused the detection (e.g., sustained, recurrence, area episode, persistence-weighted)
- Human-readable explanation — auto-generated text describing the finding (target: 12 words or fewer)
- Summary sentence — neutral, descriptive sentence for export summaries
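As a rough illustration, the statuses and per-delta fields listed above could be modeled as a small record. This is only a sketch: the class names, field names, and the `explanation_within_target` helper are hypothetical and are not taken from the tool's actual API.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeltaStatus(Enum):
    PRESENT = "present"              # difference is real and meaningful
    SUPPRESSED = "suppressed"        # below threshold or irrelevant for the preset
    INDETERMINATE = "indeterminate"  # insufficient data


@dataclass
class Delta:
    # All field names are hypothetical; they mirror the bullet list above.
    kind: str                  # e.g. "ΔTc"
    status: DeltaStatus
    confidence: float          # 0.0 to 1.0
    trigger_type: str          # e.g. "sustained", "recurrence"
    explanation: str           # short human-readable text
    summary: str               # neutral sentence for export summaries
    anchors: list = field(default_factory=list)

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0.0, 1.0]")

    def explanation_within_target(self, max_words: int = 12) -> bool:
        """Check the explanation against the 12-word target."""
        return len(self.explanation.split()) <= max_words
```

A record like this keeps the status separate from the confidence score, so a low-confidence delta can still be reported as present.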
Convergence detection (ΔTc)
The convergence detector looks for the first point at which a signal stays within an epsilon band for a sustained window:
- Window: number of consecutive stable steps required (default: 5, minimum: 3)
- Epsilon: base tolerance band; effective epsilon = max(base epsilon, 0.5 * robust sigma)
- Resolution: minimum 3-step separation between runs to count as meaningful
- Confidence is a heuristic based on tail stability and noise level — it affects visual intensity but never suppresses the delta
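The window and epsilon rules above can be sketched as follows. Several details are assumptions made for illustration: "robust sigma" is taken to be a MAD-based estimate over the whole signal, "within the band" is taken to mean the window's spread (max minus min) is at most the effective epsilon, and the function names and the `base_epsilon` default are hypothetical.

```python
import statistics


def robust_sigma(values):
    """MAD-based sigma estimate (assumed; the source does not define it)."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return 1.4826 * mad


def convergence_step(signal, window=5, base_epsilon=0.01):
    """Return the first step of the first window whose values stay within
    the effective epsilon band, or None if no such window exists."""
    if window < 3:
        raise ValueError("window must be at least 3")
    if len(signal) < window:
        return None
    # Effective epsilon = max(base epsilon, 0.5 * robust sigma)
    eps = max(base_epsilon, 0.5 * robust_sigma(signal))
    for start in range(len(signal) - window + 1):
        chunk = signal[start:start + window]
        if max(chunk) - min(chunk) <= eps:
            return start
    return None


def tc_differs(step_a, step_b, min_separation=3):
    """True if the runs converge at least min_separation steps apart;
    None (indeterminate) if either run never converged."""
    if step_a is None or step_b is None:
        return None
    return abs(step_a - step_b) >= min_separation
```

For example, `convergence_step` can be called on each run's latency trace and the two results passed to `tc_differs` to decide whether the separation meets the 3-step resolution.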
Stability detection (ΔO)
The stability detector uses area-above-threshold scoring with an adaptive threshold:
- The threshold adapts to both the median and the sigma of curvature magnitudes
- Episodes must last at least 4 steps, which avoids flicker false positives
- Between-run suppression applies a delta floor of 0.05
- A within-run noise floor of 0.1 filters out trivial episodes
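A minimal sketch of area-above-threshold scoring under these rules is shown below. It rests on assumptions that the text does not confirm: curvature is approximated by absolute second differences, the adaptive threshold is `median + k * sigma` with a MAD-based sigma and a hypothetical `k`, and the noise floor is applied to each episode's area. All function and parameter names are illustrative.

```python
import statistics


def curvature_magnitudes(signal):
    """Absolute second differences as a discrete curvature proxy (assumed)."""
    return [abs(signal[i - 1] - 2 * signal[i] + signal[i + 1])
            for i in range(1, len(signal) - 1)]


def instability_score(signal, k=3.0, min_steps=4, noise_floor=0.1):
    """Sum the area above the adaptive threshold over qualifying episodes."""
    curv = curvature_magnitudes(signal)
    if len(curv) < min_steps:
        return None  # insufficient data
    med = statistics.median(curv)
    sigma = 1.4826 * statistics.median(abs(c - med) for c in curv)
    threshold = med + k * sigma  # adapts to both median and sigma
    total, area, length = 0.0, 0.0, 0
    for c in curv + [float("-inf")]:  # sentinel closes a trailing episode
        if c > threshold:
            area += c - threshold
            length += 1
        else:
            # Keep only episodes that last long enough and clear the noise floor
            if length >= min_steps and area >= noise_floor:
                total += area
            area, length = 0.0, 0
    return total


def o_differs(score_a, score_b, delta_floor=0.05):
    """True if the scores differ by more than the between-run delta floor."""
    if score_a is None or score_b is None:
        return None
    return abs(score_a - score_b) > delta_floor
```

In this sketch a single-step spike produces a three-step curvature episode, which the 4-step minimum discards, while a sustained oscillation accumulates area and raises the score.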
Failure detection (ΔF)
The failure detector identifies anomalies using a persistence window (default: 3 steps). It can trigger on norm violations, loss explosions, or other failure kinds, and it reports which run failed, at what step, and what kind of failure occurred.
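One way to picture a persistence window is a detector that reports a failure only after the same failure kind has held for several consecutive steps. The sketch below assumes that reading; the function name, the `limit` parameter, and the kind labels `"non_finite"` and `"norm_violation"` are hypothetical.

```python
import math


def detect_failure(run_id, values, limit, persistence=3):
    """Return (run_id, step, kind) for the first failure that persists for
    `persistence` consecutive steps, or None if no failure is found."""
    streak, kind = 0, None
    for step, v in enumerate(values):
        # Classify the current step (labels are illustrative)
        if math.isnan(v) or math.isinf(v):
            current = "non_finite"
        elif abs(v) > limit:
            current = "norm_violation"
        else:
            current = None
        # Extend the streak only while the same failure kind continues
        if current is not None and current == kind:
            streak += 1
        else:
            kind, streak = current, (1 if current else 0)
        if streak >= persistence:
            # Report the step at which the persistent failure began
            return (run_id, step - persistence + 1, kind)
    return None
```

Requiring persistence means an isolated outlier step does not count as a failure on its own.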
Emergence detection (ΔTd)
The emergence detector identifies structural emergence through eigenvalue dominance. It fires when one eigenvalue exceeds k times the next (default k = 1.5) for a sustained window, or when a recurrence rule is met (repeated dominance segments within a rolling window).
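The two rules can be sketched as below, taking a per-step list of eigenvalues as input. Only the default k = 1.5 comes from the text; the window length, the recurrence parameters (`recur_segments`, `recur_span`), the use of eigenvalue magnitudes, and the function names are assumptions for illustration.

```python
def dominance_flags(spectra, k=1.5):
    """Per step: does the largest eigenvalue magnitude exceed k times the next?"""
    flags = []
    for eigs in spectra:
        top = sorted((abs(e) for e in eigs), reverse=True)
        flags.append(len(top) >= 2 and top[0] > k * top[1])
    return flags


def dominance_onset(spectra, k=1.5, window=3, recur_segments=2, recur_span=10):
    """Return (onset_step, trigger_type) or None if dominance never emerges."""
    flags = dominance_flags(spectra, k)
    run = 0
    segment_starts = []
    for step, dominant in enumerate(flags):
        if dominant:
            run += 1
            if run == 1:
                segment_starts.append(step)  # a new dominance segment begins
            if run >= window:
                return step - window + 1, "sustained"
            # Recurrence: enough separate segments inside the rolling span
            recent = [s for s in segment_starts if step - s < recur_span]
            if len(recent) >= recur_segments:
                return recent[0], "recurrence"
        else:
            run = 0
    return None
```

Comparing the onset step returned for each run would then indicate whether dominance onset differs between them.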
Runtime presets
TFRT (TensorFlow-TRT)
The built-in TensorFlow-TRT preset (tensorflowrt-runtime-v1) is designed for inference comparison:
- Maps inference-specific signals: latency, throughput, memory, CPU/GPU load
- Suppresses ΔĀ and ΔTd — these have no meaning for inference workloads
- Active deltas use latency as the primary signal (ΔTc for stabilization time, ΔO for oscillation, ΔF for outliers)
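To make the suppression behavior concrete, a preset can be thought of as a small configuration that overrides detector results. In this sketch only the preset id, the primary signal, and the two suppressed deltas come from the text; the dictionary layout and the `apply_preset` helper are hypothetical.

```python
# Field names below are hypothetical; only the preset id, the primary signal,
# and the suppressed deltas come from the text above.
TFRT_PRESET = {
    "id": "tensorflowrt-runtime-v1",
    "primary_signal": "latency",
    "suppressed": frozenset({"ΔĀ", "ΔTd"}),
}


def apply_preset(detected, preset):
    """Map each delta name to 'suppressed' if the preset suppresses it,
    otherwise keep the status the detector reported."""
    return {
        name: ("suppressed" if name in preset["suppressed"] else status)
        for name, status in detected.items()
    }
```

Applied to a result set, this leaves ΔTc, ΔO, and ΔF untouched and forces the two suppressed deltas to the suppressed status regardless of what their detectors found.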
TFRT guardrails
The preset raises warnings when:
- No steady-state milestone is found in the trace
- Warmup exceeds 50% of the run duration
- Only aggregated stats are available (disables time-based deltas entirely)
These guardrails appear as inline warnings in the Compare tab so you know when results may be limited.
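The three guardrail conditions could be checked along these lines. This is a sketch under assumptions: the function name, its parameters, and the warning strings are hypothetical, and it assumes that when only aggregated stats are available the other two checks are skipped because they depend on a time series.

```python
def tfrt_guardrail_warnings(steady_state_step, warmup_steps, total_steps,
                            has_time_series):
    """Return a list of warning strings for a single run."""
    warnings = []
    if not has_time_series:
        # Aggregated stats only: time-based deltas cannot be computed
        warnings.append(
            "Only aggregated stats available; time-based deltas disabled.")
        return warnings
    if steady_state_step is None:
        warnings.append("No steady-state milestone found in the trace.")
    if total_steps > 0 and warmup_steps / total_steps > 0.5:
        warnings.append("Warmup exceeds 50% of the run duration.")
    return warnings
```

An empty list means none of the guardrails fired for that run.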