Generalised Probability of Detection

In traditional non-destructive evaluation (NDE), the standard \(\hat{a}\)-versus-\(a\) approach relies on several restrictive assumptions:

1. Linearity of the signal response to defect size.
2. Homoscedasticity (constant variance) of the noise across all defect sizes.
3. Normality (Gaussian shape) of the residuals.

Interactions with complex physics, such as crack roughness or angle variations, often break these assumptions. digiqual implements a generalised framework based on Malkiel et al. (2025) to relax these constraints.

1. Automated Model Selection (Relaxing Linearity)

Instead of forcing a linear relationship \(\hat{a} = \beta_0 + \beta_1 a\), digiqual treats the choice of expectation model as a selection problem.

The Mathematics

It evaluates a pool of candidate models \(f(x)\), including polynomials up to degree \(10\) and Gaussian Process (Kriging) models. For each candidate, it performs \(K\)-fold Cross-Validation (CV) to estimate the Mean Squared Error (MSE):

\[ MSE_{CV} = \frac{1}{n} \sum_{i=1}^n \left(y_i - \hat{f}^{-k(i)}(x_i)\right)^2 \]

where \(\hat{f}^{-k(i)}\) is the model trained with the fold \(k(i)\) containing observation \(i\) held out. The model minimising this error is automatically selected, balancing bias and variance.
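
For illustration, a minimal Python sketch of this selection step is shown below, built on scikit-learn rather than digiqual's internal code; the exact model pool, kernel, and fold count are assumptions for the example.

```python
# Illustrative sketch (not digiqual's API): select the mean model by
# K-fold cross-validated MSE from polynomials and a Kriging (GP) model.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import cross_val_score

def select_mean_model(x, y, max_degree=10, k_folds=5):
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    candidates = {
        f"poly_{d}": make_pipeline(PolynomialFeatures(d), LinearRegression())
        for d in range(1, max_degree + 1)
    }
    candidates["kriging"] = GaussianProcessRegressor(
        kernel=RBF() + WhiteKernel(), normalize_y=True
    )
    # Negated because scikit-learn maximises scores.
    cv_mse = {
        name: -cross_val_score(
            m, x, y, cv=k_folds, scoring="neg_mean_squared_error"
        ).mean()
        for name, m in candidates.items()
    }
    best = min(cv_mse, key=cv_mse.get)
    return best, candidates[best].fit(x, y)
```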

2. Modelling Heteroscedasticity (Relaxing Constant Variance)

Signal scatter often increases with flaw size. digiqual abandons the constant-variance assumption \(\sigma^2 = c\), replacing it with a localised estimate \(\sigma^2(x)\).

The Mathematics

It implements a Nadaraya-Watson Gaussian kernel average smoother. The variance at any point \(x\) is calculated as a weighted average of the surrounding squared residuals \(e_i^2 = (y_i - \hat{y}_i)^2\):

\[ \hat{\sigma}^2(x) = \frac{\sum_{i=1}^n K_h(x - x_i) \, e_i^2}{\sum_{i=1}^n K_h(x - x_i)} \]

where \(K_h\) is the Gaussian kernel with bandwidth \(h\):

\[ K_h(u) = \frac{1}{\sqrt{2\pi h^2}} \exp\left(-\frac{u^2}{2h^2}\right) \]

The optimal bandwidth \(h\) is determined automatically via Leave-One-Out Cross-Validation (LOO-CV).
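
The sketch below illustrates the smoother in Python. The LOO-CV objective shown (squared error of the smoothed squared residuals) and the candidate bandwidth grid are illustrative choices, not necessarily digiqual's exact procedure.

```python
# Illustrative sketch: Nadaraya-Watson smoothing of squared residuals,
# with the bandwidth h chosen by leave-one-out cross-validation.
import numpy as np

def nw_variance(x_eval, x, sq_resid, h):
    """sigma^2(x) as a kernel-weighted average of squared residuals."""
    x_eval, x, sq_resid = map(np.asarray, (x_eval, x, sq_resid))
    u = (x_eval[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)  # the 1/sqrt(2*pi*h^2) factor cancels in the ratio
    return (w @ sq_resid) / w.sum(axis=1)

def loo_bandwidth(x, sq_resid, candidates):
    """Pick h minimising the LOO squared error on the squared residuals."""
    x, sq_resid = np.asarray(x), np.asarray(sq_resid)
    best_h, best_err = None, np.inf
    for h in candidates:
        w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
        np.fill_diagonal(w, 0.0)  # exclude each point from its own estimate
        pred = (w @ sq_resid) / w.sum(axis=1)
        err = np.mean((sq_resid - pred) ** 2)
        if err < best_err:
            best_h, best_err = h, err
    return best_h
```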

3. Inferring Error Distributions (Relaxing Normality)

Noise is rarely perfectly Gaussian. digiqual instead infers the shape of the error distribution from the data.

The Mathematics

Residuals are first standardised into \(Z\)-scores using the local standard deviation:

\[ Z_i = \frac{y_i - \hat{f}(x_i)}{\hat{\sigma}(x_i)} \]

These \(Z\)-scores are then fitted against a suite of candidate distributions: Normal, Gumbel, Logistic, Laplace, and Student's \(t\). The best-fitting distribution is selected using the Akaike Information Criterion (AIC):

\[ AIC = 2k - 2\ln(\hat{L}) \]

where \(k\) is the number of fitted parameters and \(\hat{L}\) is the maximised value of the likelihood.
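
A minimal Python sketch of this AIC-based selection, using the scipy.stats equivalents of the five candidate distributions (illustrative, not digiqual's API):

```python
# Illustrative sketch: fit each candidate distribution to the Z-scores by
# maximum likelihood and keep the one with the lowest AIC.
import numpy as np
from scipy import stats

CANDIDATES = {
    "normal": stats.norm,
    "gumbel": stats.gumbel_r,
    "logistic": stats.logistic,
    "laplace": stats.laplace,
    "student_t": stats.t,
}

def select_error_distribution(z):
    best = (None, np.inf, None)
    for name, dist in CANDIDATES.items():
        params = dist.fit(z)                # maximum-likelihood estimates
        loglik = dist.logpdf(z, *params).sum()
        aic = 2 * len(params) - 2 * loglik  # AIC = 2k - 2 ln(L-hat)
        if aic < best[1]:
            best = (name, aic, params)
    return best  # (name, aic, fitted parameters)
```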

4. Multi-Dimensional Integration (Active Marginalisation)

When a multidimensional dataset includes nuisance variations (e.g., angle \(\theta\), roughness \(R\)), digiqual uses Monte Carlo integration to produce a 1D or 2D marginal Probability of Detection.

The Mathematics

The expected PoD for a Parameter of Interest (PoI) vector \(\mathbf{x}\) is calculated by integrating the conditional PoD over the joint probability density function \(p(\mathbf{z})\) of the nuisance variables \(\mathbf{z}\):

\[ PoD(\mathbf{x}) = \int_{\Omega_z} PoD(\mathbf{x}, \mathbf{z}) \, p(\mathbf{z}) \, d\mathbf{z} \]

digiqual approximates this integral numerically via Latin Hypercube Sampling (LHS) across the continuous physics space.
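
The sketch below illustrates the idea in Python using scipy's LHS sampler, assuming for simplicity that \(p(\mathbf{z})\) is uniform over a bounding box; `pod_conditional` is a hypothetical stand-in for the fitted conditional PoD model.

```python
# Illustrative sketch: marginalise nuisance variables z by averaging the
# conditional PoD over a Latin Hypercube sample. Assumes p(z) is uniform
# over the box defined by z_bounds; `pod_conditional` is hypothetical.
import numpy as np
from scipy.stats import qmc

def marginal_pod(pod_conditional, x_grid, z_bounds, n_samples=2048, seed=0):
    low, high = np.asarray(z_bounds, dtype=float).T  # z_bounds: [(lo, hi), ...]
    sampler = qmc.LatinHypercube(d=len(low), seed=seed)
    z = qmc.scale(sampler.random(n_samples), low, high)
    # Monte Carlo estimate: PoD(x) is approximated by the mean of
    # PoD(x, z) over the LHS draws of z, at each x in the grid.
    return np.array([pod_conditional(x, z).mean() for x in x_grid])
```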

5. Bootstrapping for Confidence Bounds

Because the fitted model may be non-linear, heteroscedastic, and non-Gaussian, traditional analytical confidence bounds derived from the linear model are invalid.

The Process

digiqual uses bootstrap resampling, as sketched below:

1. Draw \(n\) samples with replacement from the original dataset.
2. Completely re-fit the mean model, variance model, and error distribution.
3. Calculate the PoD curve.
4. Repeat steps 1-3 \(B\) times (typically \(B = 1000\)).
5. Extract the \(2.5^{th}\) and \(97.5^{th}\) percentiles at every point \(a\) to form the robust 95% confidence interval \(PoD_{95}\).
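
A minimal Python sketch of this loop; `fit_pipeline` and `pod_curve` are hypothetical stand-ins for digiqual's full re-fitting and PoD-evaluation steps.

```python
# Illustrative sketch of the bootstrap loop; `fit_pipeline` and `pod_curve`
# are hypothetical stand-ins for the full re-fit and the PoD evaluation.
import numpy as np

def bootstrap_pod_bounds(x, y, a_grid, fit_pipeline, pod_curve,
                         n_boot=1000, seed=0):
    x, y = np.asarray(x), np.asarray(y)
    rng = np.random.default_rng(seed)
    curves = np.empty((n_boot, len(a_grid)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))  # resample with replacement
        model = fit_pipeline(x[idx], y[idx])        # re-fit the whole pipeline
        curves[b] = pod_curve(model, a_grid)
    lower, upper = np.percentile(curves, [2.5, 97.5], axis=0)
    return lower, upper  # pointwise 95% band at each flaw size a
```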