pod.bootstrap_pod_ci

bootstrap_pod_ci(
    X,
    y,
    X_eval,
    threshold,
    model_type,
    model_params,
    bandwidth,
    dist_info,
    n_boot=1000,
    nuisance_ranges=None,
    n_jobs=None,
    feature_names=None,
    poi_names=None,
)

Estimates 95% confidence bounds for the PoD curve via bootstrapping.

This function resamples the original data with replacement n_boot times. For each resample, it refits the mean model (dynamically rebuilding either a Polynomial or a Kriging model), recalculates the residuals, and generates a new PoD curve. If Kriging is selected, the optimizer is disabled during bootstrapping so that the computation stays tractable.
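The resample, refit, and percentile steps described above can be illustrated with a deliberately simplified, one-dimensional sketch. This is not the library's implementation: the function name sketch_bootstrap_pod_ci, the fixed normal error with standard deviation sigma, and the synthetic data are all assumptions made for illustration, and the sketch ignores the bandwidth smoothing, the Kriging branch, and nuisance variables.

```python
import math
import numpy as np


def sketch_bootstrap_pod_ci(x, y, x_eval, threshold, degree=1,
                            sigma=1.0, n_boot=200, seed=0):
    """Hypothetical percentile bootstrap for a 1-D PoD curve.

    Assumes a polynomial mean model and a normal error distribution
    whose scale (sigma) is held fixed across resamples.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    erf = np.vectorize(math.erf)
    curves = np.empty((n_boot, len(x_eval)))
    for b in range(n_boot):
        # Resample (x, y) pairs with replacement.
        idx = rng.integers(0, n, size=n)
        # Refit the mean model on the resample.
        coeffs = np.polyfit(x[idx], y[idx], degree)
        mean_eval = np.polyval(coeffs, x_eval)
        # PoD(x) = P(mean(x) + eps > threshold), eps ~ N(0, sigma^2).
        z = (threshold - mean_eval) / sigma
        curves[b] = 1.0 - 0.5 * (1.0 + erf(z / math.sqrt(2.0)))
    # Pointwise 2.5th and 97.5th percentiles across bootstrap curves.
    lower = np.percentile(curves, 2.5, axis=0)
    upper = np.percentile(curves, 97.5, axis=0)
    return lower, upper


# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
x = np.linspace(0.1, 5.0, 60)
y = 0.2 * x + rng.normal(0.0, 0.1, size=60)
x_eval = np.linspace(0.1, 5.0, 25)
lower, upper = sketch_bootstrap_pod_ci(x, y, x_eval, threshold=0.5, sigma=0.1)
```

Each row of curves is one bootstrap PoD curve on the evaluation grid, so the bounds are taken pointwise along axis 0.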

Parameters

Name Type Description Default
X np.ndarray Original input data. required
y np.ndarray Original outcome data. required
X_eval np.ndarray Grid points for evaluation. required
threshold float Detection threshold. required
model_type str The type of mean model ('Polynomial' or 'Kriging'). required
model_params Any Model parameters (integer degree for Polynomial, kernel for Kriging). required
bandwidth float Smoothing bandwidth (fixed from original fit). required
dist_info Tuple[str, Tuple] Error distribution (fixed from original fit). required
n_boot int Number of bootstrap iterations. Defaults to 1000. 1000
n_jobs int | None Number of CPU cores to use. None or 1 means single-core execution (no parallelisation). -1 means use all available cores minus one. Defaults to None. None
feature_names list Names of all feature columns in X, in the exact same order as the columns appear in X. For one-dimensional inputs this can be omitted or contain a single name, but for multi-dimensional bootstrapping it is used to identify which variables are parameters of interest versus nuisance variables. None
poi_names list Names of the parameters of interest (PoIs). Each entry must correspond to a name in feature_names. During multi-dimensional bootstrapping, PoD curves are evaluated and resampled with respect to these variables, while any remaining features in X are treated as nuisance variables. This should therefore be provided whenever X has multiple columns and PoIs need to be distinguished from nuisance inputs. None
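To make the relationship between feature_names and poi_names concrete, the following sketch shows one way the columns of X could be split into parameters of interest and nuisance variables. The helper split_poi_and_nuisance and the example feature names are hypothetical and are not part of the documented API; the library may resolve the names differently.

```python
def split_poi_and_nuisance(feature_names, poi_names):
    """Hypothetical helper: map names to column indices of X."""
    # Every PoI must appear in feature_names, as the table requires.
    missing = [name for name in poi_names if name not in feature_names]
    if missing:
        raise ValueError(f"poi_names not found in feature_names: {missing}")
    poi_idx = [feature_names.index(name) for name in poi_names]
    # All remaining columns are treated as nuisance variables.
    nuisance_idx = [i for i, name in enumerate(feature_names)
                    if name not in poi_names]
    return poi_idx, nuisance_idx


# Illustrative names only; the order must match the columns of X.
feature_names = ["crack_length", "probe_angle", "liftoff"]
poi_idx, nuisance_idx = split_poi_and_nuisance(feature_names, ["crack_length"])
```

With these inputs, column 0 is the parameter of interest and columns 1 and 2 are nuisance variables.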

Returns

Name Type Description
Tuple[np.ndarray, np.ndarray] A tuple (lower_ci, upper_ci), where lower_ci is the 2.5th percentile PoD curve (lower 95% bound) and upper_ci is the 97.5th percentile PoD curve (upper 95% bound).

Examples

# Generate 95% confidence bounds
lower, upper = bootstrap_pod_ci(
    X, y, X_eval, threshold=0.5,
    model_type='Polynomial', model_params=3,
    bandwidth=1.5, dist_info=('norm', (0, 1)), n_boot=100
)