diagnostics.sample_sufficiency
sample_sufficiency(df, input_cols, outcome_col)
Performs statistical tests of sampling sufficiency.
Runs three checks:
- Input Space Coverage (Gaps)
- Model Fit Stability (CV Score)
- Bootstrap Convergence (Coefficient of Variation)
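The three checks above can be illustrated with a minimal, standard-library sketch. This is not the library's implementation: the helper names (`max_gap_fraction`, `loo_cv_r2`, `bootstrap_slope_cv`), the choice of a one-input linear model, leave-one-out cross-validation, and 500 bootstrap resamples are all assumptions made for illustration. The thresholds that `sample_sufficiency` actually applies are not stated here, so none are shown.

```python
import random
import statistics

def max_gap_fraction(values):
    """Coverage check (assumed form): largest gap between sorted
    neighbours, as a fraction of the full input range."""
    xs = sorted(values)
    span = xs[-1] - xs[0]
    if span == 0:
        return 0.0
    return max(b - a for a, b in zip(xs, xs[1:])) / span

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single input."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def loo_cv_r2(xs, ys):
    """Fit stability check (assumed form): leave-one-out
    cross-validated R^2 of the linear fit."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(intercept + slope * xs[i])
    my = statistics.fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def bootstrap_slope_cv(xs, ys, n_boot=500, seed=0):
    """Convergence check (assumed form): coefficient of variation
    (std / |mean|) of the fitted slope across bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:  # degenerate resample: slope is undefined
            continue
        by = [ys[i] for i in idx]
        slopes.append(fit_line(bx, by)[0])
    return statistics.stdev(slopes) / abs(statistics.fmean(slopes))

# Same values as the 'Length' and 'Signal' columns in the Examples section.
length = [1.0, 2.5, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
signal = [0.5, 0.8, 1.2, 1.5, 1.8, 2.2, 2.5, 2.8, 3.2, 3.5]

gap = max_gap_fraction(length)
r2 = loo_cv_r2(length, signal)
cv = bootstrap_slope_cv(length, signal)
print(f"max gap fraction: {gap:.3f}")
print(f"leave-one-out R^2: {r2:.3f}")
print(f"bootstrap slope CV: {cv:.3f}")
```

A small gap fraction, a high cross-validated score, and a low coefficient of variation would each point toward a sufficient sample; where to draw the pass/fail line is a design choice left to the library.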
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `pd.DataFrame` | The simulation data. Validated internally via `validate_simulation`. | required |
| `input_cols` | `List[str]` | List of input variable names. | required |
| `outcome_col` | `str` | Name of the outcome variable. | required |
Returns
| Type | Description |
|---|---|
| `pd.DataFrame` | A table of pass/fail metrics for each test, including the threshold each metric was evaluated against. |
Examples
import pandas as pd
# sample_sufficiency is assumed to be already imported from the package's diagnostics module
df = pd.DataFrame({
'Length': [1.0, 2.5, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0],
'Signal': [0.5, 0.8, 1.2, 1.5, 1.8, 2.2, 2.5, 2.8, 3.2, 3.5]
})
report = sample_sufficiency(df, input_cols=['Length'], outcome_col='Signal')
print(report[['Test', 'Pass']])
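
To act on the report programmatically, the failing tests can be filtered out of the returned table. The sketch below uses a hand-built stand-in for `report`: only the `Test` and `Pass` column names come from the example above, while the row values and the assumption that `Pass` holds booleans are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the table returned by sample_sufficiency.
# The row values are made up; 'Pass' is assumed to be a boolean column.
report = pd.DataFrame({
    "Test": [
        "Input Space Coverage",
        "Model Fit Stability",
        "Bootstrap Convergence",
    ],
    "Pass": [True, False, True],
})

# Names of the tests that did not pass.
failed = report.loc[~report["Pass"], "Test"].tolist()

# True only if every test passed.
all_ok = bool(report["Pass"].all())

if not all_ok:
    print("Insufficient sample; failed checks:", ", ".join(failed))
```

If `Pass` turns out to be stored as strings rather than booleans, the mask would need to compare against the string value instead.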