adaptive.generate_targeted_samples

generate_targeted_samples(
    df,
    input_cols,
    outcome_col,
    n_new_per_fix=10,
    failed_data=None,
    distance_threshold=0.05,
)

Active Learning Engine: Generates new samples based on diagnostic failures.

It consumes the results table from sample_sufficiency.

- If Input Coverage fails -> Triggers _fill_gaps (Exploration).
- If Model Fit or Bootstrap fails -> Triggers _sample_uncertainty (Exploitation).
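The routing described above can be pictured with a small sketch. This is not the library's implementation: the function name choose_strategies, the check-name sets, and the dict-of-booleans input are all hypothetical stand-ins, since the exact layout of the sample_sufficiency results table is not shown here.

```python
from typing import Dict, List

# Hypothetical check names; the real results table from sample_sufficiency
# may label its rows differently.
EXPLORATION_CHECKS = {"Input Coverage"}
EXPLOITATION_CHECKS = {"Model Fit", "Bootstrap"}


def choose_strategies(check_passed: Dict[str, bool]) -> List[str]:
    """Return the refinement strategies implied by the failed checks."""
    failed = {name for name, ok in check_passed.items() if not ok}
    strategies = []
    if failed & EXPLORATION_CHECKS:
        # A coverage failure calls for exploration (gap filling).
        strategies.append("fill_gaps")
    if failed & EXPLOITATION_CHECKS:
        # A fit or bootstrap failure calls for exploitation (uncertainty sampling).
        strategies.append("sample_uncertainty")
    return strategies


print(choose_strategies(
    {"Input Coverage": False, "Model Fit": True, "Bootstrap": False}
))
# ['fill_gaps', 'sample_uncertainty']
```

Under this reading, each triggered strategy would contribute n_new_per_fix samples, so two failing categories would yield up to twice that many rows.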

Parameters

Name Type Description Default
df pd.DataFrame Current simulation data. required
input_cols List[str] Input variable names. required
outcome_col str Outcome variable name. required
n_new_per_fix int Number of samples to generate per detected issue. 10
failed_data Optional[pd.DataFrame] Graveyard of inputs that crashed the solver. None
distance_threshold float Minimum normalized distance to maintain from failed points. 0.05
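
One plausible reading of failed_data and distance_threshold is sketched below: candidate points are rescaled to the unit range and any candidate closer than the threshold to a failed point is dropped. The helper drop_near_failures, its bounds argument, the min-max scaling, and the Euclidean metric are assumptions made for illustration; the library may normalize or measure distance differently.

```python
import numpy as np
import pandas as pd


def drop_near_failures(candidates, failed_data, input_cols, bounds,
                       distance_threshold=0.05):
    """Keep candidates at least `distance_threshold` away (in normalized
    units) from every failed point. `bounds` maps column -> (low, high)
    and is a hypothetical argument used only for this sketch."""
    if failed_data is None or failed_data.empty:
        return candidates
    low = np.array([bounds[c][0] for c in input_cols], dtype=float)
    span = np.array([bounds[c][1] - bounds[c][0] for c in input_cols], dtype=float)
    span[span == 0] = 1.0  # avoid division by zero for constant inputs
    cand = (candidates[input_cols].to_numpy(dtype=float) - low) / span
    fail = (failed_data[input_cols].to_numpy(dtype=float) - low) / span
    # Pairwise Euclidean distances, shape (n_candidates, n_failed).
    dists = np.linalg.norm(cand[:, None, :] - fail[None, :, :], axis=2)
    keep = dists.min(axis=1) >= distance_threshold
    return candidates.loc[keep].reset_index(drop=True)


candidates = pd.DataFrame({"Length": [5.0, 5.2, 8.0]})
failed = pd.DataFrame({"Length": [5.1]})
kept = drop_near_failures(candidates, failed, ["Length"], {"Length": (0.0, 10.0)})
print(kept["Length"].tolist())
# [8.0]
```

Here 5.0 and 5.2 sit about 0.01 normalized units from the failed point at 5.1, which is below the 0.05 threshold, so only 8.0 survives.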

Returns

Name Type Description
pd.DataFrame New recommended samples.

Examples

import pandas as pd
# Assumes generate_targeted_samples has already been imported from the adaptive module.

# 1. Set up data with a large gap in 'Length' (values near 0-1, then 9-10)
df = pd.DataFrame({'Length': [0.1, 0.9, 9.1, 9.9], 'Signal': [1, 1, 1, 1]})

# 2. Ask for new samples to fix the gap
new_pts = generate_targeted_samples(
    df=df,
    input_cols=['Length'],
    outcome_col='Signal',
    n_new_per_fix=2
)
print(new_pts)
#    Length Refinement_Reason
# 0     5.4      Gap in Length
# 1     3.2      Gap in Length