evolve_features: Run evolutionary feature engineering

evolve_features(
  data,
  target_col,
  task = "classification",
  generations = 10,
  pop_size = 10,
  cv_folds = 3,
  evaluation_strategy = "cv",
  split_ratio = c(0.6, 0.2, 0.2),
  split_ids = NULL,
  early_stopping_rounds = 3,
  evaluator = "lightgbm",
  dynamic_population = TRUE,
  crossover_type = "both",
  threads = 8,
  max_clustering_size = 5000,
  seed = NULL,
  verbose = TRUE
)

data: A data.frame or data.table.
target_col: Name of the target column.
task: "classification" or "regression".
generations: Number of generations (maximum iterations).
pop_size: Population size.
cv_folds: Number of cross-validation folds.
evaluation_strategy: "cv" or "split". Strategy used to evaluate candidate recipes.
split_ratio: A numeric vector of length 2 or 3 defining train/validation/holdout proportions (e.g. c(0.6, 0.2, 0.2)).
split_ids: An optional character vector of split assignments (e.g. "train", "val", "holdout").
early_stopping_rounds: Stop if fitness does not improve for this many generations.
evaluator: The ML model to use ("lightgbm" or "xgboost").
dynamic_population: Logical. If TRUE, the population expands dynamically during stagnation.
crossover_type: Crossover type: "both" (default, 50% random / 50% union), "random", or "union".
threads: Number of threads to use for parallel execution (default 8).
max_clustering_size: Maximum number of unique training rows to cluster (default 5000; 0 or NULL for unlimited).
seed: Optional integer seed for reproducibility.
verbose: Logical. If TRUE, prints progress.
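
A minimal sketch of how the split-based evaluation arguments might be used together. The synthetic data, the column names (x1, x2, x3, y) and the result variable are illustrative only. It assumes that split_ids takes one label per row of data and that an integer 0/1 target is accepted for classification; neither is stated above, and the return value is not documented here. The call itself is guarded so the snippet runs even when the package that provides evolve_features() is not loaded.

```r
# Synthetic binary-classification data (illustrative only).
set.seed(42)
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = runif(n))
df$y <- as.integer(df$x1 * df$x2 + df$x3 > 0.5)

# Build one split label per row in 60/20/20 proportions.
# Assumption: split_ids has the same length as nrow(data).
split_ratio <- c(0.6, 0.2, 0.2)
n_train <- round(split_ratio[1] * n)
n_val <- round(split_ratio[2] * n)
labels <- c(
  rep("train", n_train),
  rep("val", n_val),
  rep("holdout", n - n_train - n_val)
)
split_ids <- sample(labels)  # shuffle the labels across rows

# Inspect the resulting split sizes.
print(table(split_ids))

# Guarded call: only runs if evolve_features() is available in the session.
if (exists("evolve_features", mode = "function")) {
  result <- evolve_features(
    data = df,
    target_col = "y",
    task = "classification",
    generations = 5,
    pop_size = 10,
    evaluation_strategy = "split",
    split_ids = split_ids,
    early_stopping_rounds = 3,
    evaluator = "lightgbm",
    threads = 2,
    seed = 42,
    verbose = TRUE
  )
}
```

Passing explicit split_ids keeps the train/validation/holdout assignment fixed across runs, which can make comparisons between settings such as crossover_type or evaluator easier to interpret. With evaluation_strategy = "cv", cv_folds would presumably be the relevant setting instead.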