Last time I built a model using a neural network.
This time I'd like to do the modeling with AutoML.
Once you get used to AutoML, you can build a reasonably accurate model with very little effort as long as the input data is in good shape, which is a bit disconcerting lol
This time I'll use three AutoML libraries.
・mljar
・AutoGluon
・auto-sklearn
The code for building the modeling dataset is in "Preparing the data for modeling," so please refer to that post.
From here on, I proceed on the assumption that the four variables X_train, y_train, X_test, and y_test (training and test data) have already been created.
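For reference, here is a minimal sketch of what that setup might look like (assuming a preprocessed DataFrame df with a binary target column named "target"; the actual split in the data-prep article may differ):
# minimal sketch, assuming df is the preprocessed heart-disease DataFrame with a "target" column
from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.2, stratify=df["target"], random_state=100)
X_train, y_train = train.drop(columns="target"), train["target"]
X_test, y_test = test.drop(columns="target"), test["target"]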
Modeling with mljar
from supervised.automl import AutoML
automl = AutoML(mode="Compete", random_state=100)  # "Compete" is the most thorough preset (CV, ensembling, stacking)
# train with fit
automl.fit(X_train, y_train)
AutoML directory: AutoML_1 The task is binary_classification with evaluation metric logloss AutoML will use algorithms: ['Decision Tree', 'Linear', 'Random Forest', 'Extra Trees', 'LightGBM', 'Xgboost', 'CatBoost', 'Neural Network', 'Nearest Neighbors'] AutoML will stack models AutoML will ensemble available models AutoML steps: ['adjust_validation', 'simple_algorithms', 'default_algorithms', 'not_so_random', 'golden_features', 'kmeans_features', 'insert_random_feature', 'features_selection', 'hill_climbing_1', 'hill_climbing_2', 'boost_on_errors', 'ensemble', 'stack', 'ensemble_stacked'] * Step adjust_validation will try to check up to 1 model 1_DecisionTree logloss 2.02506 trained in 0.97 seconds Adjust validation. Remove: 1_DecisionTree *** Disable stacking for small dataset (nrows < 500) Validation strategy: 10-fold CV Shuffle,Stratify * Step simple_algorithms will try to check up to 4 models 1_DecisionTree logloss 1.177056 trained in 3.0 seconds 2_DecisionTree logloss 0.701035 trained in 2.67 seconds 3_DecisionTree logloss 0.53853 trained in 3.58 seconds 4_Linear logloss 0.390309 trained in 7.7 seconds * Step default_algorithms will try to check up to 7 models 5_Default_LightGBM logloss 0.433876 trained in 20.82 seconds 6_Default_Xgboost logloss 0.451812 trained in 5.31 seconds 7_Default_CatBoost logloss 0.404469 trained in 6.43 seconds 8_Default_NeuralNetwork logloss 0.498901 trained in 6.26 seconds 9_Default_RandomForest logloss 0.424915 trained in 13.36 seconds 10_Default_ExtraTrees logloss 0.426999 trained in 11.49 seconds 11_Default_NearestNeighbors logloss 1.001154 trained in 5.8 seconds * Step not_so_random will try to check up to 61 models 21_LightGBM logloss 0.407261 trained in 5.82 seconds 12_Xgboost logloss 0.374154 trained in 9.48 seconds 30_CatBoost logloss 0.38979 trained in 7.77 seconds 39_RandomForest logloss 0.42454 trained in 16.44 seconds 48_ExtraTrees logloss 0.422043 trained in 15.21 seconds 57_NeuralNetwork logloss 0.508284 trained in 7.92 seconds 66_NearestNeighbors logloss 0.713205 trained in 6.84 seconds 22_LightGBM logloss 0.378577 trained in 9.72 seconds 13_Xgboost logloss 0.405763 trained in 8.78 seconds 31_CatBoost logloss 0.408574 trained in 12.41 seconds 40_RandomForest logloss 0.397487 trained in 12.54 seconds 49_ExtraTrees logloss 0.399412 trained in 11.95 seconds 58_NeuralNetwork logloss 0.499919 trained in 9.13 seconds 67_NearestNeighbors logloss 0.714694 trained in 7.49 seconds 23_LightGBM logloss 0.469583 trained in 8.28 seconds 14_Xgboost logloss 0.438485 trained in 9.95 seconds 32_CatBoost logloss 0.427753 trained in 21.26 seconds 41_RandomForest logloss 0.424355 trained in 19.32 seconds 50_ExtraTrees logloss 0.43091 trained in 23.07 seconds 59_NeuralNetwork logloss 0.620012 trained in 9.87 seconds 68_NearestNeighbors logloss 1.162394 trained in 8.58 seconds 24_LightGBM logloss 0.420758 trained in 9.82 seconds 15_Xgboost logloss 0.414963 trained in 11.08 seconds 33_CatBoost logloss 0.402595 trained in 11.54 seconds 42_RandomForest logloss 0.422792 trained in 17.52 seconds 51_ExtraTrees logloss 0.438129 trained in 15.44 seconds 60_NeuralNetwork logloss 0.714978 trained in 11.14 seconds 69_NearestNeighbors logloss 1.162394 trained in 10.22 seconds 25_LightGBM logloss 0.407529 trained in 10.94 seconds 16_Xgboost logloss 0.690224 trained in 11.63 seconds 34_CatBoost logloss 0.423424 trained in 29.71 seconds 43_RandomForest logloss 0.411063 trained in 18.97 seconds 52_ExtraTrees logloss 0.446924 trained in 23.67 seconds 61_NeuralNetwork logloss 
0.819889 trained in 14.32 seconds 70_NearestNeighbors logloss 0.714694 trained in 15.81 seconds 26_LightGBM logloss 0.378221 trained in 13.95 seconds 17_Xgboost logloss 0.693147 trained in 13.63 seconds 35_CatBoost logloss 0.408431 trained in 18.64 seconds 44_RandomForest logloss 0.428817 trained in 25.81 seconds 53_ExtraTrees logloss 0.468533 trained in 26.32 seconds 62_NeuralNetwork logloss 0.444628 trained in 15.5 seconds 71_NearestNeighbors logloss 1.165557 trained in 13.86 seconds 27_LightGBM logloss 0.402419 trained in 15.58 seconds 18_Xgboost logloss 0.693147 trained in 15.26 seconds 36_CatBoost logloss 0.415805 trained in 18.93 seconds 45_RandomForest logloss 0.430171 trained in 20.91 seconds 54_ExtraTrees logloss 0.459077 trained in 18.89 seconds 63_NeuralNetwork logloss 0.562846 trained in 15.65 seconds 72_NearestNeighbors logloss 1.162394 trained in 14.11 seconds 28_LightGBM logloss 0.411542 trained in 14.82 seconds 19_Xgboost logloss 0.384715 trained in 16.08 seconds 37_CatBoost logloss 0.389346 trained in 15.76 seconds 46_RandomForest logloss 0.424571 trained in 22.12 seconds 55_ExtraTrees logloss 0.469758 trained in 23.9 seconds 64_NeuralNetwork logloss 0.760196 trained in 16.17 seconds 29_LightGBM logloss 0.394738 trained in 15.96 seconds 20_Xgboost logloss 0.693147 trained in 15.5 seconds 38_CatBoost logloss 0.404094 trained in 18.1 seconds 47_RandomForest logloss 0.424042 trained in 22.77 seconds 56_ExtraTrees logloss 0.443431 trained in 22.56 seconds 65_NeuralNetwork logloss 0.506584 trained in 17.8 seconds * Step golden_features will try to check up to 3 models None 10 Add Golden Feature: thal_7.0_sum_cp_4.0 Add Golden Feature: cp_4.0_sum_ca Add Golden Feature: thal_7.0_sum_slope_2.0 Add Golden Feature: slope_2.0_sum_cp_4.0 Add Golden Feature: thal_7.0_ratio_slope_2.0 Add Golden Feature: slope_2.0_ratio_thal_7.0 Add Golden Feature: thal_7.0_multiply_slope_2.0 Add Golden Feature: thal_7.0_sum_thal_6.0 Add Golden Feature: slope_2.0_sum_exang Add Golden Feature: cp_3.0_diff_slope_2.0 Created 10 Golden Features in 9.24 seconds. 
12_Xgboost_GoldenFeatures logloss 0.385574 trained in 27.6 seconds 26_LightGBM_GoldenFeatures logloss 0.3848 trained in 17.88 seconds 22_LightGBM_GoldenFeatures logloss 0.387534 trained in 17.93 seconds * Step kmeans_features will try to check up to 3 models 12_Xgboost_KMeansFeatures logloss 0.404483 trained in 20.09 seconds 26_LightGBM_KMeansFeatures logloss 0.398293 trained in 19.17 seconds 22_LightGBM_KMeansFeatures logloss 0.39724 trained in 19.56 seconds * Step insert_random_feature will try to check up to 1 model 12_Xgboost_RandomFeature logloss 0.380384 trained in 21.54 seconds Drop features ['chol', 'thal_6.0', 'slope_3.0', 'restecg_1.0', 'cp_3.0', 'fbs', 'cp_2.0', 'trestbps', 'random_feature', 'thalach'] * Step features_selection will try to check up to 6 models 12_Xgboost_SelectedFeatures logloss 0.366608 trained in 19.46 seconds 26_LightGBM_SelectedFeatures logloss 0.358819 trained in 18.94 seconds 37_CatBoost_SelectedFeatures logloss 0.371678 trained in 20.21 seconds 40_RandomForest_SelectedFeatures logloss 0.406463 trained in 24.95 seconds 49_ExtraTrees_SelectedFeatures logloss 0.398534 trained in 26.73 seconds 62_NeuralNetwork_SelectedFeatures logloss 0.42059 trained in 20.64 seconds * Step hill_climbing_1 will try to check up to 25 models 73_LightGBM_SelectedFeatures logloss 0.359481 trained in 20.38 seconds 74_LightGBM_SelectedFeatures logloss 0.357452 trained in 20.12 seconds 75_Xgboost_SelectedFeatures logloss 0.366608 trained in 21.15 seconds 76_CatBoost_SelectedFeatures logloss 0.379173 trained in 22.7 seconds 77_Xgboost logloss 0.374154 trained in 22.12 seconds 78_LightGBM logloss 0.378577 trained in 22.85 seconds 79_LightGBM logloss 0.374824 trained in 22.25 seconds 80_LightGBM logloss 0.378221 trained in 21.11 seconds 81_Xgboost logloss 0.384715 trained in 31.71 seconds 82_CatBoost logloss 0.406751 trained in 25.02 seconds 83_RandomForest logloss 0.409271 trained in 31.82 seconds 84_ExtraTrees_SelectedFeatures logloss 0.397793 trained in 31.39 seconds 85_ExtraTrees logloss 0.419875 trained in 36.41 seconds 86_RandomForest_SelectedFeatures logloss 0.39693 trained in 27.03 seconds 87_RandomForest logloss 0.413053 trained in 27.54 seconds 88_RandomForest logloss 0.412969 trained in 27.45 seconds 89_NeuralNetwork_SelectedFeatures logloss 0.548245 trained in 23.17 seconds 90_ExtraTrees logloss 0.429518 trained in 28.52 seconds 91_NeuralNetwork logloss 0.429078 trained in 23.99 seconds 92_NeuralNetwork logloss 0.482681 trained in 24.22 seconds 93_NeuralNetwork logloss 0.453291 trained in 23.93 seconds 94_DecisionTree logloss 0.575294 trained in 23.18 seconds 95_DecisionTree logloss 0.535709 trained in 23.17 seconds 96_DecisionTree logloss 1.543826 trained in 23.22 seconds 97_DecisionTree logloss 0.587037 trained in 23.41 seconds * Step hill_climbing_2 will try to check up to 30 models 98_LightGBM_SelectedFeatures logloss 0.357452 trained in 24.23 seconds 99_LightGBM_SelectedFeatures logloss 0.358819 trained in 24.8 seconds 100_LightGBM_SelectedFeatures logloss 0.359481 trained in 25.31 seconds 101_Xgboost_SelectedFeatures logloss 0.381465 trained in 25.55 seconds 102_Xgboost_SelectedFeatures logloss 0.361988 trained in 25.85 seconds 103_Xgboost_SelectedFeatures logloss 0.381465 trained in 26.03 seconds 104_Xgboost_SelectedFeatures logloss 0.361988 trained in 26.39 seconds 105_CatBoost_SelectedFeatures logloss 0.370163 trained in 25.67 seconds 106_Xgboost logloss 0.385467 trained in 26.45 seconds 107_Xgboost logloss 0.379588 trained in 26.93 seconds 
108_CatBoost_SelectedFeatures logloss 0.377549 trained in 26.41 seconds 109_CatBoost logloss 0.393326 trained in 26.61 seconds 110_RandomForest_SelectedFeatures logloss 0.390588 trained in 31.92 seconds 111_RandomForest_SelectedFeatures logloss 0.395898 trained in 32.28 seconds 112_RandomForest logloss 0.40556 trained in 31.96 seconds 113_RandomForest logloss 0.39588 trained in 32.23 seconds 114_ExtraTrees_SelectedFeatures logloss 0.396858 trained in 31.19 seconds 115_ExtraTrees_SelectedFeatures logloss 0.398634 trained in 31.75 seconds 116_ExtraTrees_SelectedFeatures logloss 0.397028 trained in 32.86 seconds 117_ExtraTrees_SelectedFeatures logloss 0.401296 trained in 34.35 seconds 118_ExtraTrees logloss 0.394118 trained in 32.23 seconds 119_ExtraTrees logloss 0.405253 trained in 58.66 seconds 120_RandomForest_SelectedFeatures logloss 0.402852 trained in 64.98 seconds 121_RandomForest_SelectedFeatures logloss 0.40528 trained in 47.67 seconds 122_NeuralNetwork_SelectedFeatures logloss 0.461389 trained in 35.11 seconds 123_NeuralNetwork_SelectedFeatures logloss 0.49609 trained in 52.19 seconds 124_NeuralNetwork logloss 0.576635 trained in 52.01 seconds 125_NeuralNetwork logloss 0.600478 trained in 52.65 seconds 126_NeuralNetwork logloss 0.641598 trained in 52.85 seconds 127_NeuralNetwork logloss 0.466681 trained in 53.34 seconds * Step boost_on_errors will try to check up to 1 model 98_LightGBM_SelectedFeatures_BoostOnErrors logloss 0.366409 trained in 51.7 seconds * Step ensemble will try to check up to 1 model Ensemble logloss 0.355239 trained in 201.76 seconds AutoML fit time: 3618.58 seconds AutoML best model: Ensemble
It took 3,618 seconds (roughly an hour).
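If an hour-long run is too much, mljar-supervised also accepts a total_time_limit (in seconds) to cap the whole search; a minimal sketch (the 600-second budget below is just an example, not what I used here):
# cap the entire AutoML search at roughly 10 minutes
automl_quick = AutoML(mode="Compete", total_time_limit=600, random_state=100)
automl_quick.fit(X_train, y_train)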
# check how well the model fits the training and test data
from sklearn.metrics import accuracy_score
y_train_pred = automl.predict(X_train)
y_test_pred = automl.predict(X_test)
print("train",accuracy_score(y_train, y_train_pred))
print("test",accuracy_score(y_test, y_test_pred))
train 0.871900826446281
test 0.9180327868852459
A "UserWarning: X has feature names, but StandardScaler was fitted without feature names" appeared, but I'll just ignore it for now lol
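If the warning bothers you, one likely workaround is to hand predict() a plain NumPy array so the column-name check never fires; a sketch (an untested assumption on my part, and the column order must match X_train):
# pass raw values without column names; column order must be identical to the training data
y_test_pred = automl.predict(X_test.to_numpy())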
The accuracy is better than both logistic regression and the neural network.
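For reference, mljar also keeps a leaderboard of everything it trained and can render an HTML report; a minimal sketch (method names as I understand them from the mljar-supervised docs):
# compare all trained models by their validation logloss
print(automl.get_leaderboard())
# render the full AutoML report (displays inline in a notebook)
automl.report()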
Modeling with AutoGluon (default settings)
AutoGluon needs to be passed data that still contains the target variable, so instead of X_train, y_train, X_test, and y_test I use the train and test variables directly.
For binary problems, AutoGluon's eval_metric apparently defaults to accuracy. mljar used logloss, so I could match the metric (a sketch of setting it explicitly follows the quote below), but here I'll keep the default.
If eval_metric = None, it is automatically chosen based on problem_type
Source: https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.html
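If I did want to line the metric up with mljar, it could presumably be set explicitly when constructing the predictor, for example:
from autogluon.tabular import TabularPredictor
# explicitly optimize log loss instead of the default accuracy (not used in this post)
predictor_logloss = TabularPredictor(label="target", problem_type="binary", eval_metric="log_loss")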
I'll try two patterns: (1) running with the default settings as-is, and (2) using AutoMLPipelineFeatureGenerator and matching the time limit to mljar's run time.
# https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.html
# build the AutoGluon model
from autogluon.tabular import TabularPredictor
predictor = TabularPredictor(label="target", problem_type="binary", path="RESULT_AUTOGLUON", eval_metric=None).fit(train)
/Users/hinomaruc/miniforge3/envs/conda-autogluon/lib/python3.8/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html from .autonotebook import tqdm as notebook_tqdm Beginning AutoGluon training ... AutoGluon will save models to "RESULT_AUTOGLUON/" AutoGluon Version: 0.8.2 Python Version: 3.8.17 Operating System: Darwin Platform Machine: x86_64 Platform Version: Darwin Kernel Version 19.6.0: Tue Jun 21 21:18:39 PDT 2022; root:xnu-6153.141.66~1/RELEASE_X86_64 Disk Space Avail: 128.68 GB / 239.85 GB (53.7%) Train Data Rows: 242 Train Data Columns: 18 Label Column: target Preprocessing data ... Selected class <--> label mapping: class 1 = 1, class 0 = 0 Using Feature Generators to preprocess the data ... Fitting AutoMLPipelineFeatureGenerator... Available Memory: 11508.48 MB Train Data (Original) Memory Usage: 0.03 MB (0.0% of available memory) Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features. Stage 1 Generators: Fitting AsTypeFeatureGenerator... Note: Converting 12 features to boolean dtype as they only contain 2 unique values. Stage 2 Generators: Fitting FillNaFeatureGenerator... Stage 3 Generators: Fitting IdentityFeatureGenerator... Stage 4 Generators: Fitting DropUniqueFeatureGenerator... Stage 5 Generators: Fitting DropDuplicatesFeatureGenerator... Types of features in original data (raw dtype, special dtypes): ('float', []) : 18 | ['age', 'sex', 'trestbps', 'chol', 'fbs', ...] Types of features in processed data (raw dtype, special dtypes): ('float', []) : 6 | ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', ...] ('int', ['bool']) : 12 | ['sex', 'fbs', 'exang', 'cp_2.0', 'cp_3.0', ...] 0.2s = Fit runtime 18 features in original data used to generate 18 features in processed data. Train Data (Processed) Memory Usage: 0.01 MB (0.0% of available memory) Data preprocessing and feature engineering runtime = 0.27s ... AutoGluon will gauge predictive performance using evaluation metric: 'accuracy' To change this, specify the eval_metric parameter of Predictor() Automatically generating train/validation split with holdout_frac=0.2, Train Rows: 193, Val Rows: 49 User-specified model hyperparameters to be fit: { 'NN_TORCH': {}, 'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'], 'CAT': {}, 'XGB': {}, 'FASTAI': {}, 'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}], 'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}], 'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}], } Fitting 13 L1 models ... Fitting model: KNeighborsUnif ... 0.5714 = Validation score (accuracy) 4.82s = Training runtime 0.03s = Validation runtime Fitting model: KNeighborsDist ... 
0.551 = Validation score (accuracy) 0.02s = Training runtime 0.02s = Validation runtime Fitting model: LightGBMXT ... 0.8163 = Validation score (accuracy) 1.09s = Training runtime 0.0s = Validation runtime Fitting model: LightGBM ... 0.7755 = Validation score (accuracy) 0.18s = Training runtime 0.0s = Validation runtime Fitting model: RandomForestGini ... 0.7143 = Validation score (accuracy) 0.81s = Training runtime 0.06s = Validation runtime Fitting model: RandomForestEntr ... 0.7143 = Validation score (accuracy) 0.67s = Training runtime 0.06s = Validation runtime Fitting model: CatBoost ... 0.7755 = Validation score (accuracy) 0.64s = Training runtime 0.0s = Validation runtime Fitting model: ExtraTreesGini ... 0.7755 = Validation score (accuracy) 0.69s = Training runtime 0.06s = Validation runtime Fitting model: ExtraTreesEntr ... 0.7755 = Validation score (accuracy) 0.76s = Training runtime 0.06s = Validation runtime Fitting model: NeuralNetFastAI ... No improvement since epoch 1: early stopping 0.8367 = Validation score (accuracy) 2.48s = Training runtime 0.02s = Validation runtime Fitting model: XGBoost ... 0.7143 = Validation score (accuracy) 0.22s = Training runtime 0.0s = Validation runtime Fitting model: NeuralNetTorch ... 0.8163 = Validation score (accuracy) 1.01s = Training runtime 0.02s = Validation runtime Fitting model: LightGBMLarge ... 0.7143 = Validation score (accuracy) 0.33s = Training runtime 0.0s = Validation runtime Fitting model: WeightedEnsemble_L2 ... 0.8776 = Validation score (accuracy) 1.16s = Training runtime 0.0s = Validation runtime AutoGluon training complete, total runtime = 15.93s ... Best model: "WeightedEnsemble_L2" TabularPredictor saved. To load, use: predictor = TabularPredictor.load("RESULT_AUTOGLUON/")
It finished in 15.93 seconds. WeightedEnsemble_L2 was selected as the best model.
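The permutation-importance table below was presumably produced with TabularPredictor.feature_importance; a minimal sketch of that call (using the test set here is my assumption):
# permutation importance on the held-out test data
predictor.feature_importance(test)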
importance stddev p_value n p99_high p99_low
ca 0.047934 0.016425 0.001424 5 0.081754 0.014114
thal_7.0 0.043802 0.015070 0.001446 5 0.074831 0.012773
cp_4.0 0.038843 0.015626 0.002564 5 0.071017 0.006668
sex 0.025620 0.012188 0.004653 5 0.050716 0.000524
thalach 0.017355 0.023266 0.085317 5 0.065260 -0.030549
exang 0.015702 0.013517 0.030099 5 0.043534 -0.012129
oldpeak 0.008264 0.013390 0.119832 5 0.035835 -0.019306
cp_3.0 0.008264 0.008264 0.044505 5 0.025281 -0.008752
slope_2.0 0.006612 0.010776 0.120991 5 0.028799 -0.015575
fbs 0.004959 0.008958 0.141756 5 0.023404 -0.013487
thal_6.0 0.004959 0.003457 0.016339 5 0.012077 -0.002160
trestbps 0.004132 0.005844 0.094502 5 0.016165 -0.007900
chol 0.003306 0.008958 0.227830 5 0.021751 -0.015140
age 0.000826 0.001848 0.186950 5 0.004631 -0.002979
restecg_2.0 0.000000 0.005061 0.500000 5 0.010421 -0.010421
restecg_1.0 0.000000 0.000000 0.500000 5 0.000000 0.000000
cp_2.0 -0.000826 0.003457 0.689346 5 0.006292 -0.007945
slope_3.0 -0.001653 0.002263 0.911096 5 0.003007 -0.006313
ca, thal_7.0, and cp_4.0 appear to be the important variables.
predictor.leaderboard(test, silent=True)
model score_test score_val pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 NeuralNetTorch 0.918033 0.816327 0.013557 0.016752 1.010797 0.013557 0.016752 1.010797 1 True 12
1 NeuralNetFastAI 0.885246 0.836735 0.026323 0.016018 2.475871 0.026323 0.016018 2.475871 1 True 10
2 CatBoost 0.852459 0.775510 0.004657 0.002295 0.640289 0.004657 0.002295 0.640289 1 True 7
3 WeightedEnsemble_L2 0.852459 0.877551 0.033321 0.019821 4.722100 0.003053 0.001439 1.155352 2 True 14
4 RandomForestEntr 0.852459 0.714286 0.094608 0.063713 0.673625 0.094608 0.063713 0.673625 1 True 6
5 ExtraTreesEntr 0.836066 0.775510 0.077791 0.060216 0.757012 0.077791 0.060216 0.757012 1 True 9
6 ExtraTreesGini 0.836066 0.775510 0.077792 0.057055 0.687126 0.077792 0.057055 0.687126 1 True 8
7 RandomForestGini 0.836066 0.714286 0.120471 0.063285 0.809849 0.120471 0.063285 0.809849 1 True 5
8 LightGBMXT 0.819672 0.816327 0.003945 0.002364 1.090877 0.003945 0.002364 1.090877 1 True 3
9 XGBoost 0.819672 0.714286 0.010792 0.004440 0.219609 0.010792 0.004440 0.219609 1 True 11
10 LightGBMLarge 0.803279 0.714286 0.002192 0.003415 0.325471 0.002192 0.003415 0.325471 1 True 13
11 LightGBM 0.786885 0.775510 0.002399 0.002177 0.178292 0.002399 0.002177 0.178292 1 True 4
12 KNeighborsDist 0.622951 0.551020 0.015986 0.020918 0.023371 0.015986 0.020918 0.023371 1 True 2
13 KNeighborsUnif 0.622951 0.571429 0.016880 0.033736 4.824293 0.016880 0.033736 4.824293 1 True 1
This gives an at-a-glance summary of how each algorithm performed.
# check how well the model fits the training and test data
from sklearn.metrics import accuracy_score
y_train_pred = predictor.predict(train)
y_test_pred = predictor.predict(test)
print("train",accuracy_score(y_train, y_train_pred))
print("test",accuracy_score(y_test, y_test_pred))
train 0.859504132231405
test 0.8524590163934426
It doesn't look like it's overfitting, but mljar's fit on the test data was clearly better.
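For a broader comparison than accuracy alone, AutoGluon can also report several metrics at once; a minimal sketch (evaluate per the TabularPredictor docs):
# returns a dict of metrics (accuracy, balanced_accuracy, f1, roc_auc, ...) on the test set
predictor.evaluate(test)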
Modeling with AutoGluon (with tuning)
# https://auto.gluon.ai/stable/api/autogluon.tabular.TabularPredictor.html
# build the AutoGluon model
from autogluon.tabular import TabularPredictor
from autogluon.features.generators import AutoMLPipelineFeatureGenerator
auto_ml_pipeline_feature_generator = AutoMLPipelineFeatureGenerator()
predictor = TabularPredictor(label="target",
                             problem_type="binary",
                             path="RESULT_AUTOGLUON2",
                             eval_metric=None).fit(train,
                                                   auto_stack=True,
                                                   time_limit=3619,
                                                   feature_generator=auto_ml_pipeline_feature_generator)
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=5, num_bag_sets=20 Beginning AutoGluon training ... Time limit = 3619s AutoGluon will save models to "RESULT_AUTOGLUON2/" AutoGluon Version: 0.8.2 Python Version: 3.8.17 Operating System: Darwin Platform Machine: x86_64 Platform Version: Darwin Kernel Version 19.6.0: Tue Jun 21 21:18:39 PDT 2022; root:xnu-6153.141.66~1/RELEASE_X86_64 Disk Space Avail: 128.37 GB / 239.85 GB (53.5%) Train Data Rows: 242 Train Data Columns: 18 Label Column: target Preprocessing data ... Selected class <--> label mapping: class 1 = 1, class 0 = 0 Using Feature Generators to preprocess the data ... Fitting AutoMLPipelineFeatureGenerator... Available Memory: 11084.47 MB Train Data (Original) Memory Usage: 0.03 MB (0.0% of available memory) Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features. Stage 1 Generators: Fitting AsTypeFeatureGenerator... Note: Converting 12 features to boolean dtype as they only contain 2 unique values. Stage 2 Generators: Fitting FillNaFeatureGenerator... Stage 3 Generators: Fitting IdentityFeatureGenerator... Stage 4 Generators: Fitting DropUniqueFeatureGenerator... Stage 5 Generators: Fitting DropDuplicatesFeatureGenerator... Types of features in original data (raw dtype, special dtypes): ('float', []) : 18 | ['age', 'sex', 'trestbps', 'chol', 'fbs', ...] Types of features in processed data (raw dtype, special dtypes): ('float', []) : 6 | ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', ...] ('int', ['bool']) : 12 | ['sex', 'fbs', 'exang', 'cp_2.0', 'cp_3.0', ...] 0.2s = Fit runtime 18 features in original data used to generate 18 features in processed data. Train Data (Processed) Memory Usage: 0.01 MB (0.0% of available memory) Data preprocessing and feature engineering runtime = 0.25s ... AutoGluon will gauge predictive performance using evaluation metric: 'accuracy' To change this, specify the eval_metric parameter of Predictor() User-specified model hyperparameters to be fit: { 'NN_TORCH': {}, 'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'], 'CAT': {}, 'XGB': {}, 'FASTAI': {}, 'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}], 'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}], 'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}], } Fitting 13 L1 models ... Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 3618.75s of the 3618.74s of remaining time. 0.6488 = Validation score (accuracy) 0.02s = Training runtime 0.02s = Validation runtime Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 3618.68s of the 3618.67s of remaining time. 0.6364 = Validation score (accuracy) 0.01s = Training runtime 0.02s = Validation runtime Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3618.6s of the 3618.6s of remaining time. 
Will use sequential fold fitting strategy because import of ray failed. Reason: ray is required to train folds in parallel. A quick tip is to install via `pip install ray==2.2.0`, or use sequential fold fitting by passing `sequential_local` to `ag_args_ensemble` when calling tabular.fit. For example: `predictor.fit(..., ag_args_ensemble={'fold_fitting_strategy': 'sequential_local'})`
Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8512 = Validation score (accuracy) 1.3s = Training runtime 0.01s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3617.2s of the 3617.2s of remaining time. Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8512 = Validation score (accuracy) 1.05s = Training runtime 0.01s = Validation runtime Fitting model: RandomForestGini_BAG_L1 ... Training model for up to 3616.08s of the 3616.07s of remaining time. 0.7934 = Validation score (accuracy) 0.72s = Training runtime 0.12s = Validation runtime Fitting model: RandomForestEntr_BAG_L1 ... Training model for up to 3615.19s of the 3615.18s of remaining time. 0.7769 = Validation score (accuracy) 0.65s = Training runtime 0.12s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3614.37s of the 3614.36s of remaining time. Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8595 = Validation score (accuracy) 2.69s = Training runtime 0.01s = Validation runtime Fitting model: ExtraTreesGini_BAG_L1 ... Training model for up to 3611.62s of the 3611.61s of remaining time. 0.7851 = Validation score (accuracy) 0.69s = Training runtime 0.16s = Validation runtime Fitting model: ExtraTreesEntr_BAG_L1 ... Training model for up to 3610.71s of the 3610.7s of remaining time. 0.7975 = Validation score (accuracy) 0.72s = Training runtime 0.11s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3609.8s of the 3609.79s of remaining time. Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 2: early stopping No improvement since epoch 2: early stopping No improvement since epoch 1: early stopping No improvement since epoch 7: early stopping No improvement since epoch 8: early stopping 0.8388 = Validation score (accuracy) 9.62s = Training runtime 0.07s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3599.99s of the 3599.98s of remaining time. Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8017 = Validation score (accuracy) 1.07s = Training runtime 0.02s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3598.79s of the 3598.78s of remaining time. Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8595 = Validation score (accuracy) 5.61s = Training runtime 0.06s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3593.06s of the 3593.05s of remaining time. Fitting 5 child models (S1F1 - S1F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 1.5s = Training runtime 0.01s = Validation runtime Repeating k-fold bagging: 2/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3591.47s of the 3591.47s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 2.32s = Training runtime 0.02s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3590.39s of the 3590.39s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8099 = Validation score (accuracy) 2.04s = Training runtime 0.02s = Validation runtime Fitting model: CatBoost_BAG_L1 ... 
Training model for up to 3589.33s of the 3589.32s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 5.46s = Training runtime 0.03s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3586.49s of the 3586.48s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 1: early stopping No improvement since epoch 6: early stopping No improvement since epoch 1: early stopping No improvement since epoch 6: early stopping No improvement since epoch 1: early stopping 0.843 = Validation score (accuracy) 18.69s = Training runtime 0.15s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3577.22s of the 3577.21s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7893 = Validation score (accuracy) 2.01s = Training runtime 0.04s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3576.16s of the 3576.15s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 11.02s = Training runtime 0.11s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3570.62s of the 3570.61s of remaining time. Fitting 5 child models (S2F1 - S2F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8099 = Validation score (accuracy) 2.97s = Training runtime 0.02s = Validation runtime Repeating k-fold bagging: 3/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3569.08s of the 3569.07s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 3.28s = Training runtime 0.03s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3568.06s of the 3568.05s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 3.11s = Training runtime 0.03s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3566.92s of the 3566.91s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 8.58s = Training runtime 0.04s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3563.74s of the 3563.73s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 7: early stopping No improvement since epoch 2: early stopping No improvement since epoch 1: early stopping No improvement since epoch 2: early stopping 0.843 = Validation score (accuracy) 28.05s = Training runtime 0.22s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3554.19s of the 3554.18s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7893 = Validation score (accuracy) 2.92s = Training runtime 0.06s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3553.16s of the 3553.15s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 16.49s = Training runtime 0.17s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... 
Training model for up to 3547.57s of the 3547.57s of remaining time. Fitting 5 child models (S3F1 - S3F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 4.29s = Training runtime 0.03s = Validation runtime Repeating k-fold bagging: 4/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3546.19s of the 3546.18s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 4.22s = Training runtime 0.04s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3545.18s of the 3545.17s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 4.12s = Training runtime 0.04s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3544.09s of the 3544.08s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 11.52s = Training runtime 0.05s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3541.09s of the 3541.09s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 2: early stopping No improvement since epoch 0: early stopping No improvement since epoch 4: early stopping No improvement since epoch 1: early stopping 0.8512 = Validation score (accuracy) 36.75s = Training runtime 0.29s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3532.22s of the 3532.21s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8017 = Validation score (accuracy) 3.71s = Training runtime 0.08s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3531.32s of the 3531.31s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 22.01s = Training runtime 0.23s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3525.69s of the 3525.68s of remaining time. Fitting 5 child models (S4F1 - S4F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8058 = Validation score (accuracy) 5.68s = Training runtime 0.04s = Validation runtime Repeating k-fold bagging: 5/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3524.23s of the 3524.22s of remaining time. Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 5.18s = Training runtime 0.05s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3523.21s of the 3523.2s of remaining time. Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 5.03s = Training runtime 0.05s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3522.25s of the 3522.24s of remaining time. Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8512 = Validation score (accuracy) 14.19s = Training runtime 0.06s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3519.53s of the 3519.52s of remaining time. 
Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 3: early stopping No improvement since epoch 2: early stopping No improvement since epoch 3: early stopping No improvement since epoch 2: early stopping 0.8636 = Validation score (accuracy) 46.01s = Training runtime 0.36s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3510.09s of the 3510.08s of remaining time. Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7975 = Validation score (accuracy) 4.56s = Training runtime 0.11s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3509.12s of the 3509.11s of remaining time. Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 27.88s = Training runtime 0.29s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3503.12s of the 3503.11s of remaining time. Fitting 5 child models (S5F1 - S5F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 6.99s = Training runtime 0.05s = Validation runtime Repeating k-fold bagging: 6/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3501.74s of the 3501.74s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 6.12s = Training runtime 0.06s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3500.75s of the 3500.74s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 5.99s = Training runtime 0.06s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3499.72s of the 3499.71s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8471 = Validation score (accuracy) 16.64s = Training runtime 0.08s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3497.2s of the 3497.2s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 1: early stopping No improvement since epoch 2: early stopping No improvement since epoch 3: early stopping No improvement since epoch 2: early stopping 0.8554 = Validation score (accuracy) 54.87s = Training runtime 0.43s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3488.16s of the 3488.15s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8058 = Validation score (accuracy) 5.4s = Training runtime 0.13s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3487.22s of the 3487.21s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 33.74s = Training runtime 0.35s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3481.24s of the 3481.23s of remaining time. Fitting 5 child models (S6F1 - S6F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 8.35s = Training runtime 0.06s = Validation runtime Repeating k-fold bagging: 7/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3479.8s of the 3479.79s of remaining time. 
Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 7.05s = Training runtime 0.07s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3478.81s of the 3478.8s of remaining time. Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 6.93s = Training runtime 0.07s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3477.81s of the 3477.81s of remaining time. Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8471 = Validation score (accuracy) 19.04s = Training runtime 0.09s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3475.35s of the 3475.34s of remaining time. Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 5: early stopping No improvement since epoch 1: early stopping No improvement since epoch 3: early stopping No improvement since epoch 1: early stopping 0.8554 = Validation score (accuracy) 63.9s = Training runtime 0.5s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3466.14s of the 3466.14s of remaining time. Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7934 = Validation score (accuracy) 6.2s = Training runtime 0.15s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3465.22s of the 3465.22s of remaining time. Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 38.59s = Training runtime 0.41s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3460.24s of the 3460.23s of remaining time. Fitting 5 child models (S7F1 - S7F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 9.64s = Training runtime 0.07s = Validation runtime Repeating k-fold bagging: 8/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3458.89s of the 3458.88s of remaining time. Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 8.01s = Training runtime 0.08s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3457.86s of the 3457.85s of remaining time. Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 7.91s = Training runtime 0.08s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3456.8s of the 3456.79s of remaining time. Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8554 = Validation score (accuracy) 21.43s = Training runtime 0.1s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3454.34s of the 3454.33s of remaining time. Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 3: early stopping No improvement since epoch 3: early stopping No improvement since epoch 1: early stopping No improvement since epoch 5: early stopping No improvement since epoch 2: early stopping 0.8512 = Validation score (accuracy) 73.54s = Training runtime 0.57s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3444.51s of the 3444.5s of remaining time. 
Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy 0.781 = Validation score (accuracy) 7.05s = Training runtime 0.17s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3443.55s of the 3443.55s of remaining time. Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 44.64s = Training runtime 0.47s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3437.38s of the 3437.37s of remaining time. Fitting 5 child models (S8F1 - S8F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 11.08s = Training runtime 0.08s = Validation runtime Repeating k-fold bagging: 9/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3435.84s of the 3435.83s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 9.01s = Training runtime 0.09s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3434.77s of the 3434.76s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 8.92s = Training runtime 0.09s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3433.69s of the 3433.68s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8471 = Validation score (accuracy) 24.02s = Training runtime 0.11s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3431.04s of the 3431.03s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 9: early stopping No improvement since epoch 5: early stopping No improvement since epoch 2: early stopping No improvement since epoch 0: early stopping 0.843 = Validation score (accuracy) 83.89s = Training runtime 0.65s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3420.48s of the 3420.47s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8017 = Validation score (accuracy) 7.98s = Training runtime 0.19s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3419.44s of the 3419.43s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8471 = Validation score (accuracy) 50.14s = Training runtime 0.52s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3413.81s of the 3413.8s of remaining time. Fitting 5 child models (S9F1 - S9F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 12.82s = Training runtime 0.09s = Validation runtime Repeating k-fold bagging: 10/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3411.98s of the 3411.97s of remaining time. Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 10.15s = Training runtime 0.1s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3410.74s of the 3410.73s of remaining time. 
Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 9.88s = Training runtime 0.09s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3409.7s of the 3409.69s of remaining time. Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8512 = Validation score (accuracy) 26.56s = Training runtime 0.13s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3407.08s of the 3407.08s of remaining time. Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 5: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 4: early stopping 0.8388 = Validation score (accuracy) 92.97s = Training runtime 0.72s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3397.82s of the 3397.81s of remaining time. Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8017 = Validation score (accuracy) 9.09s = Training runtime 0.22s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3396.57s of the 3396.57s of remaining time. Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8471 = Validation score (accuracy) 55.21s = Training runtime 0.58s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3391.38s of the 3391.37s of remaining time. Fitting 5 child models (S10F1 - S10F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 14.42s = Training runtime 0.1s = Validation runtime Repeating k-fold bagging: 11/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3389.6s of the 3389.6s of remaining time. Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 11.37s = Training runtime 0.11s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3388.31s of the 3388.3s of remaining time. Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 10.83s = Training runtime 0.1s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3387.29s of the 3387.29s of remaining time. Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 29.02s = Training runtime 0.14s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3384.76s of the 3384.75s of remaining time. Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 6: early stopping No improvement since epoch 6: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping 0.843 = Validation score (accuracy) 103.09s = Training runtime 0.91s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3374.31s of the 3374.31s of remaining time. Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7934 = Validation score (accuracy) 9.89s = Training runtime 0.24s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3373.4s of the 3373.39s of remaining time. 
Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8471 = Validation score (accuracy) 60.74s = Training runtime 0.63s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3367.74s of the 3367.74s of remaining time. Fitting 5 child models (S11F1 - S11F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 15.91s = Training runtime 0.11s = Validation runtime Repeating k-fold bagging: 12/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3366.16s of the 3366.15s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 12.37s = Training runtime 0.12s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3365.08s of the 3365.07s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 11.76s = Training runtime 0.11s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3364.07s of the 3364.06s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 31.42s = Training runtime 0.15s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3361.6s of the 3361.6s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 1: early stopping No improvement since epoch 3: early stopping No improvement since epoch 1: early stopping No improvement since epoch 2: early stopping No improvement since epoch 1: early stopping 0.8471 = Validation score (accuracy) 111.39s = Training runtime 0.98s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3353.11s of the 3353.1s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7934 = Validation score (accuracy) 10.72s = Training runtime 0.26s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3352.15s of the 3352.14s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 67.07s = Training runtime 0.69s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3345.69s of the 3345.68s of remaining time. Fitting 5 child models (S12F1 - S12F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8099 = Validation score (accuracy) 17.35s = Training runtime 0.12s = Validation runtime Repeating k-fold bagging: 13/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3344.15s of the 3344.14s of remaining time. Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 13.31s = Training runtime 0.13s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3343.14s of the 3343.13s of remaining time. Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 12.75s = Training runtime 0.12s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3342.06s of the 3342.05s of remaining time. 
Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 33.92s = Training runtime 0.16s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3339.49s of the 3339.48s of remaining time. Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 2: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 3: early stopping No improvement since epoch 4: early stopping 0.8471 = Validation score (accuracy) 119.89s = Training runtime 1.05s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3330.81s of the 3330.8s of remaining time. Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7851 = Validation score (accuracy) 11.63s = Training runtime 0.28s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3329.78s of the 3329.77s of remaining time. Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 72.24s = Training runtime 0.74s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3324.47s of the 3324.46s of remaining time. Fitting 5 child models (S13F1 - S13F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 18.65s = Training runtime 0.13s = Validation runtime Repeating k-fold bagging: 14/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3323.09s of the 3323.08s of remaining time. Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 14.25s = Training runtime 0.14s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3322.09s of the 3322.08s of remaining time. Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 13.69s = Training runtime 0.13s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3321.05s of the 3321.04s of remaining time. Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 36.31s = Training runtime 0.18s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3318.59s of the 3318.58s of remaining time. Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 0: early stopping No improvement since epoch 2: early stopping No improvement since epoch 2: early stopping 0.8471 = Validation score (accuracy) 128.43s = Training runtime 1.12s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3309.85s of the 3309.85s of remaining time. Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7893 = Validation score (accuracy) 12.41s = Training runtime 0.3s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3308.95s of the 3308.95s of remaining time. 
Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 77.57s = Training runtime 0.8s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3303.49s of the 3303.49s of remaining time. Fitting 5 child models (S14F1 - S14F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8099 = Validation score (accuracy) 20.16s = Training runtime 0.14s = Validation runtime Repeating k-fold bagging: 15/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3301.88s of the 3301.87s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 15.17s = Training runtime 0.15s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3300.89s of the 3300.88s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 14.65s = Training runtime 0.14s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3299.86s of the 3299.85s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 38.88s = Training runtime 0.19s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3297.22s of the 3297.21s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 2: early stopping No improvement since epoch 4: early stopping No improvement since epoch 5: early stopping No improvement since epoch 5: early stopping No improvement since epoch 2: early stopping 0.8471 = Validation score (accuracy) 137.46s = Training runtime 1.19s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3288.0s of the 3287.99s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy 0.7934 = Validation score (accuracy) 13.32s = Training runtime 0.32s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3286.97s of the 3286.96s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 83.78s = Training runtime 0.86s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3280.63s of the 3280.62s of remaining time. Fitting 5 child models (S15F1 - S15F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 21.51s = Training runtime 0.15s = Validation runtime Repeating k-fold bagging: 16/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3279.2s of the 3279.19s of remaining time. Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 16.1s = Training runtime 0.16s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3278.2s of the 3278.19s of remaining time. Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 15.68s = Training runtime 0.15s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3277.08s of the 3277.07s of remaining time. 
Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 41.33s = Training runtime 0.2s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3274.56s of the 3274.55s of remaining time. Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 2: early stopping No improvement since epoch 5: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping 0.8512 = Validation score (accuracy) 145.89s = Training runtime 1.26s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3265.93s of the 3265.92s of remaining time. Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8017 = Validation score (accuracy) 14.16s = Training runtime 0.34s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3264.97s of the 3264.96s of remaining time. Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 89.41s = Training runtime 0.91s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3259.21s of the 3259.2s of remaining time. Fitting 5 child models (S16F1 - S16F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 22.82s = Training runtime 0.16s = Validation runtime Repeating k-fold bagging: 17/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3257.82s of the 3257.81s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8347 = Validation score (accuracy) 17.13s = Training runtime 0.17s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3256.7s of the 3256.69s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 16.64s = Training runtime 0.16s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3255.64s of the 3255.64s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 43.9s = Training runtime 0.21s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3253.01s of the 3253.0s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 2: early stopping No improvement since epoch 5: early stopping No improvement since epoch 8: early stopping No improvement since epoch 2: early stopping 0.8554 = Validation score (accuracy) 155.31s = Training runtime 1.33s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3243.39s of the 3243.38s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8017 = Validation score (accuracy) 15.03s = Training runtime 0.36s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3242.4s of the 3242.39s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 94.55s = Training runtime 0.96s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... 
Training model for up to 3237.13s of the 3237.12s of remaining time. Fitting 5 child models (S17F1 - S17F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 24.11s = Training runtime 0.17s = Validation runtime Repeating k-fold bagging: 18/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3235.76s of the 3235.75s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 18.07s = Training runtime 0.18s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3234.74s of the 3234.73s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8182 = Validation score (accuracy) 17.57s = Training runtime 0.17s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3233.73s of the 3233.72s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 46.34s = Training runtime 0.23s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3231.21s of the 3231.21s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 7: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 5: early stopping No improvement since epoch 2: early stopping 0.8512 = Validation score (accuracy) 164.31s = Training runtime 1.4s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3222.01s of the 3222.0s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8058 = Validation score (accuracy) 15.82s = Training runtime 0.38s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3221.09s of the 3221.09s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 100.48s = Training runtime 1.02s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3215.04s of the 3215.03s of remaining time. Fitting 5 child models (S18F1 - S18F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 25.38s = Training runtime 0.18s = Validation runtime Repeating k-fold bagging: 19/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3213.69s of the 3213.68s of remaining time. Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 19.01s = Training runtime 0.18s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3212.68s of the 3212.68s of remaining time. Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 18.55s = Training runtime 0.18s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3211.63s of the 3211.62s of remaining time. Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8306 = Validation score (accuracy) 48.74s = Training runtime 0.24s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3209.16s of the 3209.15s of remaining time. 
Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 3: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping 0.8595 = Validation score (accuracy) 173.84s = Training runtime 1.46s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3199.44s of the 3199.44s of remaining time. Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8058 = Validation score (accuracy) 16.66s = Training runtime 0.4s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3198.48s of the 3198.47s of remaining time. Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy 0.843 = Validation score (accuracy) 106.37s = Training runtime 1.07s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3192.46s of the 3192.45s of remaining time. Fitting 5 child models (S19F1 - S19F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 26.83s = Training runtime 0.19s = Validation runtime Repeating k-fold bagging: 20/20 Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 3190.89s of the 3190.88s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 19.94s = Training runtime 0.19s = Validation runtime Fitting model: LightGBM_BAG_L1 ... Training model for up to 3189.88s of the 3189.88s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8223 = Validation score (accuracy) 19.88s = Training runtime 0.19s = Validation runtime Fitting model: CatBoost_BAG_L1 ... Training model for up to 3188.45s of the 3188.45s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8264 = Validation score (accuracy) 51.37s = Training runtime 0.25s = Validation runtime Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 3185.74s of the 3185.73s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 1: early stopping No improvement since epoch 3: early stopping No improvement since epoch 7: early stopping 0.8554 = Validation score (accuracy) 182.63s = Training runtime 1.54s = Validation runtime Fitting model: XGBoost_BAG_L1 ... Training model for up to 3176.75s of the 3176.74s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8058 = Validation score (accuracy) 17.5s = Training runtime 0.42s = Validation runtime Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 3175.78s of the 3175.77s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy 0.8388 = Validation score (accuracy) 111.38s = Training runtime 1.13s = Validation runtime Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 3170.64s of the 3170.63s of remaining time. Fitting 5 child models (S20F1 - S20F5) | Fitting with SequentialLocalFoldFittingStrategy 0.814 = Validation score (accuracy) 28.13s = Training runtime 0.2s = Validation runtime Completed 20/20 k-fold bagging repeats ... Fitting model: WeightedEnsemble_L2 ... 
Training model for up to 361.87s of the 3169.23s of remaining time. 0.8678 = Validation score (accuracy) 1.01s = Training runtime 0.0s = Validation runtime AutoGluon training complete, total runtime = 450.85s ... Best model: "WeightedEnsemble_L2" TabularPredictor saved. To load, use: predictor = TabularPredictor.load("RESULT_AUTOGLUON2/")
The best model is unchanged from the default settings, but the total runtime grew to about 450 seconds.
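For reference, AutoGluon's TabularPredictor can compute permutation feature importance on held-out data. The table below was presumably produced with a call along these lines (the call itself is not shown here, so treat it as an assumption):
# Hedged sketch: permutation feature importance on the test data.
# feature_importance() is part of TabularPredictor's public API and returns a
# DataFrame with the importance / stddev / p_value / n / p99 columns shown below.
importance_df = predictor.feature_importance(test)
print(importance_df)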
feature importance stddev p_value n p99_high p99_low
ca 0.088430 0.013262 0.000059 5 0.115736 0.061123
thal_7.0 0.066942 0.013517 0.000189 5 0.094774 0.039111
cp_4.0 0.059504 0.011165 0.000142 5 0.082492 0.036516
trestbps 0.044628 0.007949 0.000116 5 0.060994 0.028262
thalach 0.039669 0.013891 0.001543 5 0.068271 0.011068
chol 0.034711 0.004711 0.000040 5 0.044412 0.025010
age 0.030579 0.007507 0.000403 5 0.046035 0.015122
oldpeak 0.021488 0.012534 0.009281 5 0.047295 -0.004319
restecg_2.0 0.017355 0.003457 0.000179 5 0.024474 0.010237
exang 0.016529 0.002922 0.000112 5 0.022545 0.010513
sex 0.013223 0.011466 0.030708 5 0.036833 -0.010386
slope_2.0 0.013223 0.001848 0.000045 5 0.017028 0.009418
fbs 0.009091 0.003457 0.002091 5 0.016209 0.001972
thal_6.0 0.008264 0.004132 0.005528 5 0.016773 -0.000244
cp_2.0 0.006612 0.006267 0.038871 5 0.019515 -0.006292
slope_3.0 0.005785 0.002263 0.002318 5 0.010445 0.001125
cp_3.0 0.003306 0.003457 0.049650 5 0.010424 -0.003813
restecg_1.0 0.000826 0.004527 0.352000 5 0.010147 -0.008494
The top three features are unchanged.
predictor.leaderboard(test, silent=True)
model score_test score_val pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 NeuralNetTorch_BAG_L1 0.901639 0.838843 1.313896 1.131090 111.379313 1.313896 1.131090 111.379313 1 True 12
1 NeuralNetFastAI_BAG_L1 0.901639 0.855372 1.844487 1.535011 182.631848 1.844487 1.535011 182.631848 1 True 10
2 RandomForestEntr_BAG_L1 0.885246 0.776860 0.091751 0.120122 0.647892 0.091751 0.120122 0.647892 1 True 6
3 CatBoost_BAG_L1 0.885246 0.826446 0.213271 0.251431 51.373805 0.213271 0.251431 51.373805 1 True 7
4 WeightedEnsemble_L2 0.885246 0.867769 2.078809 1.805338 235.025834 0.003749 0.001581 1.007252 2 True 14
5 RandomForestGini_BAG_L1 0.868852 0.793388 0.088803 0.116070 0.715415 0.088803 0.116070 0.715415 1 True 5
6 ExtraTreesEntr_BAG_L1 0.868852 0.797521 0.097888 0.113942 0.724379 0.097888 0.113942 0.724379 1 True 9
7 LightGBM_BAG_L1 0.868852 0.822314 0.258172 0.193620 19.879937 0.258172 0.193620 19.879937 1 True 4
8 LightGBMXT_BAG_L1 0.868852 0.838843 0.334872 0.192702 19.937207 0.334872 0.192702 19.937207 1 True 3
9 XGBoost_BAG_L1 0.868852 0.805785 1.129513 0.416742 17.504010 1.129513 0.416742 17.504010 1 True 11
10 LightGBMLarge_BAG_L1 0.852459 0.814050 0.316309 0.197132 28.125010 0.316309 0.197132 28.125010 1 True 13
11 ExtraTreesGini_BAG_L1 0.836066 0.785124 0.079650 0.155725 0.693128 0.079650 0.155725 0.693128 1 True 8
12 KNeighborsDist_BAG_L1 0.672131 0.636364 0.017302 0.017315 0.012930 0.017302 0.017315 0.012930 1 True 2
13 KNeighborsUnif_BAG_L1 0.655738 0.648760 0.027467 0.019590 0.016064 0.027467 0.019590 0.016064 1 True 1
Most of the entries are algorithms with the _BAG_L1 suffix, i.e. first-layer bagged models.
# Check how well the model fits the training and test data
from sklearn.metrics import accuracy_score
y_train_pred = predictor.predict(train)
y_test_pred = predictor.predict(test)
print("train",accuracy_score(y_train, y_train_pred))
print("test",accuracy_score(y_test, y_test_pred))
train 0.9462809917355371
test 0.8852459016393442
The test accuracy improved, but given the gap from the training accuracy the model may be overfitting a bit.
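As an aside, AutoGluon can also compute test-set metrics directly instead of going through scikit-learn; a minimal sketch, assuming the same test DataFrame as above:
# evaluate() is part of TabularPredictor's public API; it scores the predictor
# on a labeled DataFrame and returns a dict of metric name -> value.
print(predictor.evaluate(test))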
Modeling with auto-sklearn
Last up is auto-sklearn.
auto-sklearn depends on a somewhat older version of scikit-learn.
Because of that, the modeling-data preparation code has to be changed as follows (a combined sketch of the two changes follows the list).
- handle_unknown='ignore' → handle_unknown='error'
OneHotEnc = OneHotEncoder(categories='auto',drop='first',handle_unknown='error')
- get_feature_names_out() → get_feature_names()
dummy_cols = OneHotEnc.get_feature_names()
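For completeness, here is a minimal sketch of the adjusted encoding step with both changes applied; df and cat_cols are assumed names standing in for the data frame and categorical columns from the data-preparation article, not the original code:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Older scikit-learn (the version auto-sklearn pins): no get_feature_names_out(),
# and fit_transform returns a sparse matrix by default.
OneHotEnc = OneHotEncoder(categories='auto', drop='first', handle_unknown='error')
encoded = OneHotEnc.fit_transform(df[cat_cols]).toarray()
dummy_cols = OneHotEnc.get_feature_names()
dummy_df = pd.DataFrame(encoded, columns=dummy_cols, index=df.index)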
import autosklearn.classification

automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=3619,  # total time budget for the whole search, in seconds
    per_run_time_limit=600,        # time limit per candidate model, in seconds
    tmp_folder="autosklearn_classification_cleveland",  # logs and intermediate results go here
)
automl.fit(X_train, y_train, dataset_name="cleveland")
AutoSklearnClassifier(ensemble_class=<class 'autosklearn.ensembles.ensemble_selection.EnsembleSelection'>, per_run_time_limit=600, time_left_for_this_task=3619, tmp_folder='autosklearn_classification_cleveland')
The training progress is written out as logs in the folder specified by tmp_folder.
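Besides the log folder, the fitted object itself can be queried; a minimal sketch using auto-sklearn's built-in reporting methods:
# sprint_statistics() summarizes the search (number of runs, best validation score, timeouts),
# and leaderboard() ranks the models that made it into the final ensemble.
print(automl.sprint_statistics())
print(automl.leaderboard())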
# Check how well the model fits the training and test data
from sklearn.metrics import accuracy_score
y_train_pred = automl.predict(X_train)
y_test_pred = automl.predict(X_test)
print("train",accuracy_score(y_train, y_train_pred))
print("test",accuracy_score(y_test, y_test_pred))
train 0.8801652892561983
test 0.8688524590163934
The accuracy feels a bit better than what AutoGluon achieved with its default settings.
Summary
This time we tried out three AutoML libraries.
mljar's algorithms seemed to suit this dataset best and gave the highest accuracy (although it does take time).
Personally, if interpretability of the results is the priority, I think modeling with a decision tree or logistic regression is the better choice, but when accuracy is what matters, AutoML may well be the way to go.