Table 3. Predictive performance of seven machine learning methods on four FT-NIR spectral data sets, obtained from 7-fold cross-validation.

| FT-NIR spectral data | Machine learning model | AUC | Misclassification | R² |
|---|---|---|---|---|
| Raw data | XGBoost | 0.964 | 0.089 | 0.661 |
| | Bootstrap Forest | 0.940 | 0.123 | 0.474 |
| | Boosted Tree | 0.937 | 0.133 | 0.533 |
| | Neural Networks | 0.904 | 0.191 | 0.422 |
| | Decision Tree | 0.821 | 0.227 | 0.250 |
| | Support Vector Machine | 0.735 | 0.338 | 0.108 |
| | PLS-DA | 0.766 | 0.131 | 0.406 |
| | Mean (7 models) | 0.867 | 0.176 | 0.408 |
| | Mean (top 4 models) | 0.936 | 0.134 | 0.523 |
| SG (2nd polynomial) transformed data | XGBoost | 0.965 | 0.081 | 0.676 |
| | Bootstrap Forest | 0.950 | 0.115 | 0.475 |
| | Boosted Tree | 0.940 | 0.123 | 0.581 |
| | Neural Networks | 0.914 | 0.167 | 0.435 |
| | Decision Tree | 0.850 | 0.193 | 0.328 |
| | Support Vector Machine | 0.734 | 0.338 | 0.106 |
| | PLS-DA | 0.749 | 0.249 | 0.371 |
| | Mean (7 models) | 0.872 | 0.181 | 0.424 |
| | Mean (top 4 models) | 0.942 | 0.122 | 0.542 |
| SG (2nd polynomial) + 1st derivative transformed data | XGBoost | 0.956 | 0.086 | 0.610 |
| | Bootstrap Forest | 0.946 | 0.123 | 0.488 |
| | Boosted Tree | 0.936 | 0.124 | 0.534 |
| | Neural Networks | 0.891 | 0.190 | 0.395 |
| | Decision Tree | 0.843 | 0.247 | 0.271 |
| | Support Vector Machine | 0.795 | 0.253 | 0.227 |
| | PLS-DA | 0.762 | 0.236 | 0.369 |
| | Mean (7 models) | 0.876 | 0.180 | 0.413 |
| | Mean (top 4 models) | 0.932 | 0.131 | 0.507 |
| SG (2nd polynomial) + 2nd derivative transformed data | XGBoost | 0.900 | 0.173 | 0.350 |
| | Bootstrap Forest | 0.890 | 0.185 | 0.353 |
| | Boosted Tree | 0.880 | 0.184 | 0.376 |
| | Neural Networks | 0.844 | 0.241 | 0.277 |
| | Decision Tree | 0.821 | 0.259 | 0.122 |
| | Support Vector Machine | 0.733 | 0.287 | 0.174 |
| | PLS-DA | 0.770 | 0.227 | 0.352 |
| | Mean (7 models) | 0.834 | 0.222 | 0.286 |
| | Mean (top 4 models) | 0.879 | 0.196 | 0.339 |
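To make the evaluation protocol behind Table 3 concrete, the following is a minimal sketch of 7-fold cross-validation that reports a mean AUC and misclassification rate. The toy Gaussian data and the nearest-centroid classifier are illustrative assumptions standing in for the FT-NIR spectra and the study's models, not the authors' actual pipeline.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sum(folds[:i] + folds[i + 1:], []), folds[i]) for i in range(k)]

def centroid_score(X_train, y_train, x):
    """Score a sample by squared distance to class centroids (positive = closer to class 1)."""
    def centroid(label):
        pts = [X_train[i] for i in range(len(X_train)) if y_train[i] == label]
        return [sum(col) / len(pts) for col in zip(*pts)]
    c0, c1 = centroid(0), centroid(1)
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return d0 - d1

def auc(scores, labels):
    """Rank-based AUC: probability a positive sample outranks a negative one."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy two-class data standing in for spectra (hypothetical, 5 features per sample).
rng = random.Random(1)
X = [[rng.gauss(mu, 1.0) for _ in range(5)] for mu in [0.0] * 60 + [1.0] * 60]
y = [0] * 60 + [1] * 60

aucs, miscls = [], []
for train, test in kfold_indices(len(X), 7):
    Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
    scores = [centroid_score(Xtr, ytr, X[i]) for i in test]
    labels = [y[i] for i in test]
    preds = [int(s > 0) for s in scores]
    aucs.append(auc(scores, labels))
    miscls.append(sum(p != l for p, l in zip(preds, labels)) / len(labels))

print(f"AUC: {sum(aucs)/7:.3f}  Misclassification: {sum(miscls)/7:.3f}")
```

Averaging the per-fold AUC and misclassification rate over the 7 held-out folds mirrors how the per-model figures in Table 3 would typically be aggregated.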