Comparative Evaluation of Machine Learning Algorithms Using SMOTENC in Classifying Toddler Nutritional Status Based on Anthropometric Data

Abstract

Class imbalance in under-five nutritional status data can make classification models appear accurate while limiting minority-class recognition. This study evaluated Support Vector Machine, K-Nearest Neighbors, Naive Bayes, Decision Tree, and Random Forest for multiclass classification of under-five nutritional status using anthropometric data. The dataset contained 3,716 Posyandu Desa Ujunggenteng records from 2025, with sex, age, weight, and height as features and four target classes: undernutrition, good nutritional status, risk of overnutrition, and overnutrition. Labels were treated as operational records, not as recalculated WHO/Indonesian Ministry of Health z-score standards. The experiment used an 80:20 stratified train-test split, 5-fold Stratified K-Fold Cross-Validation, limited candidate-based parameter search, numerical-categorical preprocessing, and SMOTENC only on training data or training folds. Cross-validation showed that SVM achieved the highest CV Macro F1-score of 0.829578. On the test set, Random Forest obtained the highest Macro F1-score of 0.827132, while KNN achieved the highest accuracy of 0.956989 and balanced accuracy of 0.883393. Model selection should consider Macro F1-score, balanced accuracy, confusion matrix, and class-wise performance. Based on cross-validation, SVM was selected as the final model, while Random Forest was reported as the best test-set model based on Macro F1-score.
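
The gap between accuracy and the class-sensitive metrics named above can be illustrated with a small worked example. The sketch below is not the study's code or data: the 100 labels, the class names, and the helper functions are hypothetical, chosen only to show how a model that mostly predicts the majority class can score high accuracy while balanced accuracy (mean per-class recall) and Macro F1-score (unweighted mean of per-class F1) stay low.

```python
# Hypothetical toy data: 100 records over four classes, heavily imbalanced.
# These labels are invented for illustration and are not the study's data.
LABELS = ["under", "good", "risk_over", "over"]

y_true = ["good"] * 90 + ["under"] * 4 + ["risk_over"] * 4 + ["over"] * 2
y_pred = (
    ["good"] * 90                      # all majority-class records correct
    + ["under"] * 2 + ["good"] * 2     # half of "under" missed
    + ["risk_over"] * 2 + ["good"] * 2 # half of "risk_over" missed
    + ["good"] * 2                     # every "over" record missed
)


def per_class_counts(y_true, y_pred, labels):
    """Return {label: (tp, fp, fn)} computed one class at a time."""
    counts = {}
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred, labels):
    """Unweighted mean of per-class recall, skipping classes with no true records."""
    counts = per_class_counts(y_true, y_pred, labels)
    recalls = [tp / (tp + fn) for tp, _, fn in counts.values() if tp + fn > 0]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no tp, fp, or fn scores 0."""
    counts = per_class_counts(y_true, y_pred, labels)
    f1_scores = []
    for tp, fp, fn in counts.values():
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred, LABELS)
m_f1 = macro_f1(y_true, y_pred, LABELS)

print(f"accuracy          = {acc:.4f}")
print(f"balanced accuracy = {bal_acc:.4f}")
print(f"macro F1          = {m_f1:.4f}")
```

In this toy case the classifier is right on 94 of 100 records, yet it recovers only half of two minority classes and none of the third, so both class-averaged metrics fall far below the accuracy figure. That pattern is the reason for reading Macro F1-score, balanced accuracy, and class-wise results together rather than accuracy alone.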

How to Cite

[1] “Comparative Evaluation of Machine Learning Algorithms Using SMOTENC in Classifying Toddler Nutritional Status Based on Anthropometric Data”, JTERA, vol. 11, no. 1, pp. 101–110, Jun. 2026, doi: 10.31544/jtera.v11.i1.2026.101-110.