Comparative Evaluation of Machine Learning Algorithms Using SMOTENC in Classifying Toddler Nutritional Status Based on Anthropometric Data

Erip Suratno; Adhi Kusnadi; Zeldi Suryady

doi:10.31544/jtera.v11.i1.2026.101-110

Abstract

Class imbalance in under-five nutritional status data can make a classifier appear accurate overall while it still fails to recognize minority classes. This study compared Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naive Bayes, Decision Tree, and Random Forest for multiclass classification of under-five nutritional status from anthropometric data. The dataset contained 3,716 records collected in 2025 from Posyandu Desa Ujunggenteng, with sex, age, weight, and height as features and four target classes: undernutrition, good nutritional status, risk of overnutrition, and overnutrition. Labels were used as recorded operationally and were not recalculated against WHO/Indonesian Ministry of Health z-score standards. The experiment used an 80:20 stratified train-test split, 5-fold Stratified K-Fold Cross-Validation, a limited parameter search over predefined candidates, preprocessing of numerical and categorical features, and SMOTENC applied only to the training data or training folds. In cross-validation, SVM achieved the highest Macro F1-score (0.829578). On the test set, Random Forest obtained the highest Macro F1-score (0.827132), while KNN achieved the highest accuracy (0.956989) and balanced accuracy (0.883393). Model selection should therefore consider Macro F1-score, balanced accuracy, the confusion matrix, and class-wise performance rather than accuracy alone. Based on cross-validation, SVM was selected as the final model, while Random Forest was reported as the best test-set model by Macro F1-score.
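
The gap between accuracy and the class-sensitive metrics named in the abstract can be illustrated with a small worked example. The sketch below is not the authors' code and does not use the study data: the label counts, the predictions, and the helper names `confusion_matrix` and `summarize` are all hypothetical, chosen only to show how accuracy, balanced accuracy (the mean of per-class recall), and Macro F1-score (the unweighted mean of per-class F1) are computed from a four-class confusion matrix.

```python
# Illustrative only: toy labels, NOT the Posyandu dataset from the study.
CLASSES = ["undernutrition", "good", "risk_of_overnutrition", "overnutrition"]


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def summarize(matrix):
    """Compute accuracy, balanced accuracy, and macro F1 from a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    recalls, f1_scores = [], []
    for i in range(n):
        tp = matrix[i][i]
        actual = sum(matrix[i])                         # samples truly in class i
        predicted = sum(matrix[r][i] for r in range(n))  # samples predicted as class i
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)
    return {
        "accuracy": correct / total,
        "balanced_accuracy": sum(recalls) / n,  # mean per-class recall
        "macro_f1": sum(f1_scores) / n,         # unweighted mean of per-class F1
        "recall_per_class": recalls,
    }


# Hypothetical imbalanced test set: 85 "good" records and 15 minority records.
y_true = (["good"] * 85 + ["undernutrition"] * 5
          + ["risk_of_overnutrition"] * 6 + ["overnutrition"] * 4)
# Hypothetical predictions: the majority class is almost always right,
# while about half of each minority class is misclassified.
y_pred = (["good"] * 84 + ["risk_of_overnutrition"] * 1
          + ["undernutrition"] * 2 + ["good"] * 3
          + ["risk_of_overnutrition"] * 3 + ["good"] * 3
          + ["overnutrition"] * 2 + ["risk_of_overnutrition"] * 2)

matrix = confusion_matrix(y_true, y_pred, CLASSES)
metrics = summarize(matrix)
for name in ("accuracy", "balanced_accuracy", "macro_f1"):
    print(f"{name}: {metrics[name]:.4f}")
```

In this toy case the accuracy is dominated by the majority class, while balanced accuracy and Macro F1-score are pulled down by the weak minority-class recall, which is the reason the abstract recommends reading them together with the confusion matrix.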

E-ISSN	2548-737X
P-ISSN	2548-8678
Frequency	2× per year
Language	Indonesian / English
Publisher	Politeknik Sukabumi

JTERA
