Implementation of the LightGBM–CatBoost Ensemble Method for Obesity Risk Classification in Productive Age


Authors

  • Kecitaan Harefa, Universitas Pamulang, Tangerang Selatan, Indonesia
  • Joko Priambodo, Universitas Pamulang, Tangerang Selatan, Indonesia

DOI:

https://doi.org/10.47065/bulletincsr.v6i1.930

Keywords:

Obesity; Ensemble Learning; LightGBM; CatBoost; Risk Classification

Abstract

Obesity is an increasingly common health problem among individuals of productive age and can reduce quality of life and work productivity. A main challenge in obesity risk assessment is that conventional methods struggle to identify risk accurately in complex, multidimensional data that combine numerical and categorical variables. An artificial intelligence–based approach is therefore needed to provide more accurate and stable obesity risk classification. This study implements and evaluates a LightGBM–CatBoost ensemble method for obesity risk classification, focusing on the productive-age population. The dataset was obtained from the Kaggle platform and consists of 2,111 individual records covering physical attributes, eating habits, physical activity, and lifestyle factors. Although the dataset is synthetic and balanced, its attributes and age-related variables are representative of individuals in the productive age range, making it suitable for modeling obesity risk in this demographic. The research stages comprise data preprocessing, separate training of the LightGBM and CatBoost models, model integration by probability averaging, and performance evaluation using accuracy, precision, recall, and F1-score. Both LightGBM and CatBoost achieved accuracy above 95%, while the ensemble model performed best, with an accuracy of 96.69% and more balanced evaluation metrics across all obesity risk classes. These findings indicate that the ensemble approach improves classification stability and accuracy compared with the single models. The LightGBM–CatBoost ensemble method is therefore effective for obesity risk classification and could be developed further as a decision support system in the health sector.
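
The probability averaging step named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The probability tables, labels, and function names below are hypothetical; in practice the two tables would presumably come from the class-probability outputs of the trained LightGBM and CatBoost models.

```python
def average_probabilities(probs_a, probs_b):
    """Element-wise mean of two per-class probability tables (one row per sample)."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same samples")
    return [
        [(pa + pb) / 2.0 for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(probs_a, probs_b)
    ]


def predict_classes(probs):
    """Pick the index of the highest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical probabilities for 3 samples and 3 classes (illustrative only,
# not results from the study).
lgbm_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.4, 0.35, 0.25]]
cat_probs = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6], [0.2, 0.6, 0.2]]
y_true = [0, 2, 1]  # hypothetical true class indices

ensemble_probs = average_probabilities(lgbm_probs, cat_probs)
ensemble_pred = predict_classes(ensemble_probs)
print("ensemble predictions:", ensemble_pred)
print("ensemble accuracy:", accuracy(y_true, ensemble_pred))
```

Averaging probabilities rather than hard labels lets a confident prediction from one model outweigh a marginal one from the other, which is one plausible reason an ensemble of this kind could yield more balanced per-class metrics.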


ARTICLE HISTORY

Published: 2025-12-31


How to Cite

Harefa, K., & Priambodo, J. (2025). Implementation of the LightGBM–CatBoost Ensemble Method for Obesity Risk Classification in Productive Age. Bulletin of Computer Science Research, 6(1), 531-538. https://doi.org/10.47065/bulletincsr.v6i1.930
