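Analysis of the Effect of Data Preprocessing and Hyperparameter Tuning on Backpropagation Neural Networks in Stroke Classification (Analisis Pengaruh Preprocessing Data dan Hyperparameter Tuning pada Backpropagation Neural Network dalam Klasifikasi Stroke)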


Authors

  • Asrul Gunawan, Universitas Teknologi Yogyakarta, Sleman, Indonesia
  • Arief Hermawan, Universitas Teknologi Yogyakarta, Sleman, Indonesia
  • Donny Avianto, Universitas Teknologi Yogyakarta, Sleman, Indonesia

DOI:

https://doi.org/10.47065/bulletincsr.v6i2.956

Keywords:

Backpropagation Neural Network; Stroke; Data Preprocessing; Hyperparameter Tuning; Classification

Abstract

Class imbalance and differences in feature scale are often the main factors that degrade the performance of neural network-based classification models. This study analyzes the effect of data preprocessing and hyperparameter tuning on the performance of a Backpropagation Neural Network (BPNN) in stroke classification. The stroke dataset, obtained from the Kaggle platform, contains 5,110 patient records with 10 clinical features. Five evaluation schemes were compared, combining data balancing options (no balancing, SMOTE, and ADASYN) with normalization options (no normalization, MinMaxScaler, and Z-Score). The BPNN has 19 input neurons, 29 neurons in the hidden layer, and 1 output neuron. Hyperparameter tuning searched for the best learning rate and number of epochs. The model in scheme one showed clear limitations, most visibly in identifying the stroke class. Applying SMOTE and MinMaxScaler in scheme two improved performance significantly. The combination of ADASYN and Z-Score in scheme three gave more stable performance and detected stroke cases more accurately. Hyperparameter tuning in schemes four and five improved performance further. The best results were obtained in scheme five, with an accuracy of 96.47%, a precision of 97.34%, a recall of 95.62%, and an F1-score of 96.47%. These findings indicate that combining an adaptive balancing technique, distribution-based normalization, and tuned hyperparameters is very effective in improving the accuracy and stability of BPNN for stroke classification.
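As a minimal illustration of two ideas named in the abstract, the sketch below contrasts min-max and z-score scaling and checks that the reported F1-score agrees with the reported precision and recall. It is not the authors' code: the function names and the sample `ages` values are hypothetical, only the Python standard library is used, and the z-score is assumed to use the population standard deviation.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center values on mean 0 and scale by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population (not sample) std
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical feature values, NOT taken from the stroke dataset.
ages = [20.0, 40.0, 60.0, 80.0]
print(min_max_scale(ages))   # bounded to [0, 1]
print(z_score_scale(ages))   # centered on 0, unit spread

# Consistency check on the metrics reported for scheme five.
print(round(f1_score(0.9734, 0.9562) * 100, 2))  # 96.47
```

The last line reproduces the 96.47% F1-score from the reported precision (97.34%) and recall (95.62%), so the three figures are mutually consistent. The scaling functions show why the choice matters for a network trained by gradient descent: min-max bounds every feature to the same interval, while z-score expresses each value relative to the feature's own spread.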




ARTICLE HISTORY

Published: 2026-02-18


How to Cite

Gunawan, A., Hermawan, A., & Avianto, D. (2026). Analisis Pengaruh Preprocessing Data dan Hyperparameter Tuning pada Backpropagation Neural Network dalam Klasifikasi Stroke. Bulletin of Computer Science Research, 6(2), 682-692. https://doi.org/10.47065/bulletincsr.v6i2.956
