Depression Prediction Among University Students Using a Random Forest Algorithm Based on Psychosocial Data

Student Depression Prediction: A Psychosocial Data-Based Approach Using the Random Forest Algorithm

Authors

  • Abiyya Alfahrizi Putra Arifiansyah Abiyya, Universitas Nurul Jadid
  • Muhammad Afandi, Universitas Nurul Jadid, Probolinggo
  • Dodi Dwi Riskianto, Universitas Nurul Jadid, Probolinggo
  • Sudriyanto Sudriyanto, Universitas Nurul Jadid, Probolinggo

DOI:

https://doi.org/10.30787/restia.v4i1.2100

Keywords:

Student Depression, Machine Learning, Prediction, Random Forest

Abstract

Mental health among college students is receiving increasing attention, particularly depression, which significantly affects quality of life and academic achievement. This study aims to develop a predictive model for depression in college students based on psychosocial data using the Random Forest algorithm. The data are a public secondary dataset from Kaggle containing 1,000 samples and covering demographic variables, lifestyle, and psychological indicators. The analysis comprised data preprocessing, class balancing, model training, and evaluation using accuracy, precision, recall, F1-score, and a confusion matrix. In testing, the Random Forest model predicted depression with 87.0% accuracy, 86.1% precision, 87.4% recall, and an 86.7% F1-score, indicating good and stable performance. A word cloud visualization identified academic pressure, stress, and anxiety as dominant factors. Compared with previous research using the SVM algorithm, Random Forest showed improved performance, particularly in handling complex and imbalanced data. This study confirms the effectiveness of a Random Forest-based machine learning approach in supporting the early detection of depression in college students and provides a foundation for developing mental health monitoring systems in higher education settings.
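The workflow named in the abstract (class balancing, model training, and evaluation with accuracy, precision, recall, F1-score, and a confusion matrix) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration only: the data are synthetic (generated with scikit-learn's make_classification, not the Kaggle dataset), the 80/20 split and the hyperparameters are arbitrary, and random oversampling of the minority class stands in for whichever balancing method the authors actually used. The helper names f1_from_precision_recall, oversample_minority, and run_pipeline are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.utils import resample


def f1_from_precision_recall(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def oversample_minority(X, y, random_state=0):
    """Randomly duplicate minority-class rows until both classes are equal in size.

    Assumption: the source does not name its balancing method; this is one
    simple option among several.
    """
    classes, counts = np.unique(y, return_counts=True)
    n_extra = int(counts.max() - counts.min())
    if n_extra == 0:
        return X, y
    minority = classes[np.argmin(counts)]
    X_extra = resample(X[y == minority], replace=True,
                       n_samples=n_extra, random_state=random_state)
    y_extra = np.full(n_extra, minority)
    return np.vstack([X, X_extra]), np.concatenate([y, y_extra])


def run_pipeline(random_state=0):
    """Train and evaluate a Random Forest on synthetic, imbalanced data."""
    # Synthetic stand-in with 1,000 samples; NOT the dataset used in the study.
    X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                               weights=[0.7, 0.3], random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state)

    # Balance only the training split so the test set keeps its original
    # class distribution.
    X_bal, y_bal = oversample_minority(X_train, y_train, random_state)

    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(X_bal, y_bal)
    y_pred = model.predict(X_test)

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }


if __name__ == "__main__":
    for name, value in run_pipeline().items():
        print(name, value, sep=": ")
```

The metrics printed by this sketch describe the synthetic data only and say nothing about the study's results. As a consistency check on the reported figures, f1_from_precision_recall(0.861, 0.874) evaluates to about 0.867, which agrees with the 86.7% F1-score stated in the abstract.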

Published

2026-02-02

How to Cite

Abiyya, A. A. P. A., Afandi, M., Dwi Riskianto, D., & Sudriyanto, S. (2026). Depression Prediction Among University Students Using a Random Forest Algorithm Based on Psychosocial Data: Prediksi Depresi Mahasiswa: Pendekatan Berbasis Data Psikososial Menggunakan Algoritma Random Forest. Jurnal Riset Sistem Dan Teknologi Informasi, 4(1), 12–21. https://doi.org/10.30787/restia.v4i1.2100
