TY - JOUR
T1 - Stacking ensemble learning model for predict anxiety level in university students using balancing methods
AU - Daza, Alfredo
AU - Arroyo-Paz,
AU - Bobadilla, Juana
AU - Apaza, Oscar
AU - Pinto, Juan
N1 - Publisher Copyright: © 2023
PY - 2023/1
Y1 - 2023/1
N2 - Background: Anxiety is one of the most common health disorders, with a high social and personal impact, and affects about 25% of people worldwide. The same holds for students: in 2018, 63% of high school students in the United States reported having experienced “excessive anxiety” in recent years. Objective: The purpose of this study was to propose a method and 4 combined models based on Stacking to predict anxiety levels in college students. In addition, an end-user web interface was developed with the best model proposed in this study. Methods: The data set consisted of a sample of 284 undergraduate students of Systems and Computer Engineering at a public university. The data were cleaned and preprocessed using Python. For data balancing, the data were divided according to the 5 values obtained and oversampling was applied, distributing the data according to the condition. The balanced data were then partitioned using cross-validation for training. For modeling and evaluation, 5 independent algorithms were used and 4 combined models were proposed. Results: The proposed approach, called Stacking 4A: KNN-Ensemble with data oversampling balancing, obtained the best results in several evaluation metrics: accuracy = 97.83%, sensitivity = 98.44%, F1-score = 97.88%, MCC = 97.08%, and specificity = 99.32%. These results exceeded those obtained by the other algorithms. However, the Stacking 2A: SVM-Ensemble technique with data oversampling balancing achieved the best value in the precision metric, with a result of 97.83%. Conclusions: This article focuses on applying the Ensemble Stacking technique to identify anxiety levels at an early stage among students attending a public university in Peru. Using the combined method improved anxiety prediction, surpassing the performance of the independent algorithms used.
KW - Anxiety
KW - College undergraduate
KW - Intelligent system
KW - Oversampling
KW - Stacking ensemble
UR - http://www.scopus.com/inward/record.url?scp=85169507435&partnerID=8YFLogxK
U2 - 10.1016/j.imu.2023.101340
DO - 10.1016/j.imu.2023.101340
M3 - Article
AN - SCOPUS:85169507435
SN - 2352-9148
VL - 42
JO - Informatics in Medicine Unlocked
JF - Informatics in Medicine Unlocked
M1 - 101340
ER -