Random forest algorithm optimization using k-nearest neighbor and SMOTE on diabetes disease.
Optimize Random Forest for diabetes prediction using K-Nearest Neighbor and SMOTE. Achieves 92.86% accuracy, improving early diagnosis of this chronic disease.
Abstract. Diabetes is a chronic disease that can cause long-term damage, dysfunction, and failure of various organs in the body. It occurs when blood sugar (glucose) levels rise above normal values. Early diagnosis is crucial for managing chronic illnesses such as diabetes. Purpose: This study aims to determine how the K-Nearest Neighbor algorithm, combined with the Synthetic Minority Oversampling Technique (SMOTE), can be used to optimize the Random Forest algorithm for diabetes prediction. Methods/Study design/approach: This study uses the Pima Indian Diabetes Dataset, the Random Forest algorithm for classification, K-Nearest Neighbor for optimization, and SMOTE for oversampling the minority class. Result/Findings: The model using SMOTE and K-Nearest Neighbor reaches a prediction accuracy of 92.86%, whereas the model without SMOTE and K-Nearest Neighbor reaches 83.03%. Novelty/Originality/Value: This research shows that Random Forest combined with K-Nearest Neighbor and SMOTE gives better accuracy than Random Forest without them.
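To make the oversampling step concrete, the following is a minimal sketch of the general SMOTE idea, not the authors' implementation: each synthetic point is placed on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, and the two-feature sample values are all hypothetical and chosen only for illustration; the abstract does not state the settings used in the study.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: interpolate between minority samples and
    their k nearest minority-class neighbours (illustrative only)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the synthetic point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples with two scaled features.
minority = [(0.80, 0.60), (0.85, 0.70), (0.90, 0.65), (0.75, 0.72)]
new_points = smote(minority, n_synthetic=6, k=3)
```

Because every synthetic point is an interpolation between two existing minority samples, the new points stay inside the region already occupied by the minority class. In practice, oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.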
This paper investigates optimizing the Random Forest (RF) algorithm for diabetes prediction by combining it with the K-Nearest Neighbor (KNN) algorithm and the Synthetic Minority Oversampling Technique (SMOTE). Addressing the need for early diagnosis of chronic diseases such as diabetes, the authors evaluate their methodology on the Pima Indian Diabetes Dataset. The core objective is to show that the combined techniques improve the predictive accuracy of a Random Forest model: the reported accuracy rises from a baseline of 83.03% without the optimization strategies to 92.86% with them.

The study makes a valuable contribution by showing how data pre-processing and complementary algorithms can boost classification performance in a medical context. The application of SMOTE is particularly pertinent, because medical datasets often suffer from class imbalance, which can bias model predictions toward the majority class. While Random Forest is a robust classifier, the results suggest that its performance can be refined further, and that combining it with KNN and SMOTE is a promising avenue for improving diagnostic models. The reported gain of 9.83 percentage points is compelling and supports the authors' claim of novelty and value.

Despite the promising results, a few points warrant clarification and discussion. The abstract states "k-nearest neighbor for optimization" but does not explain the mechanism by which KNN optimizes the Random Forest algorithm. A more precise account of KNN's role (feature selection, hyperparameter tuning, or a form of meta-learning) would make the methodology much easier to understand.
Additionally, while accuracy is a primary metric, for imbalanced datasets and medical diagnoses, reporting other crucial metrics such as precision, recall, F1-score, and AUC would provide a more comprehensive evaluation of the model's performance and robustness. Finally, details regarding the specific parameters used for KNN (e.g., the 'k' value) and Random Forest would be beneficial for reproducibility and for a deeper understanding of the experimental setup.
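The point about additional metrics can be illustrated with a small worked example. The sketch below computes accuracy, precision, recall, and F1 from raw label lists; the helper name `classification_metrics` and the label lists are hypothetical and are not results from the paper. It compares a classifier that never predicts the positive class with one that finds half of the positives.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 8 negatives (0) and 2 positives (1).
y_true = [0] * 8 + [1] * 2

# Classifier A always predicts the majority class.
always_negative = classification_metrics(y_true, [0] * 10)

# Classifier B finds one of the two positives and raises one false alarm.
partial_recall = classification_metrics(y_true, [0] * 7 + [1] + [1, 0])
```

Both hypothetical classifiers reach an accuracy of 0.8, yet the first has a recall of 0 and the second a recall of 0.5. In a diagnostic setting that difference matters, which is why accuracy alone is a weak basis for comparing models on imbalanced data.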
Source article: Random Forest Algorithm Optimization using K-Nearest Neighbor and SMOTE on Diabetes Disease, Recursive Journal of Informatics.
By Sciaria