Comparison of Probabilistic Neural Network (PNN) and k-Nearest Neighbor (k-NN) Algorithms for Diabetes Classification. On the Pima Indians Diabetes Database, k-NN shows superior accuracy (78.1%) and speed.
Purpose: This study compares two algorithms for diabetes classification in terms of both accuracy and runtime speed. Methods: Two algorithms are used, namely Probabilistic Neural Network (PNN) and k-Nearest Neighbor (k-NN). The data used is the Pima Indians Diabetes Database, which contains 768 records with 8 attributes and 1 target class: 0 for no diabetes and 1 for diabetes. The dataset was divided into 80% training data and 20% testing data. Result: Accuracy is obtained after implementing k-fold cross-validation with k = 4. The results show that the k-Nearest Neighbor algorithm is more accurate and faster than the Probabilistic Neural Network, obtaining an accuracy of 74.6% with all features and 78.1% with four selected features. Novelty: The novelty of this paper lies in optimizing and improving accuracy by focusing on data preprocessing, feature selection, and k-fold cross-validation in the classification algorithm.
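The paper does not publish its implementation, so as an illustration only, here is a minimal pure-Python sketch of the k-NN classifier the abstract describes (Euclidean distance, majority vote). The toy data and the choice k=3 are assumptions for demonstration, not values from the study.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points under Euclidean distance."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Tiny synthetic example: two well-separated clusters (labels 0 and 1)
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))  # -> 0
print(knn_predict(train_X, train_y, (5.1, 5.0), k=3))  # -> 1
```

In practice the paper would also scale the eight Pima attributes before computing distances, since k-NN is sensitive to feature magnitudes.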
The paper, "Comparison of Probabilistic Neural Network (PNN) and k-Nearest Neighbor (k-NN) Algorithms for Diabetes Classification," presents a study aimed at evaluating and comparing the accuracy and speed of PNN and k-NN algorithms for classifying diabetes. Utilizing the widely used Pima Indians Diabetes Database, comprising 768 instances, the authors report dividing the data into 80% training and 20% testing sets. The core finding is that the k-Nearest Neighbor algorithm outperforms the Probabilistic Neural Network in both accuracy and speed. Specifically, k-NN achieved an accuracy of 74.6% with all features and an improved 78.1% when a selection of four features was used, with evaluation reportedly carried out through k-fold cross-validation (k=4).

The study's strength lies in its clear objective: comparing two distinct machine learning paradigms on a common medical dataset while addressing both accuracy and computational efficiency. However, the abstract's claim of "novelty" in "optimizing and improving accuracy by focusing on data preprocessing, feature selection and k-fold cross validation" requires further substantiation. While these are crucial steps in model development, merely *including* them does not inherently constitute novelty unless a specific, non-standard, or significantly improved methodology for these steps is proposed and detailed.

Furthermore, the abstract mentions both an 80% training/20% testing split and the use of k-fold cross-validation (k=4) for accuracy derivation. The interplay between these two experimental design choices needs clarification: was k-fold applied *within* the training set of the 80/20 split, or was the 80/20 split used for initial exploration while k-fold served as the final evaluation strategy on the full dataset? The absence of explicit accuracy figures for PNN also limits a direct quantitative comparison of the algorithms from the abstract alone.
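To make the ambiguity concrete, one plausible reading of the paper's protocol can be sketched as 4-fold cross-validation applied *inside* the 80% training portion, with the 20% held out for final testing. The fold-assignment code below is an illustrative assumption, not the authors' actual procedure; the row counts follow from the 768-record Pima dataset.

```python
import random

def kfold_indices(n, k=4, seed=0):
    """Shuffle indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# 80/20 split first, then 4-fold CV within the 80% training portion.
n_total = 768
n_train = int(0.8 * n_total)   # 614 rows for CV, 154 rows held out for testing
folds = kfold_indices(n_train, k=4)
for val_fold in folds:
    train_idx = [j for f in folds if f is not val_fold for j in f]
    # ...fit the classifier on train_idx, score it on val_fold...
    assert len(train_idx) + len(val_fold) == n_train
```

Under the alternative reading (k-fold on the full 768 rows), the 80/20 split would play no role in the reported accuracies, which is exactly why the paper should state which design it used.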
To strengthen the paper, it would be beneficial to elaborate on the specific techniques employed for data preprocessing and feature selection, demonstrating how these contribute to the claimed optimization and novelty. A precise definition and measurement of "quickness" for both algorithms would enhance the understanding of their computational efficiency. Providing the accuracy results for the PNN algorithm would allow readers to fully appreciate the extent of k-NN's reported superiority. Additionally, considering the sensitivity of diabetes diagnosis, the practical implications of an accuracy around 78% should be discussed in a clinical context, possibly comparing it to existing diagnostic methods or state-of-the-art machine learning models in this domain. Future work could explore more advanced feature engineering, ensemble methods, or different hyperparameter tuning strategies to potentially achieve higher diagnostic accuracy and robustness.
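Since the review asks for PNN results to be reported, it is worth recalling what a PNN computes: it is essentially a Parzen-window classifier in which each class's score is an average Gaussian kernel between the query and that class's training points. The sketch below is a generic textbook formulation, not the paper's implementation; the smoothing parameter `sigma` and the toy data are assumptions.

```python
import math

def pnn_predict(train_X, train_y, query, sigma=1.0):
    """Probabilistic Neural Network as a Parzen-window classifier:
    score each class by the mean Gaussian kernel between `query` and
    that class's training points, then predict the highest-scoring class."""
    scores = {}
    for x, y in zip(train_X, train_y):
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        scores.setdefault(y, []).append(math.exp(-d2 / (2 * sigma ** 2)))
    return max(scores, key=lambda c: sum(scores[c]) / len(scores[c]))

# Same toy clusters as before: PNN agrees with k-NN on easy cases
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_y = [0, 0, 0, 1, 1, 1]
print(pnn_predict(train_X, train_y, (1.1, 1.0)))  # -> 0
print(pnn_predict(train_X, train_y, (5.1, 5.0)))  # -> 1
```

Because the PNN's sole hyperparameter is `sigma` (the kernel bandwidth), reporting how it was tuned would be as important as reporting the final PNN accuracy itself.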
Source: Recursive Journal of Informatics.
By Sciaria