Comparison of Naive Bayes Classifier and K-Nearest Neighbor Algorithms with Information Gain and Adaptive Boosting for Sentiment Analysis of Spotify App Reviews

Meidika Bagus Saputro, Alamsyah Alamsyah

Informatics

Recursive Journal of Informatics

Introduction

Compares Naive Bayes and K-Nearest Neighbor algorithms, each combined with Information Gain and Adaptive Boosting, for sentiment analysis of Spotify app reviews. Naive Bayes achieves the higher accuracy, 87.28%.

Abstract

Abstract. Technology is developing rapidly, and the volume of data in the world is growing with it. This data can serve many purposes across many fields. Entertainment is one field that attracts great interest from users. Spotify is an example of an entertainment app, available on the Google Play Store, that provides online music streaming to its users. Because the app is distributed through the Google Play Store, it accumulates many user reviews, which can be classified as positive, negative, or neutral. One way to classify user reviews is sentiment analysis. In this paper, we classify the reviews with the naïve Bayes classifier and k-nearest neighbor and compare the two, adding information gain as feature selection and adaptive boosting as the boosting algorithm for each classifier. Naïve Bayes with information gain and adaptive boosting reaches an accuracy of 87.28%, and k-nearest neighbor with information gain and adaptive boosting reaches 80.35%.

Purpose: To determine the accuracy of the naïve Bayes classifier and the k-nearest neighbor algorithm when each is combined with information gain and adaptive boosting, and to show step by step how sentiment analysis is carried out with the methods chosen in this study.

Methods/Study design/approach: This study applied data preprocessing, lexicon-based labelling with TextBlob, normalization, word vectorization using TF-IDF, and classification with the naïve Bayes classifier and k-nearest neighbor, with information gain as feature selection and adaptive boosting as the boosting algorithm to raise the accuracy of the classification result.

Result/Findings: The accuracy of the naïve Bayes classifier with information gain and adaptive boosting is 87.28%. Meanwhile, k-nearest neighbor with information gain and adaptive boosting reaches an accuracy of 80.35%. These results were obtained using a dataset of 60,000 entries, split 80% for training and 20% for testing.

Novelty/Originality/Value: Applying information gain as feature selection and adaptive boosting as the boosting algorithm to the naïve Bayes classifier is shown to increase classification accuracy, but the same does not hold for k-nearest neighbor. Future research can therefore apply other classification algorithms or feature selection methods to obtain better results.
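
The abstract names information gain as the feature selection step but does not give its formula or implementation. A minimal sketch of the usual definition follows: the entropy of the class labels minus the weighted entropy after splitting the reviews on whether a term is present. The toy reviews, the labels, and the function names `entropy`, `information_gain`, and `top_k_terms` are all illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """H(class) minus the weighted class entropy after splitting on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest information gain (ties broken alphabetically)."""
    vocabulary = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocabulary]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy data: each review is reduced to its set of tokens.
reviews = ["love this app", "great music app", "terrible ads", "ads ruin music"]
labels = ["positive", "positive", "negative", "negative"]
docs = [set(r.split()) for r in reviews]

selected = top_k_terms(docs, labels, 2)
print(selected)
```

In this toy set, "ads" and "app" each separate the two classes perfectly, while "music" appears once in each class and so carries no information. In a full pipeline, only the selected terms would be kept as TF-IDF features before classification.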

Review

This paper presents a comparative study of Naive Bayes Classifier and K-Nearest Neighbor algorithms for sentiment analysis of Spotify app reviews, enhanced with Information Gain for feature selection and Adaptive Boosting. The research addresses a relevant problem in the era of burgeoning user-generated data, particularly for understanding user sentiment towards popular applications like Spotify. The reported accuracies, notably 87.28% for Naive Bayes with the enhancements and 80.35% for K-Nearest Neighbor, demonstrate the potential of these combined techniques for classifying positive, negative, or neutral app reviews. The methodology outlines a clear sequence from data preprocessing to classification, including the use of TextBlob for lexicon-based labeling and TF-IDF for word vectorization, which provides a structured approach to the problem.

While the study offers valuable insights into the performance of the chosen algorithms, several aspects could benefit from further detail and clarification. The abstract mentions that Information Gain and Adaptive Boosting "can increase the accuracy," yet it only presents the boosted accuracies without a direct comparison to baseline performances (i.e., Naive Bayes or KNN without these enhancements). This makes it difficult to quantitatively assess the degree of improvement attributed to IG and AdaBoost. Additionally, the role of TextBlob for "lexicon based labelling" needs clearer integration with the subsequent classification step; it is implied that this is for initial data labeling, but explicit context on how the human-labeled or TextBlob-labeled data is used for training the classifiers would strengthen the methodological clarity. Further, the hyperparameters for KNN and AdaBoost are not mentioned, although they are crucial for reproducibility and for understanding optimal performance.
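
To make the labeling question concrete, here is a hedged sketch of one common way lexicon-based labels are derived from a polarity score. TextBlob exposes a polarity in the range -1 to 1 through `TextBlob(text).sentiment.polarity`. The cut-offs used below (above zero is positive, below zero is negative, zero is neutral) and the `neutral_band` parameter are assumptions for illustration, since the abstract does not state the thresholds the authors used.

```python
def label_from_polarity(polarity, neutral_band=0.0):
    """Map a polarity score in [-1, 1] to a three-class sentiment label.

    neutral_band is a hypothetical tolerance: scores whose absolute value
    does not exceed it are labelled neutral.
    """
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


# With TextBlob installed, the score would come from:
#     from textblob import TextBlob
#     polarity = TextBlob(review_text).sentiment.polarity
# The scores below are made-up stand-ins so the sketch runs without it.
example_scores = [0.6, -0.25, 0.0, 0.05]
example_labels = [label_from_polarity(s) for s in example_scores]
banded_labels = [label_from_polarity(s, neutral_band=0.1) for s in example_scores]
print(example_labels)
print(banded_labels)
```

Labels produced this way would then serve as the training targets for the classifiers, which means the reported accuracy measures agreement with the lexicon rather than with human judgment.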
For future research, it would be beneficial to delve deeper into the reasons behind the differential impact of Information Gain and Adaptive Boosting on Naive Bayes versus K-Nearest Neighbor. Exploring error analysis for both models could reveal specific patterns of misclassification, offering insights into their limitations and guiding further improvements. Investigating other advanced feature selection techniques or ensemble methods, as well as alternative classification algorithms beyond Naive Bayes and KNN, could yield even higher accuracies or offer more robust solutions. Lastly, considering the nuances of multi-class sentiment analysis (positive, negative, neutral) and potentially incorporating deep learning approaches could provide a comprehensive understanding of user feedback on such platforms.
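
The error analysis suggested above could start from a confusion count and per-class recall, which show which sentiment classes a model confuses most often. The following is a minimal sketch on made-up labels. The data and the function names `confusion_counts`, `per_class_recall`, and `most_common_error` are illustrative and do not come from the paper.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    counts = confusion_counts(y_true, y_pred)
    support = Counter(y_true)
    return {label: counts[(label, label)] / support[label] for label in sorted(support)}


def most_common_error(y_true, y_pred):
    """Return the most frequent (true, predicted) mismatch, or None if there is none."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(1)[0][0] if errors else None


# Hypothetical test labels and predictions.
y_true = ["positive", "positive", "neutral", "negative", "neutral", "negative", "neutral"]
y_pred = ["positive", "neutral", "neutral", "negative", "positive", "negative", "positive"]

recall = per_class_recall(y_true, y_pred)
worst = most_common_error(y_true, y_pred)
print(recall)
print(worst)
```

Running the same summary on the naïve Bayes and k-nearest neighbor predictions side by side would show whether the two models fail on the same classes.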
