Sentiment Analysis on Twitter Social Media Regarding Covid-19 Vaccination with Naive Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT)

Angga Riski Dwi Saputra, Budi Prasetiyo

Informatics

Recursive Journal of Informatics

Introduction

This study analyzes public sentiment on Covid-19 vaccination expressed on Twitter using the Naive Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT). It compares the accuracy of the two algorithms (73% for NBC, 83% for BERT) in classifying opinions as positive, neutral, or negative.

Abstract

The Covid-19 vaccine is an important tool for stopping the Covid-19 pandemic; however, the public has voiced both support for and opposition to it. Purpose: The public has conveyed these responses in many ways, one of which is through social media such as Twitter. Public responses to Covid-19 vaccination can be analyzed and categorized as having positive, neutral, or negative sentiment. Methods: In this study, sentiment analysis of Twitter posts about Covid-19 vaccination was carried out using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. The dataset consists of 29,447 public English-language tweets about Covid-19 vaccination. Result: Sentiment analysis begins with preprocessing, which normalizes and cleans the dataset before classification. Word vectorization was then performed with TF-IDF, and the data were classified using the NBC and BERT algorithms. The classification achieved an accuracy of 73% for NBC and 83% for BERT. Novelty: A direct comparison between a classical model such as NBC and a modern deep learning model such as BERT offers new insights into the advantages and disadvantages of both approaches in processing Twitter data. Additionally, this study proposes temporal sentiment analysis, which allows changes in public sentiment regarding vaccination to be evaluated over time. Another innovation is the implementation of a hybrid approach to data cleansing that combines traditional methods with the natural language processing capabilities of BERT, which more effectively addresses typical Twitter data issues such as slang and spelling errors. Finally, this research also expands sentiment classification to be multi-label, identifying more specific sentiment categories such as trust, fear, or doubt, which provides a deeper understanding of public opinion.
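
The classical branch of the pipeline described in the abstract (cleaning, TF-IDF vectorization, Naive Bayes classification) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which cleaning rules or which Naive Bayes variant were used, so the regular expressions, the multinomial variant with Laplace smoothing, the smoothed IDF formula, and every function name here are hypothetical. The six training tweets are invented for illustration and are not drawn from the study's dataset. The BERT branch is not shown.

```python
import math
import re
from collections import Counter, defaultdict


def clean_tweet(text):
    """Lowercase, then strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters; hashtag words survive
    return text.split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every training token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1 for tok, c in df.items()}


def tfidf(doc, idf):
    """Term frequency times IDF; tokens unseen in training are ignored."""
    counts = Counter(tok for tok in doc if tok in idf)
    return {tok: c * idf[tok] for tok, c in counts.items()}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    weight_sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for tok, w in vec.items():
            weight_sums[label][tok] += w
    model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(weight_sums[label].values()) + alpha * len(vocab)
        log_lik = {tok: math.log((weight_sums[label][tok] + alpha) / total)
                   for tok in vocab}
        model[label] = (math.log(n_docs / len(labels)), log_lik)
    return model


def predict(model, vec):
    """Return the label with the highest log-posterior score."""
    scores = {label: prior + sum(w * log_lik[tok] for tok, w in vec.items())
              for label, (prior, log_lik) in model.items()}
    return max(scores, key=scores.get)


# Invented toy tweets (assumption: not taken from the paper's data).
train = [
    ("Got my vaccine today, feeling great and grateful! https://t.co/x", "positive"),
    ("So grateful the vaccine is safe and effective", "positive"),
    ("The vaccine made me sick, terrible side effects", "negative"),
    ("I do not trust this vaccine, terrible idea @user", "negative"),
    ("Vaccine appointments open at the clinic on Monday", "neutral"),
    ("The clinic published the vaccine schedule", "neutral"),
]
docs = [clean_tweet(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, vocab=set(idf))


def classify(text):
    """Clean, vectorize, and classify one raw tweet."""
    return predict(model, tfidf(clean_tweet(text), idf))


print(classify("feeling grateful and safe"))
```

Feeding TF-IDF weights into a multinomial model treats them as fractional counts, which is a common practical choice rather than something the abstract confirms. A real experiment on the 29,447 tweets would also need a held-out test split before any accuracy figure could be compared with the reported 73%.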

Review

This paper presents a timely and relevant study on public sentiment regarding Covid-19 vaccination, utilizing Twitter as the data source. The research aims to categorize public responses into positive, neutral, or negative sentiments by employing two distinct classification algorithms: the classical Naïve Bayes Classifier (NBC) and the more modern Bidirectional Encoder Representations from Transformers (BERT). Using a substantial dataset of nearly 30,000 English tweets, the study outlines a methodology that includes data preprocessing, word vectorization (TF-IDF), and subsequent classification. The primary outcome indicates that BERT achieved a higher accuracy of 83% compared to NBC's 73%, suggesting the superior performance of deep learning models for this task. While the study tackles a significant societal issue and provides a clear comparison of two prominent NLP techniques, the abstract's "Novelty" section raises some questions regarding the scope of the reported work. Several claims, such as "this study *proposes* temporal sentiment analysis," "Another innovation is the *implementation* of a hybrid approach to data cleansing," and "this research also *expands* sentiment classification to be multi-label," appear to describe prospective or partially implemented features rather than fully integrated and reported results. For instance, the result section only mentions positive, neutral, or negative sentiment categories, not the "trust, fear, or doubt" of multi-label classification. Furthermore, while TF-IDF is appropriate for NBC, its role in BERT's classification, which typically uses its own internal tokenization and embeddings, would require clarification. Comprehensive evaluation metrics beyond just accuracy, such as precision, recall, and F1-score, would also strengthen the assessment of model performance, especially given potential class imbalance in sentiment data. 
Overall, this research addresses a crucial topic with a sound comparative methodology, offering valuable insights into public perception dynamics during a global health crisis. The demonstrated superiority of BERT over NBC for Twitter sentiment analysis is a relevant finding. To fully realize its potential and substantiate the claims made, the authors should ensure that the reported results clearly align with all aspects presented in the "Novelty" section, or clearly delineate which parts are completed work versus future directions. With clearer articulation of its contributions and a more comprehensive reporting of its implementation details and evaluation metrics, this paper could make a significant contribution to the field of social media sentiment analysis in public health.
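
The review's request for precision, recall, and F1-score alongside accuracy can be made concrete with a short sketch. The function names and the label lists below are hypothetical and invented for illustration; they are not results from the paper. The example uses a deliberately imbalanced set of labels to show how a classifier that always predicts the majority class can still reach a high accuracy while its macro-averaged F1 stays low.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 for each label seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed an instance of the true label
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(report):
    """Unweighted mean of per-class F1, so small classes count equally."""
    return sum(m["f1"] for m in report.values()) / len(report)


# Invented, imbalanced labels: 8 positive, 1 neutral, 1 negative.
y_true = ["positive"] * 8 + ["neutral", "negative"]
y_pred = ["positive"] * 10  # a classifier that always answers "positive"

report = per_class_metrics(y_true, y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", round(macro_f1(report), 3))
for label, scores in report.items():
    print(label, scores)
```

In this toy case the neutral and negative classes are never predicted, so their recall is zero even though overall accuracy looks respectable. This is why per-class metrics matter when sentiment classes are unevenly represented.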
