Hyperparameter Tuning of Long Short-Term Memory Model for Clickbait Classification in News Headlines

Grace Yudha Satriawan, Budi Prasetiyo

Informatics

Recursive Journal of Informatics

Introduction

This study improves clickbait detection in news headlines using hyperparameter-tuned LSTM models. The best model reaches a validation accuracy of 0.8030, supporting efforts to combat hoaxes and protect journalism quality.

Abstract

Abstract. Information on the internet today is diverse and moves very quickly. Numerous online media outlets, including news portals that provide up-to-date information, make it increasingly easy for the general public to obtain. Many news portals earn revenue from pay-per-click advertising, which encourages article writers to use clickbait techniques to attract visitors. The negative effects of clickbait include a decline in journalism quality and the spread of hoaxes. This problem can be mitigated by using text classification to detect clickbait in news titles. One method for text classification is a neural network. Artificial neural networks use algorithms that independently adjust the weights applied to their inputs, which makes them highly effective for modeling non-linear statistical data. Neural network architectures, especially Long Short-Term Memory (LSTM), have been widely used across natural language processing tasks, including text classification, with satisfying results. A neural network model's performance can be improved by adjusting its hyperparameters, which are parameters that cannot be learned from the data and must be defined before training. In this research, an LSTM model was used for clickbait classification in news titles. Sixteen neural network models were trained, each with a different hyperparameter configuration. Hyperparameter tuning was carried out using the random search algorithm. The dataset was CLICK-ID, published by William & Sari, 2020[1], with a total of 15,000 annotated headlines. The results show that the developed LSTM model has a validation accuracy of 0.8030, higher than that reported by William & Sari, and a validation loss of 0.4876. Using this model, the researchers were able to classify clickbait in news titles with fairly good accuracy.
Purpose: This study aims to develop and evaluate an LSTM model with hyperparameter tuning for clickbait classification in news headlines. It also compares the performance of simple LSTM and bidirectional LSTM for this task.
Methods: This study uses the CLICK-ID dataset and applies different text preprocessing techniques. The dataset was then used to build and train 16 LSTM models with different hyperparameters, which were evaluated using validation accuracy and loss. Random search was used for hyperparameter tuning.
Result: The best model for clickbait classification in news headlines is a bidirectional LSTM with one layer, 64 units, a 0.2 dropout rate, and a 0.001 learning rate. This model achieves a validation accuracy of 0.8030 and a validation loss of 0.4876. The results also show that hyperparameter tuning using random search can improve the performance of the LSTM models by avoiding zero probabilities and finding optimal values for the hyperparameters.
Novelty: This study compares and analyzes different text preprocessing methods and different model configurations to find the best model for clickbait classification in news headlines. It also uses hyperparameter tuning to find optimal hyperparameter values.
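
The random search procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The search space is hypothetical: the abstract names only the winning values (bidirectional, one layer, 64 units, 0.2 dropout, 0.001 learning rate), so every other candidate value is an assumption. The names `SEARCH_SPACE`, `sample_config`, `random_search`, and `dummy_train_and_validate` are invented for illustration, and the training function is a stand-in that returns an arbitrary synthetic score rather than training an LSTM.

```python
import random

# Hypothetical search space. Only the values 1 layer, 64 units, 0.2 dropout,
# 0.001 learning rate, and the bidirectional option come from the abstract;
# all other candidates are assumptions for illustration.
SEARCH_SPACE = {
    "bidirectional": [False, True],
    "num_layers": [1, 2],
    "units": [32, 64, 128],
    "dropout": [0.2, 0.3, 0.5],
    "learning_rate": [0.01, 0.001, 0.0001],
}


def sample_config(rng):
    """Draw one value per hyperparameter, uniformly and independently."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def random_search(train_and_validate, n_trials=16, seed=0):
    """Evaluate n_trials distinct random configurations and return (trials, best)."""
    rng = random.Random(seed)
    seen = set()
    trials = []
    while len(trials) < n_trials:
        config = sample_config(rng)
        key = tuple(sorted(config.items()))
        if key in seen:  # skip repeats so every trial is a distinct model
            continue
        seen.add(key)
        val_accuracy, val_loss = train_and_validate(config)
        trials.append(
            {"config": config, "val_accuracy": val_accuracy, "val_loss": val_loss}
        )
    # Select the configuration with the highest validation accuracy.
    best = max(trials, key=lambda t: t["val_accuracy"])
    return trials, best


def dummy_train_and_validate(config):
    """Stand-in for real training: returns an arbitrary synthetic score.

    A real version would build the LSTM from `config`, fit it on the training
    split, and return the measured validation accuracy and loss.
    """
    loss = config["dropout"] + config["learning_rate"]
    accuracy = 1.0 - loss
    return accuracy, loss


if __name__ == "__main__":
    trials, best = random_search(dummy_train_and_validate, n_trials=16)
    print("Best configuration:", best["config"])
```

Replacing `dummy_train_and_validate` with a function that trains a model and returns its measured validation metrics turns the sketch into a working tuner. Unlike an exhaustive grid over the same space (108 combinations here), random search evaluates only a fixed budget of 16 sampled configurations.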

Review

The submitted manuscript, "Hyperparameter Tuning of Long Short-Term Memory Model for Clickbait Classification in News Headlines," addresses a highly pertinent issue in today's digital information landscape: the detection of clickbait. With the proliferation of online news and the adverse effects of clickbait on journalistic integrity and the spread of misinformation, effective automated classification methods are crucial. This study aims to develop and evaluate a Long Short-Term Memory (LSTM) model, a type of recurrent neural network known for its efficacy in natural language processing tasks, for classifying clickbait in news headlines. A key aspect of their approach is the systematic hyperparameter tuning of the LSTM model to optimize its performance, acknowledging that neural network effectiveness is highly dependent on these pre-defined parameters. The research employed the CLICK-ID dataset, comprising 15,000 annotated news headlines, as its primary data source. To achieve optimal model performance, the authors trained sixteen distinct neural network models, each with varying hyperparameter configurations. The methodology included a comparison between simple LSTM and bidirectional LSTM architectures, and hyperparameter tuning was conducted using a random search algorithm. The abstract reports that the most successful model was a bidirectional LSTM, configured with a single layer, 64 units, a 0.2 dropout rate, and a 0.001 learning rate. This optimized model achieved a validation accuracy of 0.8030 and a validation loss of 0.4876. Notably, the authors state that this performance surpasses that reported in the foundational research by William & Sari, 2020, upon which their dataset is based. The study's findings demonstrate a promising approach to improving clickbait classification accuracy through diligent hyperparameter tuning. 
The reported validation accuracy of 0.8030 for the best-performing bidirectional LSTM model is a significant result, especially when presented as an improvement over existing benchmarks. The novelty of the research lies in its thorough comparative analysis of different text preprocessing methods and various model configurations, coupled with the application of random search to pinpoint optimal hyperparameter values. This systematic exploration not only yields a robust model but also provides valuable insights into the effective design choices for LSTM-based clickbait detectors. The contribution of this work is a well-tuned and effective model capable of classifying clickbait, thereby contributing to efforts to maintain information quality in online news environments.
