Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models

Read original: arXiv:2407.19914 - Published 7/30/2024 by Brigita Vileikytė, Mantas Lukoševičius, Lukas Stankevičius

Overview

Sentiment analysis is a challenging field within Natural Language Processing (NLP) due to the complexity of languages and subjective nature of sentiments.
Research on sentiment analysis for Lithuanian is sparse, and the traditional machine learning methods and classification algorithms used so far have shown limited effectiveness.
This paper addresses sentiment analysis of Lithuanian online reviews, exploring the capabilities of pre-trained multilingual Large Language Models (LLMs) such as BERT and T5.

Plain English Explanation

Sentiment analysis is the process of understanding the emotional tone or attitude expressed in text, such as whether a review is positive, negative, or neutral. This is an important task in Natural Language Processing, as it allows companies and researchers to gain insights into people's opinions and feelings about products, services, or topics.

However, sentiment analysis can be quite challenging. Languages are complex, with nuanced ways of expressing emotions, and sentiment is inherently subjective. This makes it difficult for machines to accurately detect and classify the sentiment in text.

The Lithuanian language, in particular, has not been extensively studied for sentiment analysis. Previous research has shown that traditional machine learning methods and classification algorithms have limited effectiveness for this task in Lithuanian.

In this work, the researchers address sentiment analysis of Lithuanian online reviews. They explore transformer models, the neural network architecture underlying modern large language models, which have shown strong performance across many NLP tasks. Specifically, they fine-tune pre-trained multilingual BERT and T5 models for Lithuanian sentiment analysis.

The researchers find that the fine-tuned models perform quite well, particularly when the sentiment expressed in a review is less ambiguous. On the test set they recognize one-star reviews with 80.74% accuracy and five-star reviews with 89.61% accuracy; these are the two most popular ratings in the dataset. According to the paper, these results significantly outperform GPT-4, the commercial state-of-the-art general-purpose LLM at the time of publication.

Technical Explanation

The researchers collected and cleaned a dataset of Lithuanian five-star-based online reviews from multiple domains. They then applied transformer models, specifically fine-tuning pre-trained BERT and T5 models, to perform sentiment analysis on this data.
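As a rough illustration of how five-star reviews might be prepared for the two model families, the sketch below turns a review into an integer class label, as an encoder with a classification head (BERT-style) typically expects, and into a source/target string pair, as a text-to-text model (T5-style) typically expects. This summary does not describe the authors' actual preprocessing, so the `Review` type, the function names, the whitespace-only cleaning step, and the `classify review:` prefix are all hypothetical.

```python
from dataclasses import dataclass

NUM_CLASSES = 5  # one class per star rating, 1..5


@dataclass
class Review:
    text: str
    stars: int  # rating on the five-star scale


def clean(text: str) -> str:
    """Collapse runs of whitespace; a placeholder for a real cleaning step."""
    return " ".join(text.split())


def check_stars(stars: int) -> None:
    """Reject ratings outside the five-star scale."""
    if not 1 <= stars <= NUM_CLASSES:
        raise ValueError(f"stars must be in 1..{NUM_CLASSES}, got {stars}")


def to_classification_example(review: Review) -> tuple[str, int]:
    """BERT-style example: cleaned text plus a zero-based class index."""
    check_stars(review.stars)
    return clean(review.text), review.stars - 1


def to_text_to_text_example(review: Review) -> tuple[str, str]:
    """T5-style example: prefixed source string and the rating as target text."""
    check_stars(review.stars)
    # The task prefix is an arbitrary choice made for this sketch.
    return f"classify review: {clean(review.text)}", str(review.stars)
```

The two functions differ only in how the label is represented: a class index for a classification head, or a short string that a sequence-to-sequence model is trained to generate.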

The fine-tuned models performed well given the difficulty of the task, especially for less ambiguous sentiments. Testing recognition accuracy reached 80.74% for one-star reviews and 89.61% for five-star reviews, the two most popular ratings. The authors report that these results significantly outperform GPT-4, the current commercial state-of-the-art general-purpose LLM.
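The per-rating figures above can be read as per-class accuracy: of all test reviews that truly carry a given star rating, the fraction the model labels correctly. That reading is an assumption about how the reported numbers were computed, and the sketch below is only a minimal illustration of the calculation. The function name and the toy labels are invented and are not the paper's data.

```python
from collections import Counter


def per_class_accuracy(y_true: list[int], y_pred: list[int]) -> dict[int, float]:
    """For each true class, the fraction of its examples predicted correctly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in sorted(totals)}


# Toy star ratings, invented for illustration only.
y_true = [1, 1, 1, 1, 5, 5, 5, 5, 5, 3]
y_pred = [1, 1, 1, 2, 5, 5, 5, 5, 4, 3]
scores = per_class_accuracy(y_true, y_pred)
```

On this toy data, three of the four one-star reviews and four of the five five-star reviews are recognized, so `scores[1]` is 0.75 and `scores[5]` is 0.8.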

The researchers openly shared their fine-tuned large language models online, making them available for further research and development.

Critical Analysis

The researchers acknowledge the inherent difficulty of the sentiment analysis task, particularly for less-studied and less-resourced languages like Lithuanian. They note that while the fine-tuned models perform quite well, especially for clear-cut sentiments, the task remains challenging due to the subjective nature of language and emotions.

One potential limitation of the study is the size and domain diversity of the dataset used for fine-tuning the models. Expanding the dataset to include a broader range of review topics and styles could help the models better generalize to a wider variety of Lithuanian text.

Additionally, the researchers do not provide a detailed error analysis or exploration of the types of errors the models make. Understanding the specific challenges and limitations of the fine-tuned models could help guide future research and development efforts in this area.

Overall, this research represents an important step forward in advancing sentiment analysis for the Lithuanian language. The successful application of transformer models, such as BERT and T5, provides a promising direction for continued improvement and expansion of Lithuanian NLP capabilities.

Conclusion

This paper addresses the challenge of sentiment analysis for Lithuanian, a less-studied and less-resourced language. The researchers fine-tune pre-trained BERT and T5 transformer models to perform sentiment analysis on a dataset of Lithuanian online reviews.

The fine-tuned models demonstrate strong performance, particularly for less ambiguous sentiments, significantly outperforming the current state-of-the-art general-purpose language model, GPT-4. This research represents an important advancement in Lithuanian NLP and provides a foundation for further exploration and development of sentiment analysis tools for the Lithuanian language.

By openly sharing their fine-tuned models, the researchers are enabling other researchers and developers to build upon this work and continue advancing the field of Lithuanian sentiment analysis.

Related Papers

Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models

Brigita Vileikytė, Mantas Lukoševičius, Lukas Stankevičius

Sentiment analysis is a widely researched area within Natural Language Processing (NLP), attracting significant interest due to the advent of automated solutions. Despite this, the task remains challenging because of the inherent complexity of languages and the subjective nature of sentiments. It is even more challenging for less-studied and less-resourced languages such as Lithuanian. Our review of existing Lithuanian NLP research reveals that traditional machine learning methods and classification algorithms have limited effectiveness for the task. In this work, we address sentiment analysis of Lithuanian five-star-based online reviews from multiple domains that we collect and clean. We apply transformer models to this task for the first time, exploring the capabilities of pre-trained multilingual Large Language Models (LLMs), specifically focusing on fine-tuning BERT and T5 models. Given the inherent difficulty of the task, the fine-tuned models perform quite well, especially when the sentiments themselves are less ambiguous: 80.74% and 89.61% testing recognition accuracy of the most popular one- and five-star reviews respectively. They significantly outperform current commercial state-of-the-art general-purpose LLM GPT-4. We openly share our fine-tuned LLMs online.

7/30/2024

The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in the Era of Large Language Models

Xiliang Zhu, Shayna Gardiner, Tere Roldán, David Rossouw

Sentiment analysis serves as a pivotal component in Natural Language Processing (NLP). Advancements in multilingual pre-trained models such as XLM-R and mT5 have contributed to the increasing interest in cross-lingual sentiment analysis. The recent emergence in Large Language Models (LLM) has significantly advanced general NLP tasks, however, the capability of such LLMs in cross-lingual sentiment analysis has not been fully studied. This work undertakes an empirical analysis to compare the cross-lingual transfer capability of public Small Multilingual Language Models (SMLM) like XLM-R, against English-centric LLMs such as Llama-3, in the context of sentiment analysis across English, Spanish, French and Chinese. Our findings reveal that among public models, SMLMs exhibit superior zero-shot cross-lingual performance relative to LLMs. However, in few-shot cross-lingual settings, public LLMs demonstrate an enhanced adaptive potential. In addition, we observe that proprietary GPT-3.5 and GPT-4 lead in zero-shot cross-lingual capability, but are outpaced by public models in few-shot scenarios.

6/28/2024

Comprehensive Study on Sentiment Analysis: From Rule-based to modern LLM based system

Shailja Gupta, Rajesh Ranjan, Surya Narayan Singh

This paper provides a comprehensive survey of sentiment analysis within the context of artificial intelligence (AI) and large language models (LLMs). Sentiment analysis, a critical aspect of natural language processing (NLP), has evolved significantly from traditional rule-based methods to advanced deep learning techniques. This study examines the historical development of sentiment analysis, highlighting the transition from lexicon-based and pattern-based approaches to more sophisticated machine learning and deep learning models. Key challenges are discussed, including handling bilingual texts, detecting sarcasm, and addressing biases. The paper reviews state-of-the-art approaches, identifies emerging trends, and outlines future research directions to advance the field. By synthesizing current methodologies and exploring future opportunities, this survey aims to understand sentiment analysis in the AI and LLM context thoroughly.

9/17/2024

Do Large Language Models Possess Sensitive to Sentiment?

Yang Liu, Xichou Zhu, Zhou Shen, Yi Liu, Min Li, Yujun Chen, Benzi John, Zhenzhen Ma, Tao Hu, Zhiyang Xu, Wei Luo, Junhui Wang

Large Language Models (LLMs) have recently displayed their extraordinary capabilities in language understanding. However, how to comprehensively assess the sentiment capabilities of LLMs continues to be a challenge. This paper investigates the ability of LLMs to detect and react to sentiment in text modal. As the integration of LLMs into diverse applications is on the rise, it becomes highly critical to comprehend their sensitivity to emotional tone, as it can influence the user experience and the efficacy of sentiment-driven tasks. We conduct a series of experiments to evaluate the performance of several prominent LLMs in identifying and responding appropriately to sentiments like positive, negative, and neutral emotions. The models' outputs are analyzed across various sentiment benchmarks, and their responses are compared with human evaluations. Our discoveries indicate that although LLMs show a basic sensitivity to sentiment, there are substantial variations in their accuracy and consistency, emphasizing the requirement for further enhancements in their training processes to better capture subtle emotional cues. Take an example in our findings, in some cases, the models might wrongly classify a strongly positive sentiment as neutral, or fail to recognize sarcasm or irony in the text. Such misclassifications highlight the complexity of sentiment analysis and the areas where the models need to be refined. Another aspect is that different LLMs might perform differently on the same set of data, depending on their architecture and training datasets. This variance calls for a more in-depth study of the factors that contribute to the performance differences and how they can be optimized.

9/5/2024