Multi Class Depression Detection Through Tweets using Artificial Intelligence

2404.13104

Published 4/23/2024 by Muhammad Osama Nusrat, Waseem Shahzad, Saad Ahmed Jamal

Abstract

Depression is a significant issue nowadays. As per the World Health Organization (WHO), in 2023, over 280 million individuals are grappling with depression. This is a huge number; if not taken seriously, these numbers will increase rapidly. About 4.89 billion individuals are social media users. People express their feelings and emotions on platforms like Twitter, Facebook, Reddit, Instagram, etc. These platforms contain valuable information which can be used for research purposes. Considerable research has been conducted across various social media platforms. However, certain limitations persist in these endeavors. Particularly, previous studies were only focused on detecting depression and the intensity of depression in tweets. Also, there existed inaccuracies in dataset labeling. In this research work, five types of depression (bipolar, major, psychotic, atypical, and postpartum) were predicted using tweets from the Twitter database based on lexicon labeling. Explainable AI was used to provide reasoning by highlighting the parts of tweets that represent the type of depression. Bidirectional Encoder Representations from Transformers (BERT) was used for feature extraction and training. Machine learning and deep learning methodologies were used to train the model. The BERT model presented the most promising results, achieving an overall accuracy of 0.96.

Overview

This paper explores the use of artificial intelligence and natural language processing techniques to detect and classify different types of depression in tweets.
The researchers developed a multi-class depression detection model that assigns tweets to one of five depression types (bipolar, major, psychotic, atypical, and postpartum), with BERT used for feature extraction and training.
According to the abstract, the tweets were labeled with a lexicon-based approach, explainable AI was used to highlight the parts of each tweet that indicate the predicted type, and the BERT model reached an overall accuracy of 0.96.

Plain English Explanation

The researchers in this study wanted to see if they could use artificial intelligence (AI) and natural language processing (NLP) to detect different types of depression in people's tweets (short social media posts). They gathered tweets from Twitter, labeled them using a lexicon (in general, a list of words associated with each category), and used this data to train machine learning models.

The model they developed looks at the language in a tweet and classifies it as reflecting one of five types of depression: bipolar, major, psychotic, atypical, or postpartum. It also highlights the parts of the tweet that point to that type. This could be useful for mental health monitoring and support, since it indicates which kind of depression a post may reflect rather than only whether depression is present.

Technical Explanation

The researchers first collected a dataset of tweets and labeled each one with one of five depression types (bipolar, major, psychotic, atypical, or postpartum) using lexicon-based labeling. They then preprocessed the tweets to remove irrelevant information and used BERT to extract features from the text.
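As a rough illustration of tweet preprocessing and of the lexicon-based labeling mentioned in the abstract, the sketch below cleans a tweet and assigns it the depression type whose keywords it matches most often. The keyword sets, the cleaning rules, and the function names `clean_tweet` and `lexicon_label` are all assumptions made for this example; the paper's actual lexicon and preprocessing steps are not given in this summary, and the keywords here have no clinical validity.

```python
import re

# Hypothetical keyword lexicon, invented for illustration only.
# The paper's real lexicon is not reproduced in this summary.
LEXICON = {
    "bipolar": {"bipolar", "manic", "mania"},
    "major": {"hopeless", "worthless", "mdd"},
    "psychotic": {"hallucination", "hallucinations", "delusion", "delusions"},
    "atypical": {"oversleeping", "overeating"},
    "postpartum": {"postpartum", "newborn"},
}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, mentions, '#' signs, and non-letters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")        # keep the hashtag word, drop the symbol
    text = NON_LETTER_RE.sub(" ", text)  # drop digits and punctuation
    return " ".join(text.split())        # collapse repeated whitespace


def lexicon_label(text):
    """Return the depression type with the most keyword hits, or None if no hits.

    Ties are resolved in favor of the type listed first in LEXICON.
    """
    tokens = set(clean_tweet(text).split())
    scores = {label: len(tokens & words) for label, words in LEXICON.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    tweet = "Feeling HOPELESS today... @friend https://t.co/abc #mdd"
    print(clean_tweet(tweet))
    print(lexicon_label(tweet))
```

Keyword matching of this kind is cheap but noisy, which is one reason a learned model such as BERT is then trained on the labeled tweets instead of using the lexicon directly at prediction time.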

Next, the researchers trained multi-class classification models to predict the depression type from the extracted tweet features. They experimented with both machine learning and deep learning methods, including logistic regression, support vector machines, and deep neural networks, and evaluated the models' performance using metrics like accuracy, precision, recall, and F1-score.
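For readers unfamiliar with how these metrics extend to more than two classes, the following is a minimal sketch that computes accuracy, per-class precision, recall, and F1, and the macro-averaged F1 from true and predicted labels. The function name `multiclass_report` and the example labels are made up for illustration; this is not the paper's evaluation code.

```python
from collections import Counter


def multiclass_report(y_true, y_pred):
    """Compute accuracy, macro-F1, and per-class precision/recall/F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    per_class = {}
    for label in labels:
        predicted_n = tp[label] + fp[label]
        actual_n = tp[label] + fn[label]
        precision = tp[label] / predicted_n if predicted_n else 0.0
        recall = tp[label] / actual_n if actual_n else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true)
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return {"accuracy": accuracy, "macro_f1": macro_f1, "per_class": per_class}


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    y_true = ["bipolar", "bipolar", "major", "postpartum"]
    y_pred = ["bipolar", "major", "major", "postpartum"]
    report = multiclass_report(y_true, y_pred)
    print(report["accuracy"], round(report["macro_f1"], 4))
```

Macro averaging weights every class equally, so it can differ noticeably from accuracy when some depression types are much rarer than others in the dataset.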

According to the abstract, the best-performing model was BERT, which achieved an overall accuracy of 0.96 across the five depression types. The researchers also used explainable AI to highlight the parts of each tweet that indicate the predicted type of depression, giving a rationale for each prediction.
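One simple way to see which words drive a prediction is leave-one-out (occlusion) importance: remove each token in turn and measure how much the model's score for a class drops. The sketch below illustrates that general idea only. This summary does not say which explainability method the authors used, so occlusion is a swapped-in technique, and `toy_postpartum_score` is a hypothetical stand-in for a real classifier's class probability.

```python
def occlusion_importance(tokens, score_fn):
    """Return (token, importance) pairs, where importance is the drop in
    score_fn when that single token is removed from the input."""
    base = score_fn(tokens)
    return [
        (tok, base - score_fn(tokens[:i] + tokens[i + 1:]))
        for i, tok in enumerate(tokens)
    ]


def toy_postpartum_score(tokens):
    """Hypothetical scorer standing in for a trained model's probability
    of the 'postpartum' class. The cue words are invented for this example."""
    cues = {"postpartum", "newborn", "baby"}
    hits = sum(tok in cues for tok in tokens)
    return hits / (hits + 1)


if __name__ == "__main__":
    tokens = "since the baby arrived i feel empty".split()
    for tok, importance in occlusion_importance(tokens, toy_postpartum_score):
        print(f"{tok:>8}  {importance:.2f}")
```

With a real model, `score_fn` would wrap the classifier's predicted probability for the class of interest, and the tokens with the largest drops would be the ones highlighted in the tweet.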

Critical Analysis

The researchers acknowledge several limitations of their study, including the potential for bias in the lexicon-based depression labels and the reliance on a single social media platform (Twitter). Additionally, the dataset was relatively small, which may limit the generalizability of the results.

Further research would be needed to validate the performance of this approach on larger and more diverse datasets, as well as to investigate the ethical implications of using social media data for mental health monitoring without user consent. There are also concerns about the potential for misuse of such technology for surveillance or discrimination purposes.

Overall, while this study demonstrates the potential of AI and NLP for mental health assessment, it also highlights the need for careful consideration of the social and ethical implications of such technologies.

Conclusion

This study presents an approach to detecting and classifying different types of depression in tweets using artificial intelligence and natural language processing. The researchers developed a multi-class depression detection model that assigns tweets to one of five types (bipolar, major, psychotic, atypical, or postpartum), with the BERT model reaching an overall accuracy of 0.96 according to the abstract.

The findings suggest that social media data, when analyzed with advanced AI techniques, could be a valuable tool for mental health monitoring and support. However, the researchers also acknowledge the limitations of their approach and the need for further research to address the ethical and privacy concerns surrounding the use of such technology.

Overall, this work represents an important step towards leveraging the wealth of data available on social media platforms to better understand and support mental health at a population level.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Diverse Perspectives, Divergent Models: Cross-Cultural Evaluation of Depression Detection on Twitter

Nuredin Ali, Charles Chuankai Zhang, Ned Mayo, Stevie Chancellor

Social media data has been used for detecting users with mental disorders, such as depression. Despite the global significance of cross-cultural representation and its potential impact on model performance, publicly available datasets often lack crucial metadata related to this aspect. In this work, we evaluate the generalization of benchmark datasets to build AI models on cross-cultural Twitter data. We gather a custom geo-located Twitter dataset of depressed users from seven countries as a test dataset. Our results show that depression detection models do not generalize globally. The models perform worse on Global South users compared to Global North. Pre-trained language models achieve the best generalization compared to Logistic Regression, though still show significant gaps in performance on depressed and non-Western users. We quantify our findings and provide several actionable suggestions to mitigate this issue.

6/26/2024

cs.CL

Exploring Social Media Posts for Depression Identification: A Study on Reddit Dataset

Nandigramam Sai Harshit, Nilesh Kumar Sahu, Haroon R. Lone

Depression is one of the most common mental disorders affecting an individual's personal and professional life. In this work, we investigated the possibility of utilizing social media posts to identify depression in individuals. To achieve this goal, we conducted a preliminary study where we extracted and analyzed the top Reddit posts made in 2022 from depression-related forums. The collected data were labeled as depressive and non-depressive using UMLS Metathesaurus. Further, the pre-processed data were fed to classical machine learning models, where we achieved an accuracy of 92.28% in predicting the depressive and non-depressive posts.

5/14/2024

cs.CL cs.SI

Assessing ML Classification Algorithms and NLP Techniques for Depression Detection: An Experimental Case Study

Giuliano Lorenzoni, Cristina Tavares, Nathalia Nascimento, Paulo Alencar, Donald Cowan

Depression has affected millions of people worldwide and has become one of the most common mental disorders. Early mental disorder detection can reduce costs for public health agencies and prevent other major comorbidities. Additionally, the shortage of specialized personnel is very concerning since depression diagnosis is highly dependent on expert professionals and is time-consuming. Recent research has evidenced that machine learning (ML) and Natural Language Processing (NLP) tools and techniques have significantly benefited the diagnosis of depression. However, there are still several challenges in the assessment of depression detection approaches in which other conditions such as post-traumatic stress disorder (PTSD) are present. These challenges include assessing alternatives in terms of data cleaning and pre-processing techniques, feature selection, and appropriate ML classification algorithms. This paper tackles such an assessment based on a case study that compares different ML classifiers, specifically in terms of data cleaning and pre-processing, feature selection, parameter setting, and model choices. The case study is based on the Distress Analysis Interview Corpus - Wizard-of-Oz (DAIC-WOZ) dataset, which is designed to support the diagnosis of mental disorders such as depression, anxiety, and PTSD. Besides the assessment of alternative techniques, we were able to build models with accuracy levels around 84% with Random Forest and XGBoost models, which is significantly higher than the results from the comparable literature which presented the level of accuracy of 72% from the SVM model.

4/9/2024

cs.CL

EmoScan: Automatic Screening of Depression Symptoms in Romanized Sinhala Tweets

Jayathi Hewapathirana, Deshan Sumanathilaka

This work explores the utilization of Romanized Sinhala social media data to identify individuals at risk of depression. A machine learning-based framework is presented for the automatic screening of depression symptoms by analyzing language patterns, sentiment, and behavioural cues within a comprehensive dataset of social media posts. The research has been carried out to compare the suitability of Neural Networks over the classical machine learning techniques. The proposed Neural Network with an attention layer which is capable of handling long sequence data, attains a remarkable accuracy of 93.25% in detecting depression symptoms, surpassing current state-of-the-art methods. These findings underscore the efficacy of this approach in pinpointing individuals in need of proactive interventions and support. Mental health professionals, policymakers, and social media companies can gain valuable insights through the proposed model. Leveraging natural language processing techniques and machine learning algorithms, this work offers a promising pathway for mental health screening in the digital era. By harnessing the potential of social media data, the framework introduces a proactive method for recognizing and assisting individuals at risk of depression. In conclusion, this research contributes to the advancement of proactive interventions and support systems for mental health, thereby influencing both research and practical applications in the field.

4/1/2024

cs.CL cs.CY cs.LG