Annotating Panic in Social Media using Active Learning, Transformers and Domain Knowledge

Lucifora C.
2023-01-01

Abstract

Researchers widely agree on the importance of mental health. However, the literature on tracking mental disorders in social media text is heavily skewed towards a few specific conditions. In particular, panic disorder and panic attacks are understudied, and the relevant resources are missing. In this work we therefore focus on collecting an annotated dataset. To reduce the annotation effort by selectively annotating unlabeled data, we propose an active-learning-based approach with uncertainty sampling, supported by contextualized (Transformer-based) representations, symptomatic and psychometric features, and domain expertise. Our evaluation demonstrates the efficiency of the proposed approach in terms of both classification accuracy and prediction confidence. Our contribution to the research community is an annotated dataset of 13,036 tweets that distinguishes between personal panicking experiences such as panic attacks, other panic-related content, and completely panic-unrelated content; we hope it will foster research on the topic.
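
As a rough illustration of the uncertainty sampling mentioned in the abstract, the sketch below ranks unlabeled items by the entropy of a classifier's predicted class probabilities and picks the most uncertain ones for annotation. It is not the authors' implementation: the abstract does not say which uncertainty measure is used, so entropy is an assumption here, and the label names, the probability values, and the function names `entropy` and `select_most_uncertain` are all hypothetical.

```python
import math

# Hypothetical label names mirroring the three categories in the abstract.
LABELS = ("personal_panic", "panic_related", "unrelated")


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(predictions, k):
    """Return indices of the k predictions with the highest entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up classifier outputs for four unlabeled tweets (illustrative only).
pool_predictions = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # close to uniform: very uncertain
    [0.10, 0.85, 0.05],  # fairly confident
    [0.50, 0.45, 0.05],  # torn between two classes
]

# Indices of the two tweets that would be sent to annotators next.
to_annotate = select_most_uncertain(pool_predictions, k=2)
print(to_annotate)
```

In an active-learning loop, the selected items would be labeled by annotators, added to the training set, and the classifier retrained before the next selection round.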
Year: 2023
ISBN: 9798350381641
Active Learning
Classification algorithms
Data acquisition
Machine learning
Mental disorders
Natural language processing
Transformers
Uncertainty
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14085/51587