Abstract

Digital social networks have become an essential source of information because celebrities use them to share their opinions, ideas, thoughts, and feelings. This makes digital social networks one of the preferred means for celebrities to promote themselves and attract new followers. This paper proposes a feature selection model for classifying celebrity profiles based on their use of the digital social network Twitter. The model analyzes lexical, syntactic, symbolic, participation, and complementary information features of celebrities' posts and uses them to estimate the celebrities' demographic and influence characteristics. Classification with these new features achieves an F1-score of 0.65 for Fame, 0.88 for Gender, 0.37 for Birth year, and 0.57 for Occupation. With these new features, average accuracy improves by up to 0.14. As a result, features extracted from linguistic cues improved the performance of the predictive models of Fame and Gender and facilitated explanation of the model results. In particular, use of the third person singular was highly predictive in the Fame model.
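To make the feature categories above more concrete, the following is a minimal sketch of how a few lexical, symbolic, and participation-style features might be computed from a celebrity's posts. It is an illustration under stated assumptions, not the authors' implementation: the function name `extract_features`, the pronoun list `THIRD_PERSON_SINGULAR`, the regular expressions, and the specific features chosen are all hypothetical, since the abstract does not specify them.

```python
import re

# Hypothetical pronoun list; the paper's actual lexicon is not given in the abstract.
THIRD_PERSON_SINGULAR = {
    "he", "she", "it", "him", "her", "his", "hers", "its",
    "himself", "herself", "itself",
}

# Simple illustrative patterns (assumptions, not the authors' tokenizer).
TOKEN_RE = re.compile(r"[A-Za-z']+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+")


def extract_features(posts):
    """Return a dict of illustrative per-profile features from a list of post strings."""
    n_posts = len(posts)
    tokens = []
    hashtags = mentions = urls = 0

    for post in posts:
        # Count URLs, then remove them so they do not pollute the other counts.
        urls += len(URL_RE.findall(post))
        text = URL_RE.sub(" ", post)

        # Symbolic features: hashtags and mentions.
        hashtags += len(HASHTAG_RE.findall(text))
        mentions += len(MENTION_RE.findall(text))

        # Strip hashtags and mentions before tokenizing the remaining words.
        text = MENTION_RE.sub(" ", HASHTAG_RE.sub(" ", text))
        tokens.extend(t.lower() for t in TOKEN_RE.findall(text))

    n_tokens = len(tokens)
    third = sum(1 for t in tokens if t in THIRD_PERSON_SINGULAR)

    return {
        # Lexical cue: share of word tokens that are third person singular pronouns.
        "third_person_singular_ratio": third / n_tokens if n_tokens else 0.0,
        # Lexical diversity: distinct word tokens over all word tokens.
        "type_token_ratio": len(set(tokens)) / n_tokens if n_tokens else 0.0,
        # Symbolic and participation-style rates, averaged per post.
        "hashtags_per_post": hashtags / n_posts if n_posts else 0.0,
        "mentions_per_post": mentions / n_posts if n_posts else 0.0,
        "urls_per_post": urls / n_posts if n_posts else 0.0,
    }
```

A feature dictionary like this could be computed per profile and passed to any standard classifier. Which features the published model actually retains, and how they are weighted, is described in the paper itself rather than in this sketch.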

Original language: English
Article number: 16
Number of pages: 36
Journal: Computational Social Networks
Volume: 8
Issue number: 1
State: Published - Dec 2021

Keywords

  • Author profile
  • Celebrity profile
  • Demographic features
  • Influential feature
  • Natural Language Processing

Fingerprint

Dive into the research topics of 'Celebrity profiling through linguistic analysis of digital social networks'. Together they form a unique fingerprint.
