TY - JOUR
T1 - Celebrity profiling on Twitter using sociolinguistic features: Notebook for PAN at CLEF 2019
AU - Moreno-Sandoval, Luis Gabriel
AU - Puertas, Edwin
AU - Plaza-Del-Arco, Flor Miriam
AU - Pomares-Quimbaya, Alexandra
AU - Alvarado-Valencia, Jorge Andres
AU - Ureña-López, L. Alfonso
N1 - Publisher Copyright: © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
PY - 2019
Y1 - 2019
N2 - Social networks have been revolutionary for celebrities because they allow them to reach a wider audience, far more frequently, than traditional means. These platforms enable celebrities to improve, or sometimes damage, their careers by building closer relationships with their fans and acquiring new ones. Indeed, social networks have promoted the emergence of a new type of celebrity that exists only in the digital world. Being able to characterize the celebrities who are more active on social networks such as Twitter offers an opportunity to identify their real level of fame and their relevance for a specific age group, gender, or occupation. These insights may enrich decision making, especially in advertising and marketing. To this end, this paper presents a novel strategy for characterizing celebrity profiles on Twitter, based on generating socio-linguistic features from their posts that serve as input to a set of classifiers. Specifically, we produced four classifiers that describe a celebrity's level of fame, gender, birth date, and possible occupation. We obtained the training and test data sets as part of our participation in PAN 2019 at CLEF. Results for each classifier are reported, including an analysis of which features were more relevant, which classification techniques were more useful, and the final precision and recall results.
KW - Author profiling
KW - Celebrity profiling
KW - Computational linguistics
KW - Natural language processing
KW - Socio-linguistic feature
KW - Twitter
KW - User profiling
UR - http://www.scopus.com/inward/record.url?scp=85070517749&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85070517749
SN - 1613-0073
VL - 2380
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 20th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2019
Y2 - 9 September 2019 through 12 September 2019
ER -