Clustering Facebook for biased context extraction

Milani, Alfredo
2017-01-01

Abstract

Facebook comments and shared posts often convey human biases, which play a pivotal role in information spreading and content consumption, where short pieces of information can be quickly consumed and later ruminated on. Such bias nevertheless lies at the basis of human-generated content, and being able to extract contexts that do not amplify but represent such a bias can be relevant to data mining and artificial intelligence, because it is what shapes the opinion of users through social media. Starting from the observation that a separation into topic clusters, i.e. sub-contexts, spontaneously occurs when evaluated by human common sense, especially in particular domains, e.g. politics and technology, this work introduces a process for automated context extraction by means of a class of path-based semantic similarity measures which, using third-party knowledge, e.g. WordNet or Wikipedia, can create a bag of words relating to relevant concepts present in Facebook comments on topic-related posts, thus reflecting the collective knowledge of a community of users. It is thus easy to create human-readable views, e.g. word clouds, or structured information readable by machines for further learning or content explanation, e.g. by augmenting information with timestamps of posts and comments. Experimental evidence, obtained in the domain of information security and technology over a sample of 9M3k page users, where previous comments serve as a use case for forthcoming users, shows that a simple clustering on a frequency-based bag of words can identify the main context words contained in Facebook comments that are identifiable by human common sense. Group similarity measures, which are of great interest for many application domains since they can be used to evaluate the similarity of objects in terms of the similarity of their associated sets, can then be calculated on the extracted context words to reflect the collective notion of semantic similarity, providing additional insights on which to reason, e.g. in terms of cognitive factors and behavioral patterns.
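
As a rough illustration of the pipeline the abstract outlines, the sketch below builds a frequency-based bag of words from comments, scores word pairs with a path-based similarity, and aggregates those scores into a group similarity between two word sets. It is not the paper's implementation. The stop-word list, the toy taxonomy standing in for WordNet, the function names, and the best-match averaging used for group similarity are all assumptions made for this example. The pairwise score uses the common 1 / (1 + shortest path length) form.

```python
import re
from collections import Counter, deque

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "is", "my", "to", "and", "of", "was", "for", "on", "i", "this"}

# Toy is-a taxonomy (child -> parent), a stand-in for WordNet (assumption).
TAXONOMY = {
    "password": "credential",
    "token": "credential",
    "credential": "security",
    "malware": "threat",
    "phishing": "threat",
    "threat": "security",
}


def bag_of_words(comments, top_n=5):
    """Return the top_n most frequent non-stop-word terms across all comments."""
    counts = Counter()
    for text in comments:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


def path_length(a, b):
    """Shortest number of edges between a and b in the taxonomy, or None."""
    neighbours = {}
    for child, parent in TAXONOMY.items():
        neighbours.setdefault(child, set()).add(parent)
        neighbours.setdefault(parent, set()).add(child)
    if a not in neighbours or b not in neighbours:
        return None
    queue, seen = deque([(a, 0)]), {a}
    while queue:  # breadth-first search over the undirected taxonomy graph
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


def path_similarity(a, b):
    """Path-based similarity 1 / (1 + path length); 0.0 if no path exists."""
    dist = path_length(a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def group_similarity(words_a, words_b):
    """Average best-match path similarity, taken in both directions."""
    if not words_a or not words_b:
        return 0.0
    best_a = [max(path_similarity(a, b) for b in words_b) for a in words_a]
    best_b = [max(path_similarity(a, b) for a in words_a) for b in words_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


comments = [
    "Change your password now",
    "Password leaked by phishing",
    "Phishing email stole my password",
]
context_words = bag_of_words(comments, top_n=2)
```

With a real lexical resource, `path_length` would be replaced by a lookup in that resource. The toy taxonomy only keeps the example self-contained.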
2017
9783319623917
Artificial intelligence
Collective knowledge
Data mining
Knowledge discovery
Semantic distance
Word similarity
Theoretical Computer Science
Computer Science (all)

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14085/42988