PMING distance: A collaborative semantic proximity measure

MILANI, Alfredo
2012-01-01

Abstract

A central problem in the classic approach to semantics is the difficulty of acquiring and maintaining ontologies and semantic annotations. Meanwhile, the flow of data and documents accessible from the Web is continuously fueled by millions of users who interact digitally and collaboratively. Search engines, which continually explore the Web, are therefore a natural source of information on which to base a modern approach to semantic annotation. A promising idea is to generalize semantic similarity, under the assumption that semantically similar terms behave similarly, and to define collaborative proximity measures based on the indexing information returned by search engines. This work introduces PMING, a new collaborative proximity measure based on search engines, as a basis for extracting semantic content. PMING is defined by combining the best features of the other state-of-the-art proximity distances considered. It defines the degree of relatedness between terms using only the number of documents returned as the result of a query, so the measure dynamically reflects the collaborative changes made to Web resources. Experiments on popular collaborative and generalist engines (e.g. Flickr, YouTube, Google, Bing, Yahoo Search) show that PMING outperforms state-of-the-art proximity measures (e.g. Normalized Google Distance, Flickr Distance) in modeling contexts, modeling human perception, and clustering semantic associations.
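To make the idea of a proximity measure built only from result counts concrete, the sketch below computes two widely known count-based measures: the Normalized Google Distance named in the abstract, and pointwise mutual information. It is not the PMING definition, which this record does not give. The function names, parameter names, and the hit counts in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from raw hit counts.

    fx, fy : number of documents matching each term alone
    fxy    : number of documents matching both terms
    n      : assumed total number of indexed documents
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # terms never co-occur: treat as maximally distant
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def pmi(fx, fy, fxy, n):
    """Pointwise mutual information (base 2) from the same hit counts."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return -math.inf  # no co-occurrence evidence at all
    return math.log2((fxy * n) / (fx * fy))


# Hypothetical counts, as if returned by a search engine for two terms.
n_docs = 10**9
print(ngd(2_000_000, 1_500_000, 400_000, n_docs))
print(pmi(2_000_000, 1_500_000, 400_000, n_docs))
```

Because both functions take only counts, re-running the same queries later gives updated values as the indexed content changes, which is the dynamic behavior the abstract describes.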
2012
data mining
information extraction
semantic similarity measure
groupware
ontologies (artificial intelligence)
pattern clustering
search engines
PMING distance
Web
collaborative change
collaborative proximity measure
collaborative semantic proximity measure
context modeling
human perception modeling
indexing information
information source
ontologies
proximity distance
relatedness degree
semantic annotation
semantic association clustering
semantic content extraction
semantic similarity
semantics approach

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14085/43063

Citations
  • Scopus 42