Efficient Graph-Based Author Disambiguation by Topological Similarity in DBLP

Milani A.
2018-01-01

Abstract

In this work, we introduce a novel method for author disambiguation, an entity resolution task, in bibliographic networks. The method is based on a two-step network traversal that uses topological similarity measures to rate candidate nodes. Topological similarity is widely used in the link prediction domain to assess the likelihood of an unknown link. A similarity function can be a good approximation of equality and can therefore be used for disambiguation, based on the hypothesis that authors with many common co-authors are similar. The method was tested on a graph-based representation of the public DBLP Computer Science database. The results are extremely encouraging with regard to Precision, Accuracy, and Specificity. Further advantages are the locality of the method, which assesses disambiguation without requiring knowledge of the global network, and its reliance on only a small amount of data, e.g. author name and paper title (i.e., co-authorship data).
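
The idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which similarity measure, threshold, or data model is used, so the Jaccard coefficient over co-author sets, the function names, and the `name_of` mapping below are all assumptions made for illustration. Starting from one author node, the sketch walks two steps (author, co-author, co-author's co-author), keeps the reached nodes that carry the same name, and ranks them by how many co-authors they share with the starting node. Only the local neighborhood is visited, never the whole graph.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_graph(papers):
    """Build an undirected co-authorship graph.

    `papers` is an iterable of author-id lists, one list per paper
    (hypothetical input format, not the DBLP schema).
    """
    graph = defaultdict(set)
    for authors in papers:
        unique = set(authors)
        for a in unique:
            graph[a]  # make sure single-author papers still create a node
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def jaccard(graph, u, v):
    """Jaccard similarity of the co-author sets of u and v.

    Assumed measure: the abstract only says "topological similarity".
    """
    nu, nv = graph[u] - {v}, graph[v] - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def two_step_candidates(graph, node, name_of):
    """Same-name nodes reachable from `node` through one shared co-author."""
    reached = set()
    for coauthor in graph[node]:      # step 1: node -> co-author
        reached |= graph[coauthor]    # step 2: co-author -> their co-authors
    reached.discard(node)
    return {c for c in reached if name_of[c] == name_of[node]}


def rank_candidates(graph, node, name_of):
    """Rate each candidate by similarity, best first (ties broken by id)."""
    scored = [(c, jaccard(graph, node, c))
              for c in two_step_candidates(graph, node, name_of)]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Toy data: three nodes share one name; ids are invented for the example.
    papers = [
        ["milani#1", "rossi", "bianchi"],
        ["milani#2", "rossi", "bianchi"],
        ["milani#3", "verdi"],
        ["verdi", "rossi"],
    ]
    name_of = {"milani#1": "A. Milani", "milani#2": "A. Milani",
               "milani#3": "A. Milani", "rossi": "Rossi",
               "bianchi": "Bianchi", "verdi": "Verdi"}
    graph = build_coauthor_graph(papers)
    print(rank_candidates(graph, "milani#1", name_of))
```

In the toy data, `milani#2` shares both co-authors with `milani#1` and is therefore rated as a likely duplicate, while `milani#3` lies three steps away and is never visited. A real pipeline would still need a decision threshold on the score, which this sketch deliberately leaves out.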
2018
978-1-5386-9555-5
Databases
Social network services
Data integrity
Computer science
Bibliometrics
Semantics
Task analysis
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14085/42981