Self-training author name disambiguation for information scarce scenarios.

dc.contributor.author: Ferreira, Anderson Almeida
dc.contributor.author: Veloso, Adriano Alonso
dc.contributor.author: Gonçalves, Marcos André
dc.contributor.author: Laender, Alberto Henrique Frade
dc.date.accessioned: 2017-02-21T16:36:15Z
dc.date.available: 2017-02-21T16:36:15Z
dc.date.issued: 2014
dc.description.abstract: We present SAND (self-training associative name disambiguator), a novel three-step self-training method for author name disambiguation that requires no manual labeling and no parameterization (in real-world scenarios), and that is particularly suitable for the common situation in which only the most basic information about a citation record is available (i.e., author names, and work and venue titles). In the first step, real-world heuristics on coauthors produce highly pure (although fragmented) clusters. In the second step, the most representative of these clusters are selected to serve as training data for the third step, supervised author assignment. The third step exploits a state-of-the-art transductive disambiguation method capable of detecting unseen authors not included in any training example and of incorporating reliable predictions into the training data. Experiments conducted with standard public collections, using the minimum set of attributes present in a citation, demonstrate that our proposed method outperforms all representative unsupervised author grouping disambiguation methods and is very competitive with fully supervised author assignment methods. Thus, unlike other bootstrapping methods that exploit privileged, hard-to-obtain information such as self-citations and personal information, our proposed method produces top-notch performance with no (manual) training data or parameterization, even when information is scarce. [pt_BR]
dc.identifier.citation: FERREIRA, A. A. et al. Self-training author name disambiguation for information scarce scenarios. Journal of the Association for Information Science and Technology, v. 65, n. 6, p. 1257-1278, Jun. 2014. Available at: <http://onlinelibrary.wiley.com/doi/10.1002/asi.22992/epdf>. Accessed: 17 Feb. 2017. [pt_BR]
dc.identifier.doi: https://doi.org/10.1002/asi.22992
dc.identifier.issn: 2330-1643
dc.identifier.uri: http://www.repositorio.ufop.br/handle/123456789/7291
dc.identifier.uri2: https://asistdl.onlinelibrary.wiley.com/doi/full/10.1002/asi.22992 [pt_BR]
dc.language.iso: en_US [pt_BR]
dc.rights: restricted [pt_BR]
dc.title: Self-training author name disambiguation for information scarce scenarios. [pt_BR]
dc.type: Article published in a journal [pt_BR]
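
The first two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the coauthor heuristics or the selection criterion, so the sketch assumes that two citation records belong together when they share at least one coauthor name, and that a cluster is "representative" when it has at least `min_size` records. The function names, the `min_size` parameter, and the sample records are all hypothetical. The third step (transductive author assignment) is not sketched.

```python
from collections import defaultdict


def cluster_by_shared_coauthor(records):
    """Step 1 (assumed heuristic): group records that share a coauthor name.

    Returns clusters as lists of record indices, largest first.
    """
    parent = list(range(len(records)))  # union-find forest over record indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # coauthor name -> index of the first record containing it
    for idx, rec in enumerate(records):
        for name in rec["coauthors"]:
            if name in first_seen:
                # Merge this record's cluster with the earlier record's cluster.
                parent[find(idx)] = find(first_seen[name])
            else:
                first_seen[name] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    # Sort by size (descending), then by smallest member, for a stable order.
    return sorted(clusters.values(), key=lambda c: (-len(c), c[0]))


def select_training_clusters(clusters, min_size=2):
    """Step 2 (assumed criterion): keep clusters with at least min_size records."""
    return [c for c in clusters if len(c) >= min_size]


# Hypothetical citation records for one ambiguous author name.
records = [
    {"title": "Paper A", "coauthors": {"B. Silva", "C. Lima"}},
    {"title": "Paper B", "coauthors": {"C. Lima"}},
    {"title": "Paper C", "coauthors": {"D. Rocha"}},
    {"title": "Paper D", "coauthors": {"D. Rocha", "E. Souza"}},
    {"title": "Paper E", "coauthors": set()},
]

clusters = cluster_by_shared_coauthor(records)
training = select_training_clusters(clusters)
print(clusters)  # [[0, 1], [2, 3], [4]]
print(training)  # [[0, 1], [2, 3]]
```

In this toy run, the record without coauthors stays in its own singleton cluster and is left out of the training data, which mirrors the "pure but fragmented" outcome the abstract attributes to the first step.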

Files

Original bundle

Name: ARTIGO_SelfTrainingAuthor.pdf
Size: 349.52 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 924 B
Format: Item-specific license agreed upon to submission