Genealogical trees on the web : a search engine user perspective.

dc.contributor.author: Yates, Ricardo Baeza
dc.contributor.author: Pereira Junior, Álvaro Rodrigues
dc.contributor.author: Ziviani, Nivio
dc.date.accessioned: 2012-10-18T19:01:55Z
dc.date.available: 2012-10-18T19:01:55Z
dc.date.issued: 2008
dc.description.abstract: This paper presents an extensive study about the evolution of textual content on the Web, which shows how some new pages are created from scratch while others are created using already existing content. We show that a significant fraction of the Web is a byproduct of the latter case. We introduce the concept of Web genealogical tree, in which every page in a Web snapshot is classified into a component. We study in detail these components, characterizing the copies and identifying the relation between a source of content and a search engine, by comparing page relevance measures, documents returned by real queries performed in the past, and click-through data. We observe that sources of copies are more frequently returned by queries and more clicked than other documents. [pt_BR]
dc.identifier.citation: YATES, R. B.; PEREIRA JÚNIOR, A. R.; ZIVIANI, N. Genealogical trees on the web : a search engine user perspective. In: 17th International World Wide Web Conference, 17., 2008, Beijing. Proceedings... Beijing: International World Wide Web Conference, 2008. Available at: <http://homepages.dcc.ufmg.br/~nivio/papers/www08.pdf>. Accessed: 18 Oct. 2012. [pt_BR]
dc.identifier.uri: http://www.repositorio.ufop.br/handle/123456789/1676
dc.language.iso: en_US [pt_BR]
dc.subject: Web [pt_BR]
dc.subject: Text [pt_BR]
dc.subject: Content evolution [pt_BR]
dc.subject: Search engine [pt_BR]
dc.subject: Web mining [pt_BR]
dc.title: Genealogical trees on the web : a search engine user perspective. [pt_BR]
dc.type: Conference paper [pt_BR]
Files
Original bundle
Name: EVENTO_GenealogicalTreesWeb.pdf
Size: 567.71 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: