Entity retrieval on Wikipedia in the scope of the gikiCLEF track
Master's thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/23299
Identifiers
SIS: 62987
Collections
- Qualification theses [11264]
Thesis supervisor
Opponent
Žabokrtský, Zdeněk
Faculty / unit
Faculty of Mathematics and Physics
Field of study
Mathematical Linguistics
Department / institute / clinic
Institute of Formal and Applied Linguistics
Date of defense
14 September 2009
Publisher
Charles University, Faculty of Mathematics and Physics
Language
English
Grade
Very good
This thesis presents a system that retrieves entities specified by a question or description given in natural language. The description indicates the entity type and the properties that the entities must satisfy. The task is analogous to the one proposed in the GikiCLEF 2009 track. The system is fed with the 2008 Spanish Wikipedia collection, and every entity is represented by a Wikipedia page. We propose three novel methods for query expansion in entity retrieval. We also introduce a novel method that uses the English Yago and DBpedia semantic resources to determine the target named entity (NE) type; this method improves on previous approaches in which the target NE type is based solely on Wikipedia categories. We show that our system obtains promising results when evaluated on the GikiCLEF 2009 topic list and compared with the other participants of the track.