Mapping the Prague Dependency Treebank Annotation Scheme onto Robust Minimal Recursion Semantics
diploma thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/30663
Identifiers
Study Information System: 62613
Collections
- Kvalifikační práce [11266]
Author
Advisor
Referee
Štěpánek, Jan
Faculty / Institute
Faculty of Mathematics and Physics
Discipline
Computational Linguistics
Department
Institute of Formal and Applied Linguistics
Date of defense
1. 2. 2010
Publisher
Univerzita Karlova, Matematicko-fyzikální fakulta
Language
English
Grade
Excellent
This thesis investigates the correspondence between two semantic formalisms: the tectogrammatical layer of the Prague Dependency Treebank 2.0 (PDT) and Robust Minimal Recursion Semantics (RMRS). It is a first attempt to relate the dependency-based annotation scheme of PDT to a compositional semantics approach like RMRS. An iterative mapping algorithm is developed that converts PDT trees into RMRS structures by associating an RMRS with each node in the dependency tree. To this end, composition rules are formulated and the complex relation between dependency in PDT and semantic heads in RMRS is analyzed in detail. It turns out that structure and dependencies, morphological categories, and some coreferences can be preserved in the target structures. Furthermore, valency and free modifications are distinguished using the valency dictionary of PDT as an additional resource. The evaluation result of 81% recall shows that systematically correct underspecified target structures can be obtained by a rule-based mapping approach, which indicates that RMRS is capable of representing Czech data. This finding is novel, as Czech, with its free word order and rich morphology, is typologically different from the languages to which RMRS has been applied thus far.