Automatic detection and attribution of quotes
Automatická identifikace citátů
Master's thesis (DEFENDED)

Permanent link
http://hdl.handle.net/20.500.11956/181574
Identifiers
SIS: 245126
Collection
- Theses [11326]
Author
Thesis supervisor
Thesis opponent
Vidová Hladká, Barbora
Faculty / unit
Faculty of Mathematics and Physics
Field of study
Computer Science - Language Technologies and Computational Linguistics
Department / institute / clinic
Institute of Formal and Applied Linguistics
Date of defense
6. 6. 2023
Publisher
Charles University, Faculty of Mathematics and Physics
Language
English
Grade
Excellent
Keywords (Czech)
NLP
Keywords (English)
NLP|quotation extraction|quotation attribution|CRFs|article|annotation

Quotation extraction and attribution are important practical tasks for the media, but most published solutions are monolingual. In this work, I present a complex machine learning-based system for the extraction and attribution of direct and indirect quotations, which is trained on English data and tested on Czech and Russian data. The Czech and Russian test datasets were manually annotated as part of this study. The system is compared against a rule-based baseline model. The baseline model demonstrates better precision in extracting quotation elements, but low recall. The machine learning-based model is better overall at extracting both separate elements of quotations and full quotations.