Identifikace složených gramatických tvarů

Krippnerová, Lenka

Identification of periphrastic grammatical forms

dc.contributor.advisor	Zeman, Daniel
dc.creator	Krippnerová, Lenka
dc.date.accessioned	2024-11-29T05:42:09Z
dc.date.available	2024-11-29T05:42:09Z
dc.date.issued	2024
dc.identifier.uri	http://hdl.handle.net/20.500.11956/192873
dc.description.abstract	Cílem této práce je identifikovat a vhodně označit složené gramatické tvary v datech z projektu Universal Dependencies. Data z projektu Universal Dependencies jsou anotována na morfologické i syntaktické rovině, nicméně tyto anotace se vztahují pouze k jednotlivým slovům, složené tvary tedy nelze snadno vyhledat. Výstupem této práce je program, který na základě nastudovaných gramatických pravidel slovanských jazyků tyto složené gramatické tvary v datech slovanských jazyků objeví a označí. Dále program, jenž načte pravidla z konfiguračního souboru a na základě těchto pravidel v datech označí složené tvary. Konfigurační soubor je navržen ve formátu YAML, lze ho snadno upravovat v klasickém textovém editoru a je dostatečně jednoduchý na to, aby s ním dokázal bez větších obtíží pracovat i uživatel bez zkušeností s programováním.	cs_CZ
dc.description.abstract	The goal of this work is to identify and appropriately mark periphrastic grammatical forms in data from the Universal Dependencies project. The data are annotated at the morphological and syntactic levels, but these annotations relate only to individual words, so periphrastic forms cannot easily be searched for. The first output of this work is a program that, based on the studied grammatical rules of Slavic languages, finds and marks periphrastic grammatical forms in Slavic-language data. The second is a program that reads rules from a configuration file and marks periphrastic forms in the data according to those rules. The configuration file is in YAML format, can be easily edited in an ordinary text editor, and is simple enough that even a user without programming experience can work with it without much difficulty.	en_US
dc.language	Čeština	cs_CZ
dc.language.iso	cs_CZ
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.subject	annotated corpus|tagging|morphology|syntax|natural language processing|universal dependencies	en_US
dc.subject	anotovaný korpus|značkování|morfologie|syntax|zpracování přirozeného jazyka|universal dependencies	cs_CZ
dc.title	Identifikace složených gramatických tvarů	cs_CZ
dc.type	bakalářská práce	cs_CZ
dcterms.created	2024
dcterms.dateAccepted	2024-09-05
dc.description.department	Institute of Formal and Applied Linguistics	en_US
dc.description.department	Ústav formální a aplikované lingvistiky	cs_CZ
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.identifier.repId	269893
dc.title.translated	Identification of periphrastic grammatical forms	en_US
dc.contributor.referee	Lopatková, Markéta
thesis.degree.name	Bc.
thesis.degree.level	bakalářské	cs_CZ
thesis.degree.discipline	Computer Science with specialisation in Artificial Intelligence	en_US
thesis.degree.discipline	Informatika se specializací Umělá inteligence	cs_CZ
thesis.degree.program	Computer Science	en_US
thesis.degree.program	Informatika	cs_CZ
uk.thesis.type	bakalářská práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Informatika se specializací Umělá inteligence	cs_CZ
uk.degree-discipline.en	Computer Science with specialisation in Artificial Intelligence	en_US
uk.degree-program.cs	Informatika	cs_CZ
uk.degree-program.en	Computer Science	en_US
thesis.grade.cs	Velmi dobře	cs_CZ
thesis.grade.en	Very good	en_US
uk.abstract.cs	Cílem této práce je identifikovat a vhodně označit složené gramatické tvary v datech z projektu Universal Dependencies. Data z projektu Universal Dependencies jsou anotována na morfologické i syntaktické rovině, nicméně tyto anotace se vztahují pouze k jednotlivým slovům, složené tvary tedy nelze snadno vyhledat. Výstupem této práce je program, který na základě nastudovaných gramatických pravidel slovanských jazyků tyto složené gramatické tvary v datech slovanských jazyků objeví a označí. Dále program, jenž načte pravidla z konfiguračního souboru a na základě těchto pravidel v datech označí složené tvary. Konfigurační soubor je navržen ve formátu YAML, lze ho snadno upravovat v klasickém textovém editoru a je dostatečně jednoduchý na to, aby s ním dokázal bez větších obtíží pracovat i uživatel bez zkušeností s programováním.	cs_CZ
uk.abstract.en	The goal of this work is to identify and appropriately mark periphrastic grammatical forms in data from the Universal Dependencies project. The data are annotated at the morphological and syntactic levels, but these annotations relate only to individual words, so periphrastic forms cannot easily be searched for. The first output of this work is a program that, based on the studied grammatical rules of Slavic languages, finds and marks periphrastic grammatical forms in Slavic-language data. The second is a program that reads rules from a configuration file and marks periphrastic forms in the data according to those rules. The configuration file is in YAML format, can be easily edited in an ordinary text editor, and is simple enough that even a user without programming experience can work with it without much difficulty.	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky	cs_CZ
thesis.grade.code	2
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O