Nativní indexování rozsáhlých XML databází

Bartoš, Tomáš

Native Indexing of Large XML Databases

diploma thesis (DEFENDED)

Permanent link

http://hdl.handle.net/20.500.11956/31128

Identifiers

Study Information System: 71922

Referee

Pokorný, Jaroslav

Faculty / Institute

Faculty of Mathematics and Physics

Discipline

Software Systems

Department

Department of Software Engineering

Date of defense

24. 5. 2010

Publisher

Univerzita Karlova, Matematicko-fyzikální fakulta

Language

English

Grade

Excellent

In this thesis we study indexing methods for large XML databases and their time complexity when evaluating path-search queries. There are many ways to index XML data, but we focus on indexing root-to-leaf paths and clustering them according to similar criteria. Studying and correctly combining existing methods serves as the basis for a new native index, iXUPT, which uses path labeling. We present two indexing variants that differ in how the ancestor-descendant relationship is determined: the first uses a numbering scheme, the second uses the Rho-Index. We demonstrate the correctness of our solution by implementing a prototype and evaluating several XPath queries that represent paths in a graph. Finally, we compare the individual variants and the results achieved with existing solutions.

Abstract (English)

In this work we study indexing methods for large XML databases and their time efficiency when evaluating path queries. There are several ways of indexing XML data, but we focus on indexing root-to-leaf paths and grouping them according to a common criterion, their path labels. We study the existing methods and combine them to create iXUPT, a novel native indexing concept using path templates, which leverages the advantages of current approaches. We provide two variations of our solution, depending on how ancestor-descendant relationships are handled. The first uses the proposed numbering scheme, while the second relies on the Rho-Index structure. Furthermore, we demonstrate the feasibility of our concept with an implemented prototype and by evaluating sample regular path expressions represented by XPath queries. We compare the two variations with each other and with other solutions.
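The two ideas named in the abstract, grouping root-to-leaf paths by their label sequence and deciding ancestor-descendant relationships from node numbers, can be illustrated with a minimal sketch. This record does not describe the thesis's actual numbering scheme, its path templates, or the Rho-Index. The code below therefore assumes a generic interval (start, end) numbering assigned during a depth-first traversal; the names `build_index`, `is_ancestor`, and `SAMPLE` are hypothetical and are not taken from iXUPT.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, used only for illustration.
SAMPLE = (
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/></book>"
    "<journal><title/></journal>"
    "</lib>"
)

def build_index(xml_text):
    """Assign (start, end) intervals to nodes and group leaves by label path.

    Assumption: generic interval numbering, not the scheme proposed in the thesis.
    """
    root = ET.fromstring(xml_text)
    intervals = {}               # element -> (start, end)
    groups = defaultdict(list)   # label path such as "/lib/book/title" -> leaf elements
    counter = 0

    def visit(node, prefix):
        nonlocal counter
        counter += 1
        start = counter                      # number taken on entering the node
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:
            groups[path].append(node)        # a leaf closes one root-to-leaf path
        for child in children:
            visit(child, path)
        counter += 1
        intervals[node] = (start, counter)   # number taken on leaving the node

    visit(root, "")
    return root, intervals, groups

def is_ancestor(intervals, a, d):
    """True if a is a proper ancestor of d, i.e. a's interval strictly contains d's."""
    a_start, a_end = intervals[a]
    d_start, d_end = intervals[d]
    return a_start < d_start and d_end < a_end

root, intervals, groups = build_index(SAMPLE)
for label_path in sorted(groups):
    print(label_path, len(groups[label_path]))
```

With such a numbering, an ancestor-descendant test needs only two integer comparisons and no tree traversal, which is the general motivation for numbering schemes in XML indexing. Grouping leaves by label path lets a simple path query inspect one group instead of scanning every node.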
