XML on-line validation

RDF 

 
This item is provided by the institution :
University of Crete
Repository :
E-Locus Institutional Repository
see the original item page
in the repository's web site and access all digital files if the item*
share



Semantic enrichment/homogenization by EKT
2009 (EN)
XML on-line validation

Καμπάς, Δημήτριος
Kampas, Dimitrios

Colazzo, Dario
Bidoit, Nicole

This work investigates the on-line validation of XML documents with re- spect to a DTD, under memory constraints. We consider a simple approach by examining the membership checking of a string with respect to a given regular expression. A DTD is studied in the form of a regular expression and an XML document is considered as a string. The membership checking is examined for a class of regular expressions that is called conflict free or single occurrence. It is shown that the majority of real worlds DTDs respect this restriction. We use Brzozowski derivatives as an easy and efficient way to perform mem- bership checking avoiding the automata construction. We provide an algorithm for membership checking on conflict free regular expressions. We examine the complexity of this algorithm on different classes of conflict free regular expressions and the condition, under which the complexity is linear, is provided . The second approach is based on automata construction. Instead of computing the derivatives for each string on the fly, we precompute all the possible derivatives of the regular expression. We consider each derivative a state of an automaton recognizing the language described by the regular expression. The automata construction algorithm is provided and the problem of multiple derivations of same derivatives is examined. Finally, we provide the basic notion of an optimization based on the symbol consuming during derivation. It enables to compute each derivative only once avoiding a considerable number of useless redundant derivations. (EN)

text

Πανεπιστήμιο Κρήτης (EL)
University of Crete (EN)

2009-06-29




*Institutions are responsible for keeping their URLs functional (digital file, item page in repository site)