This work investigates the on-line validation of XML documents with re-
spect to a DTD, under memory constraints. We consider a simple approach by examining the membership checking of a string with respect to a given regular expression. A DTD is studied in the form of a regular expression and an XML document is considered as a string. The membership checking is examined for a class of regular expressions that is called conflict free or single occurrence. It is shown that the majority of real worlds DTDs respect this restriction.
We use Brzozowski derivatives as an easy and efficient way to perform mem-
bership checking avoiding the automata construction. We provide an algorithm for membership checking on conflict free regular expressions. We
examine the complexity of this algorithm on different classes of conflict free regular expressions and the condition, under which the complexity is linear, is provided .
The second approach is based on automata construction. Instead of computing the derivatives for each string on the fly, we precompute all the possible derivatives of the regular expression. We consider each derivative a state of an automaton recognizing the language described by the regular expression. The automata construction algorithm is provided and the problem of multiple derivations of same derivatives is examined.
Finally, we provide the basic notion of an optimization based on the symbol
consuming during derivation. It enables to compute each derivative only
once avoiding a considerable number of useless redundant derivations.