Hybrid in-database inference for declarative information extraction

 
Repository: Institutional Repository, Technical University of Crete




2011 (EN)


Γαροφαλακης Μινως (EL)
Hellerstein Joseph M. (EN)
Garofalakis Minos (EN)
Wick Michael L. (EN)
Wang Daisy Zhe (EN)
Franklin Michael J. (EN)

Πολυτεχνείο Κρήτης (EL)
Technical University of Crete (EN)

In the database community, work on information extraction (IE) has centered on two themes: how to effectively manage IE tasks, and how to manage the uncertainties that arise in the IE process in a scalable manner. Recent work has proposed a declarative IE system based on a probabilistic database (PDB) that supports a leading statistical IE model, together with an associated inference algorithm for answering top-k-style queries over the probabilistic IE outcome. Still, the broader problem of effectively supporting general probabilistic inference inside a PDB-based declarative IE system remains open. In this paper, we explore in-database implementations of a wide variety of inference algorithms suited to IE, including two Markov chain Monte Carlo algorithms as well as the Viterbi and sum-product algorithms. We describe rules for choosing an appropriate inference algorithm based on the model, the query, and the text, considering the trade-off between accuracy and runtime. Based on these rules, we describe a hybrid approach that optimizes the execution of a single probabilistic IE query by employing different inference algorithms for different records, as appropriate. We show that our techniques can achieve up to 10-fold speedups compared to the non-hybrid solutions proposed in the literature. (EN)

other
conferenceItem

Mathematics of Computing (EN)
Database Management (EN)


English

2011




