Efficient parallel Text Retrieval techniques on Bulk Synchronous Parallel (BSP)/Coarse Grained Multicomputers (CGM)

 
Provided by: Technological Educational Institute of Athens
Repository: Ypatia - Institutional Repository

2009 (EN)

Γαβαλάς, Δαμιανός (EL)
Κωνσταντόπουλος, Χαράλαμπος (EL)
Πάντζιου, Γραμματή Ε. (EL)
Μάμαλης, Βασίλης (EL)

Technological Educational Institute of Athens. School of Technological Applications. Department of Informatics Engineering (EN)

In this paper, we present efficient, scalable, and portable parallel algorithms for the off-line clustering, on-line retrieval, and update phases of the Text Retrieval (TR) problem. The algorithms are based on the vector space model and use clustering to organize and handle a dynamic document collection. They run on the Coarse-Grained Multicomputer (CGM) and/or the Bulk Synchronous Parallel (BSP) model, two models that capture the characteristics of a parallel machine with only a few parameters. To the best of our knowledge, our parallel retrieval algorithms are the first to be analyzed under these specific parallel models. For every phase of the proposed algorithms, we analytically determine the communication and computation costs, thereby formally proving the efficiency of the proposed solutions. In addition, we prove that our technique for the on-line retrieval phase performs very well compared with other possible alternatives in the typical case of a multiuser information retrieval (IR) system, where a number of user queries are submitted concurrently. Finally, we discuss external memory issues and show how our techniques can be adapted to the case where processors have limited main memory but sufficient disk capacity to hold their local data. (EN)

journalArticle

Document clustering (EN)
Parallel algorithms (EN)
Text retrieval (EN)
External memory (EN)

Technological Educational Institute of Athens (EN)

The Journal of Supercomputing (EN)

English

2009-06

DOI: 10.1007/s11227-008-0225-x

Springer US (EN)


