Hierarchical clustering in medical document collections: the BIC-Means method

 
Repository :
Institutional Repository Technical University of Crete




2010 (EN)

Χουρδακης Νικολαος (EL)
Αργυριου Μιχαηλ (EL)
Πετρακης Ευριπιδης (EL)
Chourdakis Nikolaos (EN)
Argyriou Michail (EN)
Milios, EE (EN)
Petrakis Evripidis (EN)

Πολυτεχνείο Κρήτης (EL)
Technical University of Crete (EN)

Hierarchical clustering of text collections is a key problem in document management and retrieval. In partitional hierarchical clustering, which is more efficient than its agglomerative counterpart, the entire collection is split into clusters and the individual clusters are further split until a heuristically motivated termination criterion is met. In this paper, we define the BIC-Means algorithm, which applies the Bayesian Information Criterion (BIC) as a domain-independent termination criterion for partitional hierarchical clustering. We evaluate the effectiveness of BIC-Means in clustering and retrieval on medical document collections, and we propose a dynamic version of the BIC-Means algorithm for adapting an existing clustering solution to document additions. (EN)
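
The idea in the abstract, recursive splitting that stops when BIC no longer favours a split, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes an X-means-style BIC under a spherical Gaussian model, plain 2-means with Euclidean distance for each split, and dense numeric vectors as input. The names `two_means`, `bic`, `bic_means`, and the `min_size` parameter are hypothetical.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, rng, iters=20):
    """Split points into two groups with plain 2-means (Lloyd iterations)."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            nearer_second = sq_dist(p, centers[1]) < sq_dist(p, centers[0])
            groups[int(nearer_second)].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller treats it as "do not split"
        centers = [centroid(g) for g in groups]
    return groups


def bic(clusters):
    """BIC (higher is better) of a partition, assuming spherical Gaussians
    with one shared variance. This model choice is an assumption."""
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    dim = len(clusters[0][0])
    sse = 0.0
    for c in clusters:
        mu = centroid(c)
        sse += sum(sq_dist(p, mu) for p in c)
    var = max(sse / (dim * max(n - k, 1)), 1e-12)  # clamp to avoid log(0)
    log_lik = sum(len(c) * math.log(len(c) / n) for c in clusters)
    log_lik -= 0.5 * n * dim * math.log(2 * math.pi * var)
    log_lik -= 0.5 * sse / var
    n_params = (k - 1) + k * dim + 1  # mixing weights, centroids, variance
    return log_lik - 0.5 * n_params * math.log(n)


def bic_means(points, rng=None, min_size=4):
    """Recursively bisect a cluster while the two-way split has higher BIC
    than the unsplit cluster; return the resulting leaf clusters."""
    rng = rng or random.Random(0)
    if len(points) < min_size:
        return [points]
    left, right = two_means(points, rng)
    if not left or not right or bic([left, right]) <= bic([points]):
        return [points]  # BIC does not favour the split: stop here
    return bic_means(left, rng, min_size) + bic_means(right, rng, min_size)
```

Calling `bic_means(vectors)` on a list of equal-length numeric vectors returns the leaf clusters of the hierarchy. For a real document collection the vectors would typically be term-weight representations of the documents, which is again an assumption rather than something stated in the abstract.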

journalArticle


Journal of Digital Information Management (EN)

English

2010


Elsevier (EN)



