Study and Implematation of Community Detection Algorithms in Large Graphs

Study and Implematation of Community Detection Algorithms in Large Graphs

URI: https://www.openarchives.gr/aggregator-openarchives/edm/apothesis/000099-https%3A__apothesis.eap.gr_handle_repo_36014
RDF/XML JSON-LD

This item is provided by the institution :

Hellenic Open University

Repository :
Institutional Repository of the Hellenic Open University

see the original item page
in the repository's web site and access all digital files if the item^*

Semantic enrichment by EKT

ΕΚΤ item type

Thesis (EN)

EKT year

2017 (EN)

EKT historical period

Title

Μελέτη και Υλοποίηση Αλγορίθμων Εύρεσης Κοινοτήτων σε Μεγάλα Γραφήματα

Study and Implematation of Community Detection Algorithms in Large Graphs

Creator

Γεμενετζής, Χρήστος

Contributor

Παπαδόπουλος , Απόστολος

Βασιλακόπουλος , Μιχαήλ

Παπαδόπουλος, Απόστολος

Description

Με την δημιουργία του διαδικτύου αλλά και των ιστοσελίδων κοινωνικής δικτύωσης, οι περισσότεροι πολίτες κάθε χώρας μπορούν να διατηρούν προσωπικούς λογαριασμούς και να επικοινωνούν μέσω αυτών διαδικτυακά. Η συχνότερη ή πυκνότερη επικοινωνία μεταξύ ομάδων χρηστών του διαδικτύου σε σχέση με τους υπόλοιπους χρήστες, μπορεί να προκαλέσει το ενδιαφέρον των σύγχρονων επιστημών και να υποδηλώσει την ύπαρξη κοινοτήτων του διαδικτύου. Η αναπαράσταση του διαδικτύου και όχι μόνο, γίνεται με γραφήματα. Το μέγεθος των γραφημάτων αυτών μπορεί να έχει εκατομμύρια, αλλά και δισεκατομμύρια κορυφές ή ακμές, μέγεθος που χρονικά ή χωρικά δεν είναι εύκολο να διαχειριστεί ένας υπολογιστής. Η εύρεση κοινοτήτων, σε γραφήματα τέτοιου μεγέθους, είναι ένα τμήμα της επιστήμης εξόρυξης δεδομένων σε γραφήματα (Graph Mining), που είναι ακόμα ρευστό. Τα διαφορετικά είδη προβλημάτων, στην εύρεση ομάδων κορυφών σε ένα γράφημα ή η διαφορετική δομή των γραφημάτων, οδηγούν τους επιστήμονες να παράγουν πολλούς και διαφορετικούς αλγόριθμους εύρεσης κοινοτήτων σε γραφήματα, ώστε να πετύχουν την βέλτιστη λύση. Σε αυτήν την διπλωματική εργασία γίνεται υλοποίηση και μελέτη τριών γνωστών αλγορίθμων εύρεσης κοινοτήτων σε γραφήματα, του k-Medoids, του k-Centers Farthest First Traversal και του Louvain. Η υλοποίηση γίνεται στην γλώσσα προγραμματισμού Python, η οποία είναι μια ανερχόμενη γλώσσα προγραμματισμού με πολύ απήχηση και είναι μία από τις γλώσσες που υποστηρίζονται από την πλατφόρμα Apache Spark, η οποία χρησιμοποιείται ευρέως στην περιοχή της ανάλυσης μεγάλων δεδομένων (big data analytics). Με την υλοποίηση πειραμάτων αναλύονται η συμπεριφορά, η χρήση της μνήμης, το πλήθος των κοινοτήτων που παράγονται αλλά και ο χρόνος εκτέλεσης των αλγορίθμων σε μεγάλα γραφήματα. Αναδεικνύονται τα προβλήματα του κάθε αλγόριθμου ξεχωριστά, στους τέσσερις αυτούς τομείς, και υλοποιούνται παραλλαγές προγραμματισμού των αλγορίθμων, με στόχο την επίλυση των προβλημάτων. Τέλος, δημιουργείται και υλοποιείται υβριδικός αλγόριθμος που μπορεί να επιλύσει ταυτόχρονα τα προβλήματα όλων των προηγούμενων.

Since the creation of both internet and social network sites, most citizens of each country have the opportunity to keep personal accounts and communicate via them electronically. The most frequent or dense communication among clusters of network users in comparison to the rest users, increases current sciences’ interest and declares the existence of network communities. Network representation – but not solely – is designed with graphs. Those graphs’ size may have millions, but also billions nodes or edges, and may have a size that a computer is not easy to manipulate as far as time or memory capacity is concerned. Community detection of such type of graphs, is a part of Graph Mining Science, which is even more unsure. Different types of problems regarding finding clusters of nodes in a graph or finding the different structure of graphs, force scientists to produce many different community detection algorithms in graphs, so as to find the best solution. In this dissertation three well - known algorithms about community detection in graphs are implemented and studied. Those are k-Medoids, the k-Center Farthest First Traversal and the Louvain ones. The implementation is designed with the Python programming language, which is a promising one, with great impact, and it is one of those languages, which are supported by the Apache Spark platform, which is widely used in big data analytics field. The behavior, the usage of memory, the number of communities produced but also the duration of algorithm’s execution in large graphs are all analyzed by the experiments’ implementation. Each algorithm’s problems in those four fields are emphasized separately, and different programming approaches are made, aiming to solve the problems arised. Finally, a hybrid algorithm is designed and constructed, that can solve at the same time all the problems of algorithms mentioned above.

Περιέχει: Πίνακες, εικόνες

Type

Διπλωματική Εργασία / Thesis

Subject

μεγάλα γραφήματα

Farthest First Traversal

large graphs

εύρεση κοινοτήτων

Louvain

partitioning graphs

k-Center

clustering

community detection

k-Medoids

Provider

Hellenic Open University

Repository / collection

Institutional Repository of the Hellenic Open University

Subcollections

Language

Greek

Date

2017-10-10T10:01:32Z

2017-09-23

Identifier

https://apothesis.eap.gr/handle/repo/36014

Publisher

Ελληνικό Ανοικτό Πανεπιστήμιο / Hellenic Open University

Coverage

*Institutions are responsible for keeping their URLs functional (digital file, item page in repository site)

Study and Implematation of Community Detection Algorithms in Large Graphs

Βοηθείστε μας να κάνουμε καλύτερο το OpenArchives.gr.