Astrocluster: a framework agnostic clustering tool for semantic analysis of projects and technical debt

This item is provided by the institution:
University of Macedonia

Repository:
Psepheda - Digital Library and Institutional Repository




Astrocluster: a framework agnostic clustering tool for semantic analysis of projects and technical debt (EN)

Kote, Kostandin (EN)

Αμπατζόγλου, Απόστολος (EL)
Κασκάλης, Θεόδωρος (EL)
Χατζηγεωργίου, Αλέξανδρος (EL)

Bachelor's Degree Paper (EN)
Text (EN)

2025
2025-03-12T06:41:34Z


The library holds a copy of the thesis in electronic form only. (EL)
Bachelor's thesis--University of Macedonia, Thessaloniki, 2025. (EL)
Approved for entry into archive by Κυριακή Μπαλτά on 2025-03-12T06:41:33Z (GMT) No. of bitstreams: 3 license_rdf: 1025 bytes, checksum: 84a900c9dd4b2a10095a94649e1ce116 (MD5) KoteKostandinPe2025presentation.pdf: 3589577 bytes, checksum: e092186d2c7771fba0f20d818f092b4e (MD5) KoteKostandinPe2025.pdf: 3707369 bytes, checksum: 3c37d4a29b554ec5bb31174ef4411693 (MD5) (EN)
Several studies have explored code analysis using neural networks and graph-based methods. While these methods offer robust insights into code structure and functionality, they have seen few applications to measuring similarity between code files or to semantic analysis across different projects. As a result, the challenge of understanding the semantic relationships between files in large, diverse codebases remains underexplored. Identifying these relationships is critical for tasks such as refactoring, modularization, and assessing technical debt, which require a deep understanding of the intent and context of the code. This thesis introduces a framework-agnostic clustering model for organizing software project files based on contextual semantic similarity. The model represents file content with embeddings generated by a fine-tuned UniXcoder embedder, which are then clustered using a similarity metric and a clustering paradigm. Users can manually inspect the clusters, enabling tailored refactoring and automatic technical debt analysis. (EN)
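The embed-then-cluster idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity metric or the clustering paradigm, so cosine similarity and threshold-based single-link grouping are assumptions made here purely for illustration. The file names, the three-dimensional toy vectors (stand-ins for real UniXcoder embeddings), the `cluster_files` helper, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def cluster_files(embeddings, threshold=0.8):
    """Group files whose embeddings are linked by similarity >= threshold.

    Assumption: single-link grouping via union-find; the thesis may use
    a different metric and clustering paradigm.
    """
    names = sorted(embeddings)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent pointers to the cluster root, halving paths as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine_similarity(embeddings[a], embeddings[b]) >= threshold:
                parent[find(b)] = find(a)  # merge the two clusters

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Hypothetical toy vectors standing in for fine-tuned UniXcoder embeddings.
toy_embeddings = {
    "auth/login.py": [0.9, 0.1, 0.0],
    "auth/session.py": [0.8, 0.2, 0.1],
    "billing/invoice.py": [0.1, 0.9, 0.2],
    "billing/tax.py": [0.0, 0.8, 0.3],
}

clusters = cluster_files(toy_embeddings)
for members in clusters:
    print(members)
```

With these toy vectors the two `auth/` files fall into one cluster and the two `billing/` files into another. Raising the threshold makes the grouping stricter, which is the kind of knob a user inspecting clusters manually might adjust.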
Submitted by ΚΩΣΤΑΝΤΙΝ ΚΟΤΕ on 2025-03-11T15:31:04Z No. of bitstreams: 3 license_rdf: 1025 bytes, checksum: 84a900c9dd4b2a10095a94649e1ce116 (MD5) KoteKostandinPe2025presentation.pdf: 3589577 bytes, checksum: e092186d2c7771fba0f20d818f092b4e (MD5) KoteKostandinPe2025.pdf: 3707369 bytes, checksum: 3c37d4a29b554ec5bb31174ef4411693 (MD5) (EN)
Made available in DSpace on 2025-03-12T06:41:34Z (GMT). No. of bitstreams: 3 license_rdf: 1025 bytes, checksum: 84a900c9dd4b2a10095a94649e1ce116 (MD5) KoteKostandinPe2025presentation.pdf: 3589577 bytes, checksum: e092186d2c7771fba0f20d818f092b4e (MD5) KoteKostandinPe2025.pdf: 3707369 bytes, checksum: 3c37d4a29b554ec5bb31174ef4411693 (MD5) Previous issue date: 2025-02-14 (EN)


Modularization (EN)
Embedders (EN)
Semantic Relationships (EN)
Neural Networks (EN)
Refactoring (EN)
Code Similarity (EN)

University of Macedonia (EL)

Department of Applied Informatics (ΠΕ) (EL)

Attribution-ShareAlike 4.0 International (EL)
http://creativecommons.org/licenses/by-sa/4.0/



