Data Mining for Software Management: Automatic marking of complex Rust code using software metrics





Karatakis, Panagiotis (EN)

Akritidis, Leonidas
Koukaras, Paraskevas
Tjortjis, Christos (EN)

masterThesis

2025-05-19T06:40:02Z
2024-02-01


This thesis explores the use of data mining and machine learning techniques to automatically mark complex Rust code using software metrics. The proposed methodology consists of four main procedures: dataset construction, feature extraction, model training and fine-tuning, and model evaluation. Dataset construction creates a ground truth dataset by collecting commit messages and their metadata, performing NLP analysis, and extracting software metrics. Feature extraction enhances the dataset with additional features to improve model performance. Model training and fine-tuning trains and optimizes the models using various machine learning algorithms. Finally, model evaluation assesses the performance of the models using various evaluation metrics. The results show promising performance in detecting software defects, with F1 scores of 77% and AUC scores of 85%. The study also highlights limitations and future research opportunities, such as advanced feature engineering, larger sample sizes, and more complex algorithms. Overall, this thesis contributes to the development of automated methods for software management and provides valuable insights for stakeholders in the software development industry. (EN)
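For readers unfamiliar with the two evaluation metrics reported above, the following is a minimal sketch of how an F1 score and an AUC score can be computed for a binary classifier. It is not the thesis's code: the function names `f1_score` and `roc_auc`, the 0.5 decision threshold, and the labels and scores are all made up for illustration, and label 1 is assumed to mean "complex or defect-prone".

```python
def f1_score(y_true, y_pred):
    """F1: harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC: probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical ground-truth labels and model scores (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.6, 0.7, 0.1]

# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"F1:  {f1_score(y_true, y_pred):.4f}")
print(f"AUC: {roc_auc(y_true, scores):.4f}")
```

F1 depends on the chosen threshold, while AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the two figures are usually reported together.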


Data Mining (EN)

English

School of Science and Technology, MSc in Data Science
School of Science & Technology (EN)

Default License



