Incorporating Trainable Filterbanks in Deep Neural Networks for Music Transcription


URI: https://www.openarchives.gr/aggregator-openarchives/edm/pergamos/000005-uoadl%3A3395209

This item is provided by the institution: University of Athens

Repository: Pergamos Digital Library


Title

Incorporating Trainable Filterbanks in Deep Neural Networks for Music Transcription

Creator

ΠΡΙΜΕΝΤΑ ΑΙΚΑΤΕΡΙΝΗ-ΜΑΡΙΑ (EL)

PRIMENTA AIKATERINI-MARIA (EN)

Type

born_digital_graduate_thesis

Graduate Thesis (EN)

Thesis
Bachelor thesis (EN)

Date

2024

Year

2024 (EN)

Description

In recent years, Automatic Music Transcription, the process of converting audio recordings into symbolic representations without human intervention, has advanced significantly and has been applied across various domains in the music field. Many existing approaches use Deep Neural Networks and learn their input features directly from representations such as log-mel spectrograms. This leads to challenges such as a high number of trainable parameters, limited adaptability, and slow convergence. In this thesis, we tackle these challenges by proposing a new method that enhances piano transcription systems by incorporating trainable filterbanks for feature extraction. Drawing inspiration from SincNet, a Convolutional Neural Network architecture that implements parameterized sinc-based filterbanks, we aim to improve the accuracy and efficiency of an existing high-resolution piano transcription system. Our proposed framework achieves an Average Precision Score of 89%, which is comparable to but lower than that of the original method. However, it outperforms the original method in the accuracy of onset and offset detection. The implementation of our proposed method is available at https://github.com/marikaitiprim/MusicTranscription-BScThesis. (EN)
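As a rough illustration of what a parameterized sinc-based filter can look like, the sketch below builds one band-pass kernel from only two numbers, a low and a high cutoff frequency, as the difference of two windowed sinc low-pass filters. It is not taken from the thesis or its repository: the function names (`sinc_bandpass`, `gain`), the Hamming window, the 16 kHz sample rate, and the 400 to 800 Hz band are all assumptions chosen for the example, and the cutoffs are fixed here rather than learned during training.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_bandpass(f_low, f_high, kernel_size, sample_rate):
    """Band-pass FIR kernel defined only by its two cutoff frequencies (Hz).

    Hypothetical helper for illustration. The kernel is the difference of
    two ideal low-pass responses, smoothed with a Hamming window.
    """
    half = (kernel_size - 1) / 2
    kernel = []
    for n in range(kernel_size):
        t = (n - half) / sample_rate  # time in seconds, centered on the kernel
        upper = 2 * f_high * sinc(2 * f_high * t)  # low-pass with cutoff f_high
        lower = 2 * f_low * sinc(2 * f_low * t)    # low-pass with cutoff f_low
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (kernel_size - 1))
        kernel.append((upper - lower) * window / sample_rate)
    return kernel


def gain(kernel, freq, sample_rate):
    """Magnitude response of a symmetric kernel at one frequency (Hz)."""
    half = (len(kernel) - 1) / 2
    return abs(sum(k * math.cos(2 * math.pi * freq * (n - half) / sample_rate)
                   for n, k in enumerate(kernel)))


# Illustrative values only: a 400-800 Hz band at an assumed 16 kHz sample rate.
kernel = sinc_bandpass(400.0, 800.0, kernel_size=401, sample_rate=16000)
print("gain inside the band (600 Hz):", gain(kernel, 600.0, 16000))
print("gain outside the band (3000 Hz):", gain(kernel, 3000.0, 16000))
```

Because the whole kernel is determined by `f_low` and `f_high`, a trainable layer built this way would need two parameters per filter instead of one weight per kernel tap, which is one way to read the abstract's concern about the number of trainable parameters.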

Scientific field

Technology - Computer science

Engineering and Technology ▶ Electrical engineering, Electronic engineering, Information engineering
Communication engineering and systems, Telecommunications (EN)

Natural Sciences ▶ Computer and Information Sciences
Computer Science (EN)

Subject

Technology - Computer science (EN)

Language

English

School / Department / Institute

Library and Information Center » Library of the School of Science » Informatics

National and Kapodistrian University of Athens ▶ School of Science ▶ Department of Informatics and Telecommunications

Rights

https://creativecommons.org/licenses/by-nc/4.0/

Provider

University of Athens

Repository / collection

Pergamos Digital Library

Subcollections

Graduate Thesis
