Audiovisual speech synthesis using hidden Markov models

URI: https://www.openarchives.gr/aggregator-openarchives/edm/ntua/000011-123456789_42129

This item is provided by the institution:
National Technical University of Athens

Repository:
Digital Library of National Technical University of Athens | Dspace@NTUA


Title

Οπτικοακουστική σύνθεση φωνής με χρήση κρυφών μαρκοβιανών μοντέλων (EL)

Creator

Φιλντίσης, Παναγιώτης Παρασκευάς (EL)

Filntisis, Panagiotis Paraskevas (EN)

Contributor

ntua (EL)

Ποταμιάνος, Αλέξανδρος (EL)

Μαραγκός, Πέτρος (EL)

Πρωτόπαπας, Αθανάσιος (EL)

Type

bachelorThesis

Thesis
Bachelor thesis (EN)

Issued

2015-10-30

2016-03-09T13:25:36Z

2016-03-09

Year

2015 (EN)

Description

This diploma thesis presents a complete audiovisual text-to-speech synthesis system for the Greek language. Implementing such a system draws on techniques from several scientific fields, including Machine Learning, Signal Processing, and Computer Vision. The introduction reviews the history of audiovisual speech synthesis and the most important methods for implementing an audiovisual speech synthesizer. The following chapters present the theoretical analysis needed to implement the system, together with the experimental results obtained during its implementation and evaluation. The evaluation is especially encouraging for both the synthetic speech and the synthetic video, opening the way for extending the system to applications such as emotional and expressive audiovisual speech synthesis, for which the last chapter gives a first approach and evaluation. (EN)

Scientific field

Engineering and Technology ▶ Electrical engineering, Electronic engineering, Information engineering
Robotics (EN)

Engineering and Technology ▶ Electrical engineering, Electronic engineering, Information engineering
Automation and control systems (EN)

Subject

Audiovisual speech synthesis (EN)

Pattern recognition (EN)

HMM (EN)

AAM (EN)

Language

Greek

School / Department / Institute

Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής (EL)

National Technical University of Athens ▶ School of Electrical and Computer Engineering
Signals, Control and Robotics Division

Rights

Default License

Subcollections

NTUA Central Library

Institutional Repository

Diploma Theses
