Δενδρική αναζήτηση Monte Carlo στο παιχνίδι στρατηγικής "Άποικοι του Κατάν"
(EL)
Monte Carlo tree search in the "Settlers of Catan" strategy game
(EN)
Καραμαλεγκος Εμμανουηλ
(EL)
Karamalegos Emmanouil
(EN)
Δεληγιαννακης Αντωνιος
(EL)
Λαγουδακης Μιχαηλ
(EL)
Χαλκιαδακης Γεωργιος
(EL)
Πολυτεχνείο Κρήτης
(EL)
Chalkiadakis Georgios
(EN)
Lagoudakis Michael
(EN)
Technical University of Crete
(EN)
Deligiannakis Antonios
(EN)
Classic approaches to game AI require either a high quality of domain knowledge, or
a long time to generate effective AI behavior. Monte Carlo Tree Search (MCTS) is a
search method that combines the precision of tree search with the generality of
random sampling. The family of MCTS algorithms has achieved promising results
with perfect-information games such as Go. In our work, we apply Monte-Carlo Tree
Search to the non-deterministic game "Settlers of Catan", a multi-player board game turned
web-based game that necessitates strategic planning and negotiation skills. We
implemented an agent that, for the first time, takes into consideration all aspects of the
game, using no domain knowledge.
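The MCTS loop referred to above repeats four phases per iteration: select a leaf with a tree policy, expand it, run a random rollout, and backpropagate the result. A minimal generic sketch follows; the class, the injected policy functions, and all names are illustrative, not the thesis implementation:

```python
class Node:
    """A search-tree node holding win/visit statistics."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.wins = 0
        self.visits = 0

def backpropagate(node, reward):
    # 4. Update statistics along the path back to the root.
    while node is not None:
        node.visits += 1
        node.wins += reward
        node = node.parent

def mcts(root, iterations, select, expand, rollout):
    # Policies are injected so the same loop fits different games.
    for _ in range(iterations):
        leaf = select(root)        # 1. descend via the tree policy
        child = expand(leaf)       # 2. add or pick a child to simulate
        reward = rollout(child)    # 3. random playout to a terminal state
        backpropagate(child, reward)
    # Recommend the most-visited child of the root, a common robust choice.
    return max(root.children, key=lambda c: c.visits)
```

The statistics gathered in `wins`/`visits` are what any selection policy (such as the bandit methods below) consumes.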
In our work, we experiment with a reinforcement learning method, Value of
Perfect Information (VPI), and two bandit methods, namely the Upper Confidence
Bound (UCB) and Bayesian Upper Confidence Bound methods. Such methods attempt to
strike a balance between exploitation and exploration when constructing the search
tree.
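The exploitation/exploration balance in the UCB family is usually struck with the UCB1 rule: each child is scored by its average reward plus an exploration bonus that shrinks as the child is visited more often. A minimal sketch, where the exploration constant `c` and all names are illustrative assumptions:

```python
import math

def ucb1(parent_visits, child_wins, child_visits, c=math.sqrt(2)):
    """Average reward plus an exploration bonus; unvisited children go first."""
    if child_visits == 0:
        return float("inf")
    exploitation = child_wins / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration

def select_child(stats, parent_visits):
    """stats: list of (wins, visits) pairs; return the index of the best child."""
    return max(range(len(stats)),
               key=lambda i: ucb1(parent_visits, *stats[i]))
```

Larger `c` favors exploration of rarely tried moves; smaller `c` favors exploiting the currently best-looking move.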
For the first time, we implemented an agent that takes into consideration the complete
rule-set of the game and makes it possible to negotiate trades between all players.
Furthermore, we equipped our agent with an alternative initial placement strategy found
in the literature, which is based on the analysis of human behavior in Settlers of Catan
games.
In our experiments we compare the performance of our methods against each other
and against appropriate benchmarks (e.g., JSettlers agents), and examine the effect
that the number of simulations and the simulation depth have on the algorithms'
performance. Our results suggest that VPI scores significantly better than bandit-based
methods, even if these employ a much higher number of simulations. In
addition, the simulation depth of the algorithm needs to be tuned so that the
method neither gets lost in deep simulations of improbable scenarios nor terminates
too early without providing a proper estimate of the upcoming moves. Finally, our
results indicate that our agent's performance improves when the alternative,
human-behavior-based initial placement method is used.
(EN)
Πολυτεχνείο Κρήτης::Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών
(EL)
Technical University of Crete::School of Electrical and Computer Engineering
(EN)