Content-based Tweets Semantic Clustering and Propagation

 

Repository :
IHU Repository




2015 (EN)


Michalakos, Marios Aristotelis

In this thesis, our goal was to develop a methodology for clustering a set of tweets based on their semantic context. We used probabilistic topic modeling techniques, such as Latent Dirichlet Allocation (LDA), to extract topics from our dataset, and then applied several natural language processing methods to automatically generate semantically meaningful and grammatically correct phrases as candidate labels for the extracted topics, aiming at an objective method for topic labeling. An essential part of our research was developing a scoring function that assigns the most semantically similar, and therefore most relevant, labels to each extracted topic. We then generated the Twitter graph and used community detection algorithms to analyze each community's topics of interest. In this way we were able to record the propagation of certain topics through the graph and to analyze the topics of interest of each community. Graph layout algorithms were also essential for producing meaningful visualizations of our networks. We created datasets populated through Twitter’s API and used open source tools to implement the method, resulting in a fully working prototype. Our research can be a valuable asset for companies carrying out modern market analysis.
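
As a rough illustration of the label-scoring idea described in the abstract, the sketch below ranks candidate labels for one topic by the average topic probability of the words each label contains. This is not the scoring function developed in the thesis, which the abstract does not specify. The function names `score_label` and `best_label`, the toy topic distribution, and the candidate phrases are all hypothetical and chosen only for illustration.

```python
def score_label(label, topic_word_probs):
    """Average topic probability of the label's words.

    Assumption: words missing from the topic contribute 0, so labels
    padded with unrelated words are penalized.
    """
    words = label.lower().split()
    if not words:
        return 0.0
    return sum(topic_word_probs.get(w, 0.0) for w in words) / len(words)


def best_label(candidates, topic_word_probs):
    """Return the candidate label with the highest score for this topic."""
    return max(candidates, key=lambda label: score_label(label, topic_word_probs))


# Hypothetical topic (word -> probability), of the kind a topic model
# such as LDA might produce for one extracted topic.
topic = {
    "election": 0.30,
    "vote": 0.25,
    "campaign": 0.20,
    "poll": 0.15,
    "debate": 0.10,
}

# Hypothetical candidate label phrases.
candidates = ["election campaign", "football match", "vote poll results"]

print(best_label(candidates, topic))
```

Averaging rather than summing keeps longer phrases from winning merely by containing more words. A semantic similarity measure of the kind the abstract mentions would replace the simple word lookup used here.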

masterThesis


English

2015-05-29T18:53:55Z
2015-09-27T05:56:26Z
2015-05-29


School of Science & Technology, Master of Science (MSc) in Information and Communication Systems
ihu



