A graph representation of the individual exome variation with evidence from biomedical text corpora

URI: https://www.openarchives.gr/aggregator-openarchives/edm/elocus/000018-dlib_d_f_2_metadata-dlib-1663755549-23934-20683.tkl

The item is provided by the institution:
University of Crete

Repository:
E-Locus Institutional Repository

See the original item page on the provider's repository website for more information and to view all of the item's digital files.*

Title

Μια γραφική αναπαράσταση των παραλλαγών των εξωνίων ατόμου με στοιχεία απο τη συλλογή βιοιατρικών κειμένων

A graph representation of the individual exome variation with evidence from biomedical text corpora

Creator

Γιαννουλάκης, Ιωάννης

Contributor

Ποταμιάς, Γεώργιος

Ηλιόπουλος, Ιωάννης

Καντεράκης, Αλέξανδρος

Type

text

Type of Work--Postgraduate specialization theses

Master's thesis

Date

2022-07-29

Year

2022 (EL)

Description

Variant annotation and prioritization is one of the most crucial steps in clinical genetics pipelines. This step usually involves consulting other databases that can shed light on the importance of the identified genomic variation. Among the richest of these data sources are online biomedical publication databases such as PubMed. It is debatable to what extent modern clinical genetics pipelines built on Next Generation Sequencing (NGS) exploit this information. Despite the plethora of available methods for information extraction from biomedical text, these methods rarely take part in the annotation/prioritization step of typical NGS pipelines, because they are not suited to mass querying of the complete genome variation of an individual. Here we present an open tool that builds a graph from the BioC corpus, which consists of all open and extensively pre-annotated PubMed articles, in less than 10 hours. In this graph, nodes represent Articles (n=27M), Chemicals (n=350K), Diseases (n=12K), Genes (n=37K), and Mutations (n=1.1M), interconnected through 190 million edges. The graph can be queried and explored with the Cypher query language and is served and visualized through the Neo4j graph database engine. Through this engine we can query the entirety of the variants (~50K) identified in an NGS experiment in a practical timescale. The result of this query is the intersection of the graph's mutations with those in the input file. The articles that contain these mutations are then used for topic modeling with Top2Vec. Through the topic modeling results, a user can easily and flexibly investigate all existing bibliographic evidence linking the genetic profile of the individual with known diseases and chemical/drug interactions. (EN)
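The intersection step described above, matching an individual's variants against the mutations in the graph and collecting the articles that mention them, can be illustrated with a minimal sketch. It is not the thesis's implementation: the node labels `Article` and `Mutation`, the relationship type `MENTIONS`, and the properties `id` and `pmid` are assumptions made for illustration, as are the function name and the toy data. The Cypher string shows the general shape such a query might take, and the Python function is an in-memory analogue that runs without a Neo4j server.

```python
from collections import defaultdict

# Hypothetical schema: (:Article)-[:MENTIONS]->(:Mutation {id: ...}).
# Label, relationship, and property names are assumptions, not the thesis schema.
CYPHER_QUERY = (
    "UNWIND $variant_ids AS vid "
    "MATCH (a:Article)-[:MENTIONS]->(m:Mutation {id: vid}) "
    "RETURN m.id AS mutation, collect(a.pmid) AS articles"
)


def articles_for_variants(mentions, variant_ids):
    """In-memory analogue of the query above.

    mentions: iterable of (article_id, mutation_id) pairs (toy graph edges).
    variant_ids: variant identifiers read from the individual's input file.
    Returns {mutation_id: sorted article_ids} for variants present in the graph.
    """
    by_mutation = defaultdict(set)
    for article_id, mutation_id in mentions:
        by_mutation[mutation_id].add(article_id)
    # Keep only the variants that also occur as mutations in the graph.
    shared = set(variant_ids) & set(by_mutation)
    return {m: sorted(by_mutation[m]) for m in sorted(shared)}


# Toy data, not taken from the thesis.
edges = [("PMID:1", "rs1"), ("PMID:2", "rs1"), ("PMID:3", "rs2")]
result = articles_for_variants(edges, ["rs1", "rs999"])
print(result)
```

Passing the whole variant list as a single query parameter, rather than issuing one query per variant, is one plausible way to keep a lookup of roughly 50K variants within a practical timescale. The articles returned would then be the input to topic modeling.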

Subject

Text mining

Mutations

Graph database

Language

English

Source

School/Department--School of Medicine--Department of Medicine--Postgraduate specialization theses

Provider

University of Crete

Repository / collection

E-Locus Institutional Repository

Sub-collection

Elocus

*The orderly and uninterrupted operation of the collections' web addresses (digital file, item record in the repository) is the sole responsibility of the respective content providers.
