Facilitating NeuroMorpho.Org Curation via Neuronal and Glial Metadata Analysis



Zoubi, Yasmeen




NeuroMorpho.Org is a scientific database of digital reconstructions of neurons and glia. It serves as a large-scale repository of morphological information that is accessible worldwide, encouraging data sharing and communication within the international neuroscience community. Curating such data helps in mining and understanding the relationships between dendritic and axonal branching, glial processes, brain connectivity, and synaptic signaling. Metadata is information about the data. NeuroMorpho.Org provides metadata for each curated cell, including details on the animal subject, brain region, cell type, and experimental protocol. This information is extracted from the corresponding peer-reviewed publications that describe the reconstructed neurons or glia. Manual metadata extraction and annotation can be labor-intensive, time-consuming, and error-prone. Machine learning can help overcome these challenges by facilitating and eventually automating the identification of relevant information. To be effective, machine learning tools must be trained on a corpus of existing annotations. Here we deployed a two-pronged approach to analyzing NeuroMorpho.Org metadata in order to provide a useful training set for the ongoing development of semi-automated annotation. First, we investigated our metadata records to identify systematic patterns that may reflect neurobiological rules or statistical trends and that could be expressed as artificial intelligence heuristics. Specifically, we used a frequency-based data mining algorithm known as "Apriori", which identifies frequent itemsets of neuronal and glial metadata, from which association rules can be derived. Second, we used machine learning tools to extract key metadata via an approach known as "named entity recognition", or NER, so that metadata acquisition can be automated. This approach requires several rounds of manual annotation from which the algorithm can learn, making automated annotation as precise as possible. Altogether, our investigation may aid the training of algorithms for robust metadata annotation, which can support the expansion and enhancement of NeuroMorpho.Org.
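To make the frequency-based step concrete, the following is a minimal Python sketch of Apriori-style frequent itemset mining over metadata records. It is an illustration only, not the code used in the thesis: the records, the "field=value" item encoding, the function names apriori and confidence, and the support threshold are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset that appears in
    at least `min_support` transactions (level-wise Apriori search)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: keep candidates that meet the support threshold.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        frequent.update(current)
        k += 1
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent, computed from
    support counts of itemsets already known to be frequent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return frequent[both] / frequent[a]


# Hypothetical metadata records, one set of "field=value" items per cell.
records = [
    {"species=rat", "region=hippocampus", "cell_type=pyramidal"},
    {"species=rat", "region=hippocampus", "cell_type=pyramidal"},
    {"species=mouse", "region=neocortex", "cell_type=pyramidal"},
    {"species=mouse", "region=neocortex", "cell_type=microglia"},
    {"species=rat", "region=hippocampus", "cell_type=granule"},
]

frequent = apriori(records, min_support=2)
rule_conf = confidence(frequent, {"region=hippocampus"}, {"cell_type=pyramidal"})
```

In this toy data, an itemset such as {species=rat, region=hippocampus} is kept because it occurs in at least two records, while any candidate containing an infrequent subset is discarded before counting. Rule confidence is then the support of the combined itemset divided by the support of the antecedent.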


This thesis has been embargoed for 2 years. It will not be available until July 2023 at the earliest.


NeuroMorpho.Org, Apriori algorithm, Machine learning, Negation rules, Metadata analysis, Named entity recognition (NER)