Grouping Biological Data
Independent thesis Basic level (professional degree), 20 points / 30 hp
Student thesis
Today, scientists in various biomedical fields rely on biological data sources in their research. Large amounts of information concerning, for instance, genes, proteins and diseases are publicly available on the internet, and are used daily for acquiring knowledge. Typically, biological data is spread across multiple sources, which has led to heterogeneity and redundancy.
This thesis proposes grouping as one way of computationally managing biological data. A conceptual model for this purpose is presented, which takes properties specific to biological data into account. The model defines sub-tasks and key issues where multiple solutions are possible, and describes which approaches to these have been used in earlier work. Further, an implementation of the model is described, together with test cases which show that the model is indeed useful.
Since the use of ontologies is relatively new in the management of biological data, the main focus of the thesis is on how semantic similarity of ontological annotations can be used for grouping. The results of the test cases show, for example, that the implementation of the model, using Gene Ontology, is capable of producing groups of data entries with similar molecular functions.
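To make the idea of grouping by semantic similarity concrete, the following is a minimal sketch and not the implementation described in the thesis. It assumes a tiny hypothetical ontology (the term names are illustrative, not real Gene Ontology identifiers), uses Jaccard overlap of ancestor sets as one possible similarity measure, and groups entries by single linkage above an assumed threshold. All function names and the threshold value are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy ontology (child -> parents); not real Gene Ontology terms.
PARENTS = {
    "molecular_function": [],
    "binding": ["molecular_function"],
    "catalytic_activity": ["molecular_function"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase_activity": ["catalytic_activity"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of the two terms' ancestor sets (an assumed measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def entry_similarity(terms_a, terms_b):
    """Best-match average between the annotation sets of two data entries."""
    best_a = [max(term_similarity(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_similarity(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


def group_entries(annotations, threshold):
    """Single linkage: entries with similarity >= threshold share a group."""
    group_of = {entry: {entry} for entry in annotations}
    for a, b in combinations(annotations, 2):
        if entry_similarity(annotations[a], annotations[b]) >= threshold:
            if group_of[a] is not group_of[b]:
                merged = group_of[a] | group_of[b]
                for entry in merged:
                    group_of[entry] = merged
    unique = {id(group): group for group in group_of.values()}
    return sorted(sorted(group) for group in unique.values())


# Hypothetical data entries annotated with ontology terms.
ANNOTATIONS = {
    "entry1": ["dna_binding"],
    "entry2": ["rna_binding"],
    "entry3": ["kinase_activity"],
}
groups = group_entries(ANNOTATIONS, threshold=0.5)
print(groups)  # [['entry1', 'entry2'], ['entry3']]
```

In this toy example the two binding entries share two of four ancestor terms (similarity 0.5) and fall into one group, while the kinase entry shares only the root term with them and stays separate. Other similarity measures and grouping strategies could be substituted without changing the overall structure.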
Place, publisher, year, edition, pages
Institutionen för datavetenskap, 2006. 81 p.
Biological Data, Grouping, Ontologies, Semantic Similarity
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:liu:diva-6327
ISRN: LITH-IDA-EX--06/029--SE
OAI: oai:DiVA.org:liu-6327
DiVA: diva2:21763
2006-04-21, Alan Turing, Hus E, Linköpings universitet, Linköping, 13:15