LiU Electronic Press
Download:
File size:
10643 kb
Format:
application/pdf
Author:
Alirezaie, Marjan (Linköping University, Department of Computer and Information Science)
Title:
Semantic Analysis Of Multi Meaning Words Using Machine Learning And Knowledge Representation
Department:
Linköping University, Department of Computer and Information Science
Publication type:
Student thesis
Language:
English
Level:
Independent thesis Advanced level (degree of Master (Two Years)), 30 credits / 45 HE credits
Undergraduate subject:
Master's programme in Computer Science
Uppsok:
Technology
Pages:
74
Year of publ.:
2011
URI:
urn:nbn:se:liu:diva-70086
Permanent link:
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-70086
ISRN:
LiU/IDA-EX-A--11/011--SE
Subject category:
Computer Science
SVEP category:
Computer science
Keywords(en) :
Machine Learning, Supervised Learning, Unsupervised Learning
Abstract(en) :

The present thesis addresses machine learning in a domain of natural-language phrases that are names of universities. It describes two approaches to this problem and a software implementation that made it possible to evaluate and compare them.

In general terms, the system's task is to learn to 'understand' the significance of the various components of a university name, such as the city or region where the university is located, the scientific disciplines that are studied there, or the name of a famous person that may be part of the university name. A concrete test of whether the system has acquired this understanding is whether it can compose a plausible university name given some components that should occur in the name.

To achieve this capability, our system learns the structure of the university names available in a given data set, i.e. it acquires a grammar for the microlanguage of university names. One of the challenges is that the system may encounter ambiguities due to multi-meaning words. This problem is addressed using a small ontology that is created during the training phase.

Both domain knowledge and grammatical knowledge are represented using decision trees, which are an efficient method for concept learning. Besides supporting inductive inference, their role is to partition the data set into a hierarchical structure that is used for resolving ambiguities.
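
As an illustration of how a decision tree partitions a data set, the sketch below computes the standard Shannon entropy and the information gain of a candidate split, as in ID3-style learners. It is not the thesis's implementation: the thesis software is written in C and uses modified parameter definitions, whereas this sketch uses Python for brevity and the textbook formulas. The attribute names (`after_of`, `ends_in_ology`), the labels, and the four sample rows are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        # Group the labels by the value the attribute takes in each row.
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


# Hypothetical toy data: one row per word of a university name,
# labelled with the role that the word plays in the name.
rows = [
    {"after_of": True, "ends_in_ology": False},
    {"after_of": True, "ends_in_ology": True},
    {"after_of": False, "ends_in_ology": False},
    {"after_of": False, "ends_in_ology": True},
]
labels = ["place", "discipline", "place", "discipline"]

for attribute in ("after_of", "ends_in_ology"):
    print(attribute, information_gain(rows, labels, attribute))
```

In this toy data, splitting on `ends_in_ology` separates the two roles completely, while splitting on `after_of` leaves both partitions evenly mixed, so a learner that picks the split with the highest gain would choose `ends_in_ology` first.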

The present report also defines some modifications to the definitions of parameters, for example a parameter for entropy, which enable the system to deal with cognitive uncertainties. Our method for automatic syntax acquisition, ADIOS, is an unsupervised learning method. This method is described and discussed here, including a report on the outcome of the tests using our data set.

The software used in this project is implemented in C.

Presentation:
2011-04-04, 11:44 (English)
Supervisor:
Sandewall, Erik (Linköping University, Department of Computer and Information Science, CASL - Cognitive Autonomous Systems Laboratory)
Examiner:
Sandewall, Erik (Linköping University, Department of Computer and Information Science, CASL - Cognitive Autonomous Systems Laboratory)
Available from:
2011-08-25
Created:
2011-08-18
Last updated:
2011-08-25
FILE INFORMATION
Type:
fulltext