Automatic Speech Act Classification: Bootstrapping an embedding-based classifier from a rule-based classifier for Swedish sentences
2024 (English) Independent thesis Basic level (degree of Bachelor), 12 credits / 18 HE credits
Student thesis
Abstract [en]
When we speak, we carry out social actions that are mediated through spoken words. For example, through speech, we may ask for directions. These spoken actions are referred to as speech acts. We humans unconsciously understand and categorize speech acts all the time. But how can we make computers do the same?
The objective of this thesis was to develop an automatic classifier for speech acts in Swedish sentences. To do this, I first annotated a test set of speech acts, following the MATTER development cycle. The sentences in this test set originate from online discussion forums. I then developed and trained a rule-based classifier using a subsample of these sentences. Finally, this rule-based classifier was used to automatically annotate a large training set, which was then used to train a neural network for classifying speech acts, essentially bootstrapping the network from the rule-based classifier. This neural network uses SBERT to compute sentence embeddings and then classifies each sentence's speech act based on its embedding.
The results indicate that using the MATTER cycle is a feasible approach for creating a test set for speech acts. Furthermore, the results show that the embedding-based classifier outperforms the rule-based classifier, and that the rule-based classifier in turn vastly outperforms the baseline. However, it was not possible to conclude whether the embedding-based classifier's higher performance was due to the larger amount of training data or to its different architecture.
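The bootstrapping idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the rules, the label names (`question`, `expressive`, `assertion`), and the example sentences are hypothetical, a hashed character-trigram vector stands in for the SBERT embedding, and a nearest-centroid classifier stands in for the neural network, so that the example runs with the standard library alone.

```python
import math
import zlib
from collections import Counter

DIM = 256  # size of the stand-in embedding vector (assumption)


def rule_based_label(sentence):
    """Hypothetical hand-written rules; the thesis's actual rules are not shown here."""
    s = sentence.strip()
    if s.endswith("?"):
        return "question"
    if s.endswith("!"):
        return "expressive"
    return "assertion"


def embed(sentence):
    """Stand-in for an SBERT embedding: L2-normalised hashed character trigrams."""
    vec = [0.0] * DIM
    text = f"  {sentence.lower()}  "
    for i in range(len(text) - 2):
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        idx = zlib.crc32(text[i:i + 3].encode("utf-8")) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train_centroids(unlabeled):
    """Label sentences with the rules, then average the embeddings per label."""
    sums = {}
    counts = Counter()
    for sentence in unlabeled:
        label = rule_based_label(sentence)  # automatic (weak) annotation
        acc = sums.setdefault(label, [0.0] * DIM)
        for i, v in enumerate(embed(sentence)):
            acc[i] += v
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(sentence, centroids):
    """Return the label whose centroid has the largest dot product with the embedding."""
    vec = embed(sentence)
    return max(
        centroids,
        key=lambda label: sum(a * b for a, b in zip(vec, centroids[label])),
    )


# Hypothetical unlabeled sentences standing in for the large training set.
UNLABELED = [
    "Var ligger stationen?",
    "Vad kostar biljetten?",
    "Vilken härlig dag!",
    "Vad fint det blev!",
    "Tåget går klockan åtta.",
    "Jag bor i Linköping.",
]

centroids = train_centroids(UNLABELED)
# The embedding-based classifier can be applied even where the surface rule
# (final punctuation) gives no signal, e.g. a question typed without "?".
print(classify("Hur mycket är klockan", centroids))
```

The second-stage classifier only ever sees labels produced by the rules, so any gain over the rules has to come from generalising across similar embeddings. In a closer approximation of the thesis setup, `embed` would be replaced by an SBERT model and the centroid step by a trained neural network.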
Place, publisher, year, edition, pages
2024.
Keywords [en]
NLP, Speech Act, SBERT, Rule-Based Classification, Semi-supervised Learning
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:liu:diva-205702
ISRN: LIU-IDA/KOGVET-G--24/020--SE
OAI: oai:DiVA.org:liu-205702
DiVA, id: diva2:1880018
Subject / course
Cognitive science
Supervisors
Examiners
2025-05-08 2024-06-30 2025-05-08 Bibliographically approved