LiU Electronic Press
Download:
File size:
2448 kb
Format:
application/pdf
Author:
Fredborg, Johan (Linköping University, Department of Computer and Information Science) (Linköping University, The Institute of Technology)
External cooperation:
Fortytwo Telecom
Title:
Spam filter for SMS-traffic
Department:
Linköping University, Department of Computer and Information Science
Linköping University, The Institute of Technology
Publication type:
Student thesis
Language:
English
Level:
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Undergraduate subject:
Computer and information science at the Institute of Technology
Pages:
82
Year of publ.:
2013
URI:
urn:nbn:se:liu:diva-94161
Permanent link:
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-94161
ISRN:
LIU-IDA/LITH-EX-A--13/021-SE
Subject category:
Computer Systems
Keywords (en):
Spam filtering, Machine Learning, C4.5, Support Vector Machine, Dynamic Markov Coding, Naïve Bayes
Abstract (en):

Text messaging via SMS (Short Message Service) is now a huge industry with billions of active users. This large user base has attracted many companies that market themselves through unsolicited messages, much as was previously done through email. The practice is so common that SMS spam has become a plague in many countries.

This report evaluates several established machine learning algorithms to see how well they can be applied to the problem of filtering unsolicited SMS messages. Each filter is evaluated mainly by its accuracy on stored message data. The report also discusses and compares hardware requirements against performance, measured as the number of messages that can be evaluated in a fixed amount of time.

The results of the evaluation show that a decision tree filter is the best choice among the filters evaluated. It has the highest accuracy as well as a message processing rate high enough to be applicable. The decision tree filter, found to be the most suitable for the task in this environment, has been implemented. The accuracy of this new implementation is shown to be as high as that of the implementation used for the evaluation of this filter.

Although the decision tree filter is shown to be the best choice among the filters evaluated, its accuracy turned out not to be high enough to meet the specified requirements. The results are nevertheless promising for further testing in this area using improved methods on the best-performing algorithms.

Presentation:
2013-05-16, Linköping, 15:15 (English)
Supervisor:
Andersson, Olov (Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems) (Linköping University, The Institute of Technology)
Söder, Fredrik
Examiner:
Heintz, Fredrik (Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems) (Linköping University, The Institute of Technology)
Available from:
2013-08-13
Created:
2013-06-17
Last updated:
2013-08-13