Spam filter for SMS-traffic
Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
2013 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

Text messaging through SMS (Short Message Service) is now a huge industry with billions of active users. This large user base has attracted many companies that market themselves through unsolicited messages, in the same way as was previously done through email. The practice is so common that SMS spam has become a plague in many countries.

This report evaluates several established machine learning algorithms to see how well they can be applied to the problem of filtering unsolicited SMS messages. Each filter is evaluated mainly by its accuracy on stored message data. The report also discusses and compares hardware requirements against performance, measured as the number of messages that can be evaluated in a fixed amount of time.
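
The two measures named above, accuracy on stored messages and messages processed per unit of time, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the keyword-based `keyword_filter` is a hypothetical stand-in and not one of the algorithms evaluated in the thesis, and the five sample messages are invented rather than taken from the thesis data.

```python
import time
from typing import Callable, List, Tuple

Message = Tuple[str, bool]  # (text, is_spam); labels here are invented


def keyword_filter(text: str) -> bool:
    """Hypothetical stand-in classifier: flags messages with trigger words."""
    triggers = ("free", "winner", "prize")
    lowered = text.lower()
    return any(word in lowered for word in triggers)


def evaluate(classify: Callable[[str], bool],
             messages: List[Message]) -> Tuple[float, float]:
    """Return (accuracy, messages per second) for a filter on labelled data."""
    start = time.perf_counter()
    correct = sum(1 for text, is_spam in messages if classify(text) == is_spam)
    elapsed = time.perf_counter() - start
    accuracy = correct / len(messages)
    throughput = len(messages) / elapsed if elapsed > 0 else float("inf")
    return accuracy, throughput


# Invented sample data, not from the thesis.
SAMPLE: List[Message] = [
    ("You are a WINNER, claim your free prize now", True),
    ("Are we still meeting at 15:15?", False),
    ("Free entry to our weekly draw", True),
    ("Call me when you get home", False),
    ("Cheap loans approved today", True),  # no trigger word, so it is missed
]

accuracy, rate = evaluate(keyword_filter, SAMPLE)
print(f"accuracy={accuracy:.2f}, messages/s={rate:.0f}")
```

Any of the evaluated filter types could be passed as `classify` in place of the stand-in, so the same harness compares accuracy and processing rate side by side.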

The results of the evaluation show that a decision tree filter is the best choice among the filters evaluated. It has the highest accuracy and a message processing rate high enough to be applicable. The decision tree filter, found to be the most suitable for the task in this environment, has been implemented. The accuracy of this new implementation is shown to be as high as that of the implementation used for the evaluation.

Although the decision tree filter is shown to be the best choice among the filters evaluated, its accuracy turned out not to be high enough to meet the specified requirements. The results are nevertheless promising for further testing in this area, using improved methods on the best-performing algorithms.

Place, publisher, year, edition, pages
2013. p. 82
Keywords [en]
Spam filtering, Machine Learning, C4.5, Support Vector Machine, Dynamic Markov Coding, Naïve Bayes
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:liu:diva-94161
ISRN: LIU-IDA/LITH-EX-A--13/021-SE
OAI: oai:DiVA.org:liu-94161
DiVA, id: diva2:629597
External cooperation
Fortytwo Telecom
Subject / course
Computer and information science at the Institute of Technology
Presentation
2013-05-16, Linköping, 15:15 (English)
Supervisors
Examiners
Available from: 2013-08-13 Created: 2013-06-17 Last updated: 2013-08-13 Bibliographically approved

Open Access in DiVA

fulltext (2448 kB), 418 downloads
File information
File name FULLTEXT01.pdf
File size 2448 kB
Checksum SHA-512
f7c394e6048349716294543aaa2150991043549e2024da290cda0479ee60c1584aeeebe30143e47172f05c36fd746170c23b0dccaad100103de3b868be369de9
Type fulltext
Mimetype application/pdf

Search in DiVA

By author/editor
Fredborg, Johan
By organisation
Department of Computer and Information Science
The Institute of Technology
Computer Systems

Total: 418 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 596 hits