Nowcasting using Microblog Data
Independent thesis Basic level (degree of Bachelor), 10,5 credits / 16 HE credits
Student thesis
Alternative title
Nowcasting med mikrobloggdata (Swedish)
The explosion of information and user-generated content made publicly available through the internet has made it possible to develop new ways of automatically inferring real-world phenomena. Examples include the spread of contagious diseases, earthquake occurrences, rainfall rates, box office results, and stock market fluctuations. To this end, a mathematical framework based on machine learning theory is employed to show how the frequencies of relevant keywords in user-generated content can be used to estimate daily rainfall rates in different regions of Sweden using microblog data.
Microblog data are collected using a microblog crawler. The properties of the data and the data collection methods are both discussed extensively. Three model types are studied for regression: linear and nonlinear parametric models, and a nonparametric Gaussian process model. The relevant parameters of each model are estimated using cross-validation and optimization, and each model is evaluated on independent test data. All three models show promising results for nowcasting rainfall rates.
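The workflow described in the abstract (keyword frequencies as inputs, a regression model tuned by cross-validation, evaluation on held-out test data) can be illustrated with a minimal sketch. This is not the thesis implementation: ridge regression stands in for the linear parametric model as an assumption, the data are synthetic, and all function names and parameter values are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def ridge_predict(X, w, b):
    return X @ w + b

def cv_rmse(X, y, lam, k=5):
    """Mean validation RMSE over k contiguous folds."""
    folds = np.array_split(np.arange(len(y)), k)
    errs = []
    for val in folds:
        train = np.setdiff1d(np.arange(len(y)), val)
        w, b = ridge_fit(X[train], y[train], lam)
        resid = ridge_predict(X[val], w, b) - y[val]
        errs.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(errs))

# Synthetic stand-in data (assumption): rows are days, columns are
# relative frequencies of hypothetical rain-related keywords.
rng = np.random.default_rng(0)
n_days, n_keywords = 200, 6
X = rng.random((n_days, n_keywords))
true_w = np.array([8.0, 5.0, 0.0, 0.0, 2.0, 0.0])
y = X @ true_w + rng.normal(0.0, 0.5, n_days)  # simulated daily rainfall

# Hold out the last 50 days as independent test data.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Choose the regularisation strength by cross-validation on the training set.
lams = [0.01, 0.1, 1.0, 10.0]
best_lam = min(lams, key=lambda lam: cv_rmse(X_train, y_train, lam))

# Refit on all training data and evaluate once on the test set.
w, b = ridge_fit(X_train, y_train, best_lam)
test_rmse = float(np.sqrt(np.mean((ridge_predict(X_test, w, b) - y_test) ** 2)))
print(best_lam, round(test_rmse, 3))
```

The test split is touched only once, after the regularisation strength has been fixed, which mirrors the separation between parameter estimation and evaluation stated in the abstract.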
Place, publisher, year, edition, pages
2012. 52 p.
twitter, statistical learning, machine learning, gaussian process, nowcasting, social media
Probability Theory and Statistics
Computer and Information Science
Identifiers
URN: urn:nbn:se:liu:diva-81755
ISRN: LiTH-ISY-EX-ET--12/0398--SE
OAI: oai:DiVA.org:liu-81755
DiVA: diva2:555895
Subject / course
2012-09-12, Systemet, Linköpings Universitet, Campus Valla, 581 83, Linköping, 13:15 (Swedish)
Uppsok
Physics, Chemistry, Mathematics
Schön, Thomas, Dr.