Web user clustering and Web prefetching using Random Indexing with weight functions
2012 (English) In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 33, no 1, 89-115 p. Article in journal (Refereed) Published
Users of a Web site usually act on their interests by clicking on or visiting Web pages, and these actions are traced in access log files. Clustering Web user access patterns may capture common user interests in a Web site and, in turn, build user profiles for advanced Web applications such as Web caching and prefetching. Conventional Web usage mining techniques for clustering Web user sessions can discover usage patterns directly, but cannot identify the latent factors or hidden relationships in users' navigational behaviour. In this paper, we propose an approach based on a vector space model, called Random Indexing, to discover such intrinsic characteristics of Web users' activities. The underlying factors are then utilised for clustering individual user navigational patterns and creating common user profiles. The clustering results are used to predict and prefetch Web requests for grouped users. We demonstrate the usability and superiority of the proposed Web user clustering approach through experiments on a real Web log file. The clustering and prefetching tasks are evaluated by comparison with previous studies, and the results show better clustering performance and higher prefetching accuracy.
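The core idea in the abstract, representing each user by a vector accumulated from random index vectors of the pages visited, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the dimensionality, the number of non-zero entries, the constant placeholder `weight` function, the session data, and all function names are assumptions chosen for illustration, and the clustering and prefetching steps are omitted.

```python
import math
import random


def make_index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * dim
    positions = rng.sample(range(dim), nonzeros)
    for i, pos in enumerate(positions):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def build_user_vectors(sessions, dim=256, nonzeros=8, seed=0,
                       weight=lambda page: 1.0):
    """Sum the (weighted) index vectors of the pages each user visited.

    `weight` is a hypothetical stand-in for a weight function; the
    constant default treats every page visit equally.
    """
    rng = random.Random(seed)
    index_vectors = {}  # page -> random index vector
    user_vectors = {}   # user -> accumulated context vector
    for user, pages in sessions.items():
        context = [0.0] * dim
        for page in pages:
            if page not in index_vectors:
                index_vectors[page] = make_index_vector(dim, nonzeros, rng)
            w = weight(page)
            for i, value in enumerate(index_vectors[page]):
                context[i] += w * value
        user_vectors[user] = context
    return user_vectors


def cosine(a, b):
    """Cosine similarity between two dense vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up sessions: u1 and u2 share pages, u3 visits different ones.
sessions = {
    "u1": ["/home", "/news", "/sports"],
    "u2": ["/home", "/news", "/sports", "/news"],
    "u3": ["/shop", "/cart", "/checkout"],
}
vectors = build_user_vectors(sessions)
sim_12 = cosine(vectors["u1"], vectors["u2"])
sim_13 = cosine(vectors["u1"], vectors["u3"])
```

Users with overlapping navigation end up with similar vectors, so a standard clustering algorithm applied to `vectors` would tend to group `u1` with `u2` rather than with `u3`.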
Place, publisher, year, edition, pages
Springer Verlag (Germany), 2012. Vol. 33, no 1, 89-115 p.
Web user clustering, Random Indexing, Weight functions, Web prefetching
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:liu:diva-85197
DOI: 10.1007/s10115-011-0453-x
ISI: 000309587800004
OAI: oai:DiVA.org:liu-85197
DiVA: diva2:566678
Funding Agencies: Program for New Century Excellent Talents in University of the Ministry of Education of China (NCET-10-0239); Fok Ying-Tong Education Foundation for Young Teachers in the Higher Education Institutions of China (121062); Specialized Research Fund for the Doctoral Program of Higher Education (20100005110002); National Natural Science Foundation of China (60805043); National Key Technologies R&D Program (2009BAH42B02); Santa Anna IT Research Institute
2012-11-09 2012-11-09 2014-01-13