Growing mail volumes are one reason why businesses worldwide are increasingly turning to digital mailrooms. A digital mailroom automatically manages incoming mail, and a technology vital to its success is document classification. One problem in digital mailrooms and document classification is separating the input stream of pages into documents. This thesis investigates existing classification theory and applies it to create an algorithm that solves the document separation problem. The algorithm is evaluated and compared against an existing algorithmic solution on a dataset containing real invoices.