Classification of photographed document images based on deep-learning features
Ocean University of China, Qingdao, China.
Ocean University of China, Qingdao, China.
Ocean University of China, Qingdao, China.
Ocean University of China, Qingdao, China.
2017 (English). In: Eighth International Conference on Graphic and Image Processing (ICGIP 2016) / [ed] Tuan D. Pham; Vit Vozenilek; Zhu Zeng, SPIE - International Society for Optical Engineering, 2017, Vol. 10225, UNSP 102250X. Conference paper (Refereed)
Abstract [en]

In this paper, we propose two new problems related to the classification of photographed document images and, based on deep-learning methods, present baseline solutions for both. The first problem is: given some photographed document images, which book do they belong to? The second is: given some photographed document images, what type of book do they belong to? To address these two problems, we apply “AlexNet” to the collected document images. Using “AlexNet” pre-trained on the ImageNet data set directly, we obtain 92.57% accuracy for book-name classification and 93.33% accuracy for book-type classification. After fine-tuning on the training set of photographed document images, the accuracy of book-name classification increases to 95.54% and that of book-type classification to 95.42%. To the best of our knowledge, although many image classification algorithms exist, no previous work has targeted these two challenging problems. In addition, the experiments demonstrate that deep-learning features outperform features extracted with traditional image descriptors on these two problems.

Place, publisher, year, edition, pages
SPIE - International Society for Optical Engineering, 2017. Vol. 10225, UNSP 102250X
Series
Proceedings of SPIE, ISSN 0277-786X, E-ISSN 1996-756X
National Category
Medical Image Processing
Identifiers
URN: urn:nbn:se:liu:diva-134440
DOI: 10.1117/12.2266984
ISI: 000399334200031
ISBN: 9781510609518 (print)
ISBN: 9781510609525 (electronic)
OAI: oai:DiVA.org:liu-134440
DiVA: diva2:1073829
Conference
Eighth International Conference on Graphic and Image Processing (ICGIP 2016), Tokyo, Japan, October 29–31, 2016
Note

Funding agencies: National Natural Science Foundation of China (NSFC) [61403353]; Open Project Program of the National Laboratory of Pattern Recognition (NLPR); Fundamental Research Funds for the Central Universities of China

Available from: 2017-02-13 Created: 2017-02-13 Last updated: 2017-05-05
