Graph Similarity, Parallel Texts, and Automatic Bilingual Lexicon Acquisition
Linköping University, Department of Mathematics.
2008 (English). Independent thesis, Basic level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

In this master's thesis we present a graph-theoretical method for automatic bilingual lexicon acquisition from parallel texts. We analyze the concept of graph similarity and give an interpretation of the parallel texts connected to the vector space model. We represent the parallel texts by a directed, tripartite graph and use the corresponding adjacency matrix, A, to compute the similarity of the graph. By solving the eigenvalue problem ρS = A S Aᵀ + Aᵀ S A we obtain the self-similarity matrix S and the Perron root ρ. A rank-k approximation of the self-similarity matrix is computed by implementations of the singular value decomposition and the non-negative matrix factorization algorithm GD-CLS. We construct an algorithm to extract the bilingual lexicon from the self-similarity matrix and apply a statistical model to estimate the precision (the correctness) of the translations in the bilingual lexicon. The best result, a precision of about 80 %, is achieved with an application of the vector space model. This compares favorably with the precision of about 60 % found in the literature.
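As a rough illustration of the eigenvalue problem in the abstract, the sketch below computes S and ρ for a tiny graph and then forms a rank-k approximation of S with the singular value decomposition. It is a minimal sketch under stated assumptions and is not taken from the thesis: the function names `self_similarity` and `low_rank_approx`, the three-node path graph, and the use of a shifted power iteration with NumPy are all hypothetical choices made here for brevity (the keywords suggest the thesis itself relies on ARPACK).

```python
import numpy as np


def self_similarity(A, tol=1e-12, max_iter=10_000):
    """Approximate S and rho in  rho * S = A S A^T + A^T S A.

    Assumption (not from the thesis): a shifted power iteration
    S <- S + (A S A^T + A^T S A), normalised in the Frobenius norm.
    The shift avoids oscillation when -rho is also an eigenvalue.
    """
    n = A.shape[0]
    S = np.ones((n, n)) / n              # positive start, Frobenius norm 1
    for _ in range(max_iter):
        S_new = S + A @ S @ A.T + A.T @ S @ A
        S_new /= np.linalg.norm(S_new)   # Frobenius norm for 2-D arrays
        converged = np.linalg.norm(S_new - S) < tol
        S = S_new
        if converged:
            break
    # Rayleigh quotient of the unshifted map gives the Perron root estimate.
    MS = A @ S @ A.T + A.T @ S @ A
    rho = np.sum(S * MS) / np.sum(S * S)
    return S, rho


def low_rank_approx(S, k):
    """Best rank-k approximation of S in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(S)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Hypothetical toy input: the directed path 0 -> 1 -> 2.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])

S, rho = self_similarity(A)
S1 = low_rank_approx(S, 1)
print("Perron root estimate:", rho)
print("Self-similarity matrix:\n", np.round(S, 4))
```

For the large, sparse graphs that parallel texts produce, a dense iteration like this would not scale; a sparse eigensolver and a truncated factorization would be the natural substitutes.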

Place, publisher, year, edition, pages
Matematiska institutionen, 2008. 111 p.
Keyword [en]
Parallel texts, graph similarity, bilingual lexicon, SVD, ARPACK, NMF, OpenMP, text mining
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:liu:diva-11550
ISRN: LITH-MAT-EX--08/03--SE
OAI: oai:DiVA.org:liu-11550
DiVA: diva2:17980
Subject / course
Scientific Computing
Uppsok
Physics, Chemistry, Mathematics
Supervisors
Examiners
Available from: 2008-05-08. Created: 2008-05-08. Last updated: 2012-04-24. Bibliographically approved.

Open Access in DiVA

fulltext (2371 kB), 1025 downloads
File information
File name: FULLTEXT01.pdf
File size: 2371 kB
Checksum (MD5): 34b38862778c2858afd43ea9cf0d678f9d2f10ea0ccd78da3c46e86106e84566d48c58b5
Type: fulltext
Mimetype: application/pdf
