An Aligned Resource of Swedish Complex-Simple Sentence Pairs
Rennes, Evelina
Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten. ORCID iD: 0000-0002-0932-7048
2018 (English). In: Proceedings of the Seventh Swedish Language Technology Conference (SLTC), 2018. Conference paper, published paper (Other academic)
Abstract [en]

We present a method for aligning comparable corpora of simple-complex articles at the sentence level. Three methods were tested: Average Alignment (AA), Maximum Alignment (MA), and Hungarian Alignment (HA). To evaluate the algorithms and find the optimal combination of parameters, a dataset of manually annotated sentences was constructed. The algorithms were evaluated against the manually annotated dataset, and the best-performing algorithm proved to be the MA algorithm, which resulted in a corpus comprising 59,513 aligned sentence pairs, of which 17,653 were unique sentences.

Place, publisher, year, edition, pages
2018.
Identifiers
URN: urn:nbn:se:liu:diva-169794; OAI: oai:DiVA.org:liu-169794; DiVA, id: diva2:1468938
Conference
The Seventh Swedish Language Technology Conference (SLTC-18), Stockholm, Sweden, 7-9 November 2018
Available from: 2020-09-18. Created: 2020-09-18. Last updated: 2025-02-07

Open Access in DiVA

No full text in DiVA

Person

Rennes, Evelina
