Canonical correlation analysis for data reduction in data mining applied to predictive models for breast cancer recurrence
Linköping University, The Institute of Technology. Linköping University, Department of Biomedical Engineering, Medical Informatics.
2005 (English). In: The XIXth International Congress of the European Federation for Medical Informatics, 2005. Amsterdam: IOS Press, 2005, 175-180 p. Conference paper, Published paper (Refereed)
Abstract [en]

Data mining methods can be used to extract specific medical knowledge, such as important predictors of breast cancer recurrence, from pertinent data material. However, when the data material contains a very large number of variables, it is first necessary to identify and select the important ones. In this study we present a preprocessing method for selecting important variables in a dataset prior to building a predictive model. Data from 5787 female patients in the dataset were analysed. To cover more predictors and obtain a better assessment of the outcomes, data were retrieved from three different registers: the regional breast cancer, tumour markers, and cause of death registers. After information about selected predictors and outcomes had been retrieved from the registers, the raw data were cleaned by applying a set of logical rules. Thereafter, domain experts selected predictors assumed to be important for recurrence of breast cancer. Canonical Correlation Analysis (CCA) was then applied as a dimension reduction technique to preserve the character of the original data. An Artificial Neural Network (ANN) was applied to the resulting dataset for two different analyses with the same settings. Performance of the predictive models was confirmed by ten-fold cross-validation. The results showed an increase in prediction accuracy and a reduction in the mean absolute error.
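The evaluation step named in the abstract, ten-fold cross-validation reporting accuracy and mean absolute error, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`k_fold_indices`, `cross_validate`), the midpoint-threshold model standing in for the ANN, and the synthetic data are all assumptions made for illustration only.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Return (accuracy, mean absolute error) pooled over all k held-out folds."""
    n = len(labels)
    n_correct = 0
    abs_error = 0.0
    for held_out in k_fold_indices(n, k, seed):
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        # Fit on the other k-1 folds only, so held-out cases stay unseen.
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            prob = predict(model, features[i])  # predicted probability of class 1
            n_correct += int((prob >= 0.5) == (labels[i] == 1))
            abs_error += abs(labels[i] - prob)
    return n_correct / n, abs_error / n


# Hypothetical stand-in model (NOT an ANN): a threshold halfway between class means.
def fit_midpoint(xs, ys):
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2


def predict_midpoint(threshold, x):
    return 1.0 if x >= threshold else 0.0


# Synthetic, well-separated data; it has no relation to the registers in the study.
toy_x = [float(i) for i in range(20)] + [float(i) for i in range(100, 120)]
toy_y = [0] * 20 + [1] * 20
accuracy, mae = cross_validate(toy_x, toy_y, fit_midpoint, predict_midpoint)
```

Pooling the held-out predictions from all ten folds gives one accuracy and one mean absolute error per model, so two analyses run with the same settings can be compared on the same footing.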

Place, publisher, year, edition, pages
Amsterdam: IOS Press, 2005. 175-180 p.
National Category
Medical and Health Sciences
Identifiers
URN: urn:nbn:se:liu:diva-29180
ISI: 000273025900029
Local ID: 14453
OAI: oai:DiVA.org:liu-29180
DiVA: diva2:249992
Available from: 2009-10-09 Created: 2009-10-09 Last updated: 2010-08-11

Open Access in DiVA

No full text

Authors

Razavi, Amir Reza; Gill, Hans; Åhlfeldt, Hans; Shahsavar, Nosrat
