liu.se: Search for publications in DiVA
  • 1.
    Abramian, David
    et al.
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Blystad, Ida
    Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV). Region Östergötland, Center for Diagnostics, Department of Radiology in Linköping. Linköping University, Department of Health, Medicine and Caring Sciences, Division of Diagnostics and Specialist Medicine.
    Eklund, Anders
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Evaluation of inverse treatment planning for gamma knife radiosurgery using fMRI brain activation maps as organs at risk. 2023. In: Medical physics (Lancaster), ISSN 0094-2405, Vol. 50, no 9, p. 5297-5311. Article in journal (Refereed)
    Abstract [en]

    Background: Stereotactic radiosurgery (SRS) can be an effective primary or adjuvant treatment option for intracranial tumors. However, it carries risks of various radiation toxicities, which can lead to functional deficits for the patients. Current inverse planning algorithms for SRS provide an efficient way for sparing organs at risk (OARs) by setting maximum radiation dose constraints in the treatment planning process.

    Purpose: We propose using activation maps from functional MRI (fMRI) to map the eloquent regions of the brain and define functional OARs (fOARs) for Gamma Knife SRS treatment planning.

    Methods: We implemented a pipeline for analyzing patient fMRI data, generating fOARs from the resulting activation maps, and loading them onto the GammaPlan treatment planning software. We used the Lightning inverse planner to generate multiple treatment plans from open MRI data of five subjects, and evaluated the effects of incorporating the proposed fOARs.

    Results: The Lightning optimizer designs treatment plans with high conformity to the specified parameters. Setting maximum dose constraints on fOARs successfully limits the radiation dose incident on them, but can have a negative impact on treatment plan quality metrics. By masking out fOAR voxels surrounding the tumor target it is possible to achieve high quality treatment plans while controlling the radiation dose on fOARs.

    Conclusions: The proposed method can effectively reduce the radiation dose incident on the eloquent brain areas during Gamma Knife SRS of brain tumors.

  • 2.
    Abramian, David
    et al.
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering.
    Eklund, Anders
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    REFACING: RECONSTRUCTING ANONYMIZED FACIAL FEATURES USING GANS. 2019. In: 2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), IEEE, 2019, p. 1104-1108. Conference paper (Refereed)
    Abstract [en]

    Anonymization of medical images is necessary for protecting the identity of the test subjects, and is therefore an essential step in data sharing. However, recent developments in deep learning may raise the bar on the amount of distortion that needs to be applied to guarantee anonymity. To test such possibilities, we have applied the novel CycleGAN unsupervised image-to-image translation framework on sagittal slices of T1 MR images, in order to reconstruct facial features from anonymized data. We applied the CycleGAN framework on both face-blurred and face-removed images. Our results show that face blurring may not provide adequate protection against malicious attempts at identifying the subjects, while face removal provides more robust anonymization, but is still partially reversible.

  • 3.
    Abramian, David
    et al.
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Larsson, Martin
    Centre of Mathematical Sciences, Lund University, Lund, Sweden.
    Eklund, Anders
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Aganj, Iman
    Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, USA; Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, USA.
    Westin, Carl-Fredrik
    Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, USA.
    Behjat, Hamid
    Department of Biomedical Engineering, Lund University, Lund, Sweden; Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, USA; Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, USA.
    Diffusion-Informed Spatial Smoothing of fMRI Data in White Matter Using Spectral Graph Filters. 2021. In: NeuroImage, ISSN 1053-8119, E-ISSN 1095-9572, Vol. 237, article id 118095. Article in journal (Refereed)
    Abstract [en]

    Brain activation mapping using functional magnetic resonance imaging (fMRI) has been extensively studied in brain gray matter (GM), whereas it has been largely disregarded for probing white matter (WM). This unbalanced treatment has been in part due to controversies in relation to the nature of the blood oxygenation level-dependent (BOLD) contrast in WM and its detectability. However, an accumulating body of studies has provided solid evidence of the functional significance of the BOLD signal in WM and has revealed that it exhibits anisotropic spatio-temporal correlations and structure-specific fluctuations concomitant with those of the cortical BOLD signal. In this work, we present an anisotropic spatial filtering scheme for smoothing fMRI data in WM that accounts for known spatial constraints on the BOLD signal in WM. In particular, the spatial correlation structure of the BOLD signal in WM is highly anisotropic and closely linked to local axonal structure in terms of shape and orientation, suggesting that isotropic Gaussian filters conventionally used for smoothing fMRI data are inadequate for denoising the BOLD signal in WM. The fundamental element in the proposed method is a graph-based description of WM that encodes the underlying anisotropy observed across WM, derived from diffusion-weighted MRI data. Based on this representation, and leveraging graph signal processing principles, we design subject-specific spatial filters that adapt to a subject’s unique WM structure at each position in the WM that they are applied at. We use the proposed filters to spatially smooth fMRI data in WM, as an alternative to the conventional practice of using isotropic Gaussian filters. We test the proposed filtering approach on two sets of simulated phantoms, showcasing its greater sensitivity and specificity for the detection of slender anisotropic activations, compared to that achieved with isotropic Gaussian filters. We also present WM activation mapping results on the Human Connectome Project’s 100-unrelated subject dataset, across seven functional tasks, showing that the proposed method enables the detection of streamline-like activations within axonal bundles.

  • 4.
    Abramian, David
    et al.
    Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering.
    Larsson, Martin
    Centre for Mathematical Sciences, Lund University, Sweden.
    Eklund, Anders
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Behjat, Hamid
    Department of Biomedical Engineering, Lund University, Sweden.
    Improved Functional MRI Activation Mapping in White Matter Through Diffusion-Adapted Spatial Filtering. 2020. In: ISBI 2020: IEEE International Symposium on Biomedical Imaging, IEEE, 2020. Conference paper (Refereed)
    Abstract [en]

    Brain activation mapping using functional MRI (fMRI) based on blood oxygenation level-dependent (BOLD) contrast has been conventionally focused on probing gray matter, the BOLD contrast in white matter having been generally disregarded. Recent results have provided evidence of the functional significance of the white matter BOLD signal, showing at the same time that its correlation structure is highly anisotropic, and related to the diffusion tensor in shape and orientation. This evidence suggests that conventional isotropic Gaussian filters are inadequate for denoising white matter fMRI data, since they are incapable of adapting to the complex anisotropic domain of white matter axonal connections. In this paper we explore a graph-based description of the white matter developed from diffusion MRI data, which is capable of encoding the anisotropy of the domain. Based on this representation we design localized spatial filters that adapt to white matter structure by leveraging graph signal processing principles. The performance of the proposed filtering technique is evaluated on semi-synthetic data, where it shows potential for greater sensitivity and specificity in white matter activation mapping, compared to isotropic filtering.

  • 5.
    Abramian, David
    et al.
    Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Faculty of Science & Engineering. Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering.
    Sidén, Per
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Knutsson, Hans
    Linköping University, Faculty of Science & Engineering. Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Villani, Mattias
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV). Department of Statistics, Stockholm University.
    Eklund, Anders
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering. Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Anatomically Informed Bayesian Spatial Priors for FMRI Analysis. 2020. In: ISBI 2020: IEEE International Symposium on Biomedical Imaging / [ed] IEEE, IEEE, 2020. Conference paper (Refereed)
    Abstract [en]

    Existing Bayesian spatial priors for functional magnetic resonance imaging (fMRI) data correspond to stationary isotropic smoothing filters that may oversmooth at anatomical boundaries. We propose two anatomically informed Bayesian spatial models for fMRI data with local smoothing in each voxel based on a tensor field estimated from a T1-weighted anatomical image. We show that our anatomically informed Bayesian spatial models result in posterior probability maps that follow the anatomical structure.

  • 6.
    Ahmadian, Amirhossein
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Ding, Yifan
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
    Eilertsen, Gabriel
    Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Lindsten, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Unsupervised Novelty Detection in Pretrained Representation Space with Locally Adapted Likelihood Ratio. 2024. In: International Conference on Artificial Intelligence and Statistics 2024, Proceedings of Machine Learning Research, 2024. Conference paper (Refereed)
  • 7.
    Ahmadian, Amirhossein
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Lindsten, Fredrik
    Linköping University, Department of Electrical Engineering, Automatic Control. Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Enhancing Representation Learning with Deep Classifiers in Presence of Shortcut. 2023. In: Proceedings of IEEE ICASSP 2023, 2023. Conference paper (Refereed)
    Abstract [en]

    A deep neural classifier trained on an upstream task can be leveraged to boost the performance of another classifier in a related downstream task through the representations learned in hidden layers. However, the presence of shortcuts (easy-to-learn features) in the upstream task can considerably impair the versatility of intermediate representations and, in turn, the downstream performance. In this paper, we propose a method to improve the representations learned by deep neural image classifiers in spite of a shortcut in upstream data. In our method, the upstream classification objective is augmented with a type of adversarial training where an auxiliary network, a so-called lens, fools the classifier by exploiting the shortcut in reconstructing images. Empirical comparisons in self-supervised and transfer learning problems with three shortcut-biased datasets suggest the advantages of our method in terms of downstream performance and/or training time.

  • 8.
    Ahmadian, Amirhossein
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Lindsten, Fredrik
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Likelihood-free Out-of-Distribution Detection with Invertible Generative Models. 2021. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021), International Joint Conferences on Artificial Intelligence (IJCAI), 2021, p. 2119-2125. Conference paper (Refereed)
    Abstract [en]

    Likelihood of generative models has been used traditionally as a score to detect atypical (Out-of-Distribution, OOD) inputs. However, several recent studies have found this approach to be highly unreliable, even with invertible generative models, where computing the likelihood is feasible. In this paper, we present a different framework for generative model-based OOD detection that employs the model in constructing a new representation space, instead of using it directly in computing typicality scores, where it is emphasized that the score function should be interpretable as the similarity between the input and training data in the new space. In practice, with a focus on invertible models, we propose to extract low-dimensional features (statistics) based on the model encoder and complexity of input images, and then use a One-Class SVM to score the data. Contrary to recently proposed OOD detection methods for generative models, our method does not require computing likelihood values. Consequently, it is much faster when using invertible models with iteratively approximated likelihood (e.g. iResNet), while it still has a performance competitive with other related methods.

  • 9.
    Aitken, Colin
    et al.
    Univ Edinburgh, Scotland.
    Nordgaard, Anders
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Swedish Police Author, Natl Forens Ctr, SE-58194 Linkoping, Sweden.
    The Roles of Participants Differing Background Information in the Evaluation of Evidence. 2018. In: Journal of Forensic Sciences, ISSN 0022-1198, E-ISSN 1556-4029, Vol. 63, no 2, p. 648-649. Article in journal (Other academic)
    Abstract [en]

    n/a

  • 10.
    Akbar, Muhammad Usman
    et al.
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Larsson, Måns
    Eigenvision, Malmö, Sweden.
    Blystad, Ida
    Linköping University, Faculty of Medicine and Health Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV). Region Östergötland, Center for Diagnostics, Department of Radiology in Linköping. Linköping University, Department of Health, Medicine and Caring Sciences, Division of Diagnostics and Specialist Medicine.
    Eklund, Anders
    Linköping University, Department of Biomedical Engineering, Division of Biomedical Engineering. Linköping University, Faculty of Science & Engineering. Linköping University, Center for Medical Image Science and Visualization (CMIV). Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Brain tumor segmentation using synthetic MR images - A comparison of GANs and diffusion models. 2024. In: Scientific Data, E-ISSN 2052-4463, Vol. 11, no 1, article id 259. Article in journal (Refereed)
    Abstract [en]

    Large annotated datasets are required for training deep learning models, but in medical imaging data sharing is often complicated due to ethics, anonymization and data protection legislation. Generative AI models, such as generative adversarial networks (GANs) and diffusion models, can today produce very realistic synthetic images, and can potentially facilitate data sharing. However, in order to share synthetic medical images it must first be demonstrated that they can be used for training different networks with acceptable performance. Here, we therefore comprehensively evaluate four GANs (progressive GAN, StyleGAN 1–3) and a diffusion model for the task of brain tumor segmentation (using two segmentation networks, U-Net and a Swin transformer). Our results show that segmentation networks trained on synthetic images reach Dice scores that are 80%–90% of Dice scores when training with real images, but that memorization of the training images can be a problem for diffusion models if the original dataset is too small. Our conclusion is that sharing synthetic medical images is a viable option to sharing real images, but that further work is required. The trained generative models and the generated synthetic images are shared on AIDA data hub.
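
    The Dice score used above to compare segmentation networks can be illustrated with a minimal sketch. It assumes binary masks given as flat sequences of 0/1; the function name `dice_score` and the convention for two empty masks are illustrative assumptions, not taken from the paper.

```python
def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Count voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / total


# Hypothetical toy masks: 3 foreground voxels each, 2 of them shared.
truth = [0, 1, 1, 1, 0, 0]
pred = [0, 1, 1, 0, 1, 0]
print(dice_score(pred, truth))
```

    A ratio such as the reported 80%-90% would then be the Dice score obtained after training on synthetic images divided by the Dice score obtained after training on real images.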

  • 11.
    Alenlöv, Johan
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Doucet, Arnaud
    Univ Oxford, England.
    Lindsten, Fredrik
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Pseudo-Marginal Hamiltonian Monte Carlo. 2021. In: Journal of machine learning research, ISSN 1532-4435, E-ISSN 1533-7928, Vol. 22. Article in journal (Refereed)
    Abstract [en]

    Bayesian inference in the presence of an intractable likelihood function is computationally challenging. When following a Markov chain Monte Carlo (MCMC) approach to approximate the posterior distribution in this context, one typically either uses MCMC schemes which target the joint posterior of the parameters and some auxiliary latent variables, or pseudo-marginal Metropolis-Hastings (MH) schemes. The latter mimic a MH algorithm targeting the marginal posterior of the parameters by approximating unbiasedly the intractable likelihood. However, in scenarios where the parameters and auxiliary variables are strongly correlated under the posterior and/or this posterior is multimodal, Gibbs sampling or Hamiltonian Monte Carlo (HMC) will perform poorly and the pseudo-marginal MH algorithm, as any other MH scheme, will be inefficient for high-dimensional parameters. We propose here an original MCMC algorithm, termed pseudo-marginal HMC, which combines the advantages of both HMC and pseudo-marginal schemes. Specifically, the PM-HMC method is controlled by a precision parameter N, controlling the approximation of the likelihood and, for any N, it samples the marginal posterior of the parameters. Additionally, as N tends to infinity, its sample trajectories and acceptance probability converge to those of an ideal, but intractable, HMC algorithm which would have access to the intractable likelihood and its gradient. We demonstrate through experiments that PM-HMC can outperform significantly both standard HMC and pseudo-marginal MH schemes.

  • 12.
    Alfredsson, Joakim
    et al.
    Linköping University, Department of Health, Medicine and Caring Sciences, Division of Diagnostics and Specialist Medicine. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Heart Center, Department of Cardiology in Linköping.
    Wegmann, Bertil
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Holmström, Margareta
    Linköping University, Department of Health, Medicine and Caring Sciences, Division of Diagnostics and Specialist Medicine. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Local Health Care Services in Central Östergötland, Department of Acute Internal Medicine and Geriatrics.
    Östgren, Carl Johan
    Linköping University, Department of Health, Medicine and Caring Sciences, Division of Prevention, Rehabilitation and Community Medicine. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Primary Care Center, Primary Health Care Center Ekholmen. Linköping University, Center for Medical Image Science and Visualization (CMIV).
    Larsson, A.
    Uppsala Univ, Sweden.
    Lindahl, Tomas
    Linköping University, Department of Biomedical and Clinical Sciences, Division of Clinical Chemistry and Pharmacology. Linköping University, Faculty of Medicine and Health Sciences. Region Östergötland, Center for Diagnostics, Department of Clinical Chemistry.
    Coagulation factor XI relative to established cardiovascular risk factors and atherosclerosis, in a large middle-aged population. 2024. In: Thrombosis Research, ISSN 0049-3848, E-ISSN 1879-2472, Vol. 241, article id 109069. Article in journal (Other academic)
    The full text will be freely available from 2025-06-18 00:00
  • 13.
    Alhasan, Ahmed
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Generating Geospatial Trip Data Using Deep Neural Networks. 2022. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Synthetic data provides a good alternative to real data when the latter is not sufficient or limited by privacy requirements. In spatio-temporal applications, generating synthetic data is generally more complex due to the existence of both spatial and temporal dependencies. Recently, with the advent of deep generative modeling such as Generative Adversarial Networks (GAN), synthetic data generation has seen a lot of development and success. This thesis uses a GAN model based on two Recurrent Neural Networks (RNN) as a generator and a discriminator to generate new trip data for transport vehicles, where the data is represented as a time series. This model is compared with a standalone RNN network that does not have an adversarial counterpart. The result shows that the RNN model (without the adversarial counterpart) performed better than the GAN model due to the difficulty that involves training and tuning GAN models.

  • 14.
    Al-Mter, Yusur
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Automatic Prediction of Human Age based on Heart Rate Variability Analysis using Feature-Based Methods. 2020. Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Heart rate variability (HRV) is the time variation between adjacent heartbeats. This variation is regulated by the autonomic nervous system (ANS) and its two branches, the sympathetic and parasympathetic nervous system. HRV is considered an essential clinical tool to estimate the imbalance between the two branches, hence as an indicator of age and cardiac-related events. This thesis focuses on the ECG recordings during nocturnal rest to estimate the influence of HRV in predicting the age decade of healthy individuals. Time and frequency domains, as well as non-linear methods, are explored to extract the HRV features. Three feature-based methods (support vector machine (SVM), random forest, and extreme gradient boosting (XGBoost)) were employed, and the overall test accuracy achieved in capturing the actual class was relatively low (lower than 30%). The SVM classifier had the lowest performance, while random forest and XGBoost performed slightly better. Although the difference is negligible, the random forest had the highest test accuracy, approximately 29%, using a subset of ten optimal HRV features. Furthermore, to validate the findings, the original dataset was shuffled and used as a test set, and the performance was compared to other related research outputs.
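
    As an illustration of what time-domain HRV features can look like, the sketch below computes two widely used ones, SDNN and RMSSD, from a list of RR intervals in milliseconds. The abstract does not say which features the thesis actually extracted, so the choice of these two, the function names, and the toy data are all assumptions.

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(statistics.fmean([d * d for d in diffs]))


# Hypothetical RR intervals in milliseconds, not data from the thesis.
rr = [800, 810, 790, 805, 795]
features = {"sdnn": sdnn(rr), "rmssd": rmssd(rr)}
print(features)
```

    A feature vector like `features` would then be passed to a classifier such as a random forest to predict the age decade.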

  • 15.
    Alsaadi, Sarah
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Wänström, Linda
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Sjögren, Björn
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning.
    Bjärehed, Marlene
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning.
    Thornberg, Robert
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Collective moral disengagement and school bullying: An initial validation study of the Swedish scale version. 2016. Conference paper (Refereed)
  • 16.
    Alsaadi, Sarah
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Wänström, Linda
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Thornberg, Robert
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Sjögren, Björn
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Bjärehed, Marlene
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Forsberg, Camilla
    Linköping University, Department of Behavioural Sciences and Learning, Education, Teaching and Learning. Linköping University, Faculty of Educational Sciences.
    Collective moral disengagement at school: A validation of a scale for Swedish children. 2018. Conference paper (Other academic)
    Abstract [en]

    The purpose of this study was to evaluate a recently developed classroom collective moral disengagement scale (CMD). The 18-item scale was evaluated on a sample of 1626 fourth grade students in Sweden. Through confirmatory factor analysis, the unidimensional structure of the scale was verified, and the internal consistency was good. The scale is related to individual moral disengagement and to bullying behavior both on an individual level, which supports the criterion validity of the scale, and on class level, which supports the construct validity of the scale. Multigroup analyses demonstrated measurement invariance across gender. These results indicate that the scale can be used in studies on CMD, and girls’ and boys’ mean scores may be compared.

  • 17.
    Alsén, Simon
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Åkesson, Andreas
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Jämförelse av metoder för hantering av partiellt bortfall vid logistisk regressionsanalys2021Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    Missing data is a common problem in research and can lead to loss of statistical power and bias in parameter estimates. Numerous methods have been developed for dealing with missing data, and the aim of this thesis is to evaluate how a number of these methods affect the parameter estimates in a logistic regression model, and whether these methods are suitable for the data in question. The methods included in this study are complete case analysis, MICE and missForest.

    For the purpose of evaluating the methods, missing values in varying proportions and under different missing mechanisms are generated in a real dataset consisting of 2987 observations and five variables. The performance of the methods is assessed by normalized root mean squared error (NRMSE), and by comparing the regression coefficients estimated using the original, true data set with the regression coefficients estimated using imputed data sets.

    missForest results in the lowest NRMSE. In the subsequent logistic regression analysis, however, MICE results in considerably lower bias than missForest.

  • 18.
    Anand, Abhijeet
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Point clouds in the application of Bin Picking2023Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Automatic bin picking is a well-known problem in industrial automation and computer vision, where a robot picks an object from a bin and places it somewhere else. Research to improve existing solutions has been ongoing for many years. With camera technology advancing rapidly and fast computational resources available, solving this problem with deep learning has become a current interest for several researchers. This thesis intends to leverage current state-of-the-art deep learning based methods for 3D instance segmentation and point cloud registration, and to combine them to improve the performance and robustness of bin picking solutions.

    The problem of bin picking becomes complex when the bin contains identical objects with heavy occlusion. To solve this problem, 3D instance segmentation is performed with the Fast Point Cloud Clustering (FPCC) method to detect and locate the objects in the bin. Further, an extraction strategy is proposed to choose one predicted instance at a time. In the next step, a point cloud registration technique based on the PointNetLK method is implemented to estimate the pose of the selected object from the bin.

    The above implementation is trained, tested, and evaluated on synthetically generated datasets. The synthetic dataset also contains several noisy point clouds to imitate a real situation. Real data captured at the company SICK IVP is also tested with the implemented model.

    It is observed that the 3D instance segmentation can detect and locate the objects available in the bin. In a noisy environment, the performance degrades as the noise level increases. However, the decrease in performance is found not to be very significant. Point cloud registration is observed to work best with the full point cloud of the object, compared to point clouds with missing points.

  • 19.
    Anders, Erik
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Classification of Corporate Social Performance2021Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Over the past few years, socially responsible investments have received exponentially increasing attention in finance, which creates a need to determine whether a company is socially responsible or not. The ESG ratings often used to do this are based on Environmental, Social and Governance related data about the companies and have many flaws. This thesis proposes to instead model them by their controversies discussed in the media. It tries to answer the question of whether it is possible to predict future controversies of a company from its past controversies and ESG indicators, and to isolate predictors which influence these. This has not been done before and offers a new way of rating companies without falling for the biases of conventional ESG ratings. The chosen method to approach this issue is Zero-Inflated Poisson Regression with Random Intercepts. A selection of variables was determined by Lasso and projection predictive variable selection. This method discovered new connections in the data between ESG indicators and the number of controversies, but also made it apparent that it is difficult to make predictions for future years. Nevertheless, the coefficients of the selected indicators can give valuable insight into the potential risk of an investment.

  • 20.
    Anderskär, Erika
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Thomasson, Frida
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Inkrementell responsanalys av Scandnavian Airlines medlemmar: Vilka kunder ska väljas vid riktad marknadsföring?2017Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE creditsStudent thesis
    Abstract [en]

    Scandinavian Airlines has a large database containing their Eurobonus members. In order to analyze which customers they should target with direct marketing, such as emails, uplift models have been used. With a binary response variable that indicates whether the customer has bought or not, and a binary dummy variable that indicates whether the customer has received the campaign or not, conclusions can be drawn about which customers are persuadable, that is, customers who buy when they receive a campaign and do not buy otherwise. Analyses have been done with one campaign for Sweden and Scandinavia. The methods that have been used are logistic regression with Lasso and logistic regression with Penalized Net Information Value (PNIV). The best method for predicting purchases is Lasso regression when comparing with a confusion matrix. The variable that best describes persuadable customers in logistic regression with PNIV is Flown (customers that have flown with SAS within the last six months). In Lasso regression, the variable that describes a persuadable customer in Sweden is membership level 1 (the first level of membership), and in Scandinavia customers that receive campaigns with delivery code 13, which is a form of dispatch, are persuadable.

  • 21.
    Andersson Naesseth, Christian
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Lindsten, Fredrik
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Schon, Thomas B.
    Uppsala Univ, Sweden.
    High-Dimensional Filtering Using Nested Sequential Monte Carlo2019In: IEEE Transactions on Signal Processing, ISSN 1053-587X, E-ISSN 1941-0476, Vol. 67, no 16, p. 4177-4188Article in journal (Refereed)
    Abstract [en]

    Sequential Monte Carlo (SMC) methods comprise one of the most successful approaches to approximate Bayesian filtering. However, SMC without a good proposal distribution can perform poorly, in particular in high dimensions. We propose nested sequential Monte Carlo, a methodology that generalizes the SMC framework by requiring only approximate, properly weighted, samples from the SMC proposal distribution, while still resulting in a correct SMC algorithm. This way, we can compute an "exact approximation" of, e.g., the locally optimal proposal, and extend the class of models for which we can perform efficient inference using SMC. We show improved accuracy over other state-of-the-art methods on several spatio-temporal state-space models.

  • 22.
    Andersson, Olov
    et al.
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Sidén, Per
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Dahlin, Johan
    Kotte Consulting AB.
    Doherty, Patrick
    Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering.
    Villani, Mattias
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Stockholm University, Stockholm, Sweden.
    Real-Time Robotic Search using Structural Spatial Point Processes2020In: 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), Association For Uncertainty in Artificial Intelligence (AUAI) , 2020, Vol. 115, p. 995-1005Conference paper (Refereed)
    Abstract [en]

    Aerial robots hold great potential for aiding Search and Rescue (SAR) efforts over large areas, such as during natural disasters. Traditional approaches typically search an area exhaustively, thereby ignoring that the density of victims varies based on predictable factors, such as the terrain, population density and the type of disaster. We present a probabilistic model to automate SAR planning, with explicit minimization of the expected time to discovery. The proposed model is a spatial point process with three interacting spatial fields for i) the point patterns of persons in the area, ii) the probability of detecting persons and iii) the probability of injury. This structure allows inclusion of informative priors from e.g. geographic or cell phone traffic data, while falling back to latent Gaussian processes when priors are missing or inaccurate. To solve this problem in real-time, we propose a combination of fast approximate inference using Integrated Nested Laplace Approximation (INLA), and a novel Monte Carlo tree search tailored to the problem. Experiments using data simulated from real world Geographic Information System (GIS) maps show that the framework outperforms competing approaches, finding many more injured in the crucial first hours.

  • 23.
    Ardalan, Aydin
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Detection and Classification of Femur Fractures Using Deep Learning2024Independent thesis Advanced level (degree of Master (Two Years)), 80 credits / 120 HE creditsStudent thesis
    Abstract [en]

    This thesis focuses on Atypical Femur Fractures (AFFs), a rare yet severe complication often associated with extended bisphosphonate therapy. AFFs exhibit distinctive radiographic features that differentiate them from Normal Femur Fractures (NFFs). However, these features are typically subtle and can be easily overlooked.

    The objective of this study is to develop and implement a novel hybrid methodology for detecting and classifying fracture centers in radiographic images. For the detection phase, two different methods are utilized and compared: fully convolutional neural networks (FCNs) and object detection methods. Following detection, a convolutional neural network (CNN) is used for classifying the detected fractures as either AFFs or NFFs. This approach is evaluated against direct classification methods using full-size images and traditional statistical approaches, specifically logistic regression, to assess the efficacy of classical models compared to machine learning-based CNN models.

    The research employs a dataset of 1,161 radiographs sourced from 72 hospitals across Sweden, with 20.03% depicting AFFs and the remaining 79.97% identified as NFFs. Considering the occurrence of multiple radiographs per patient, a patient-level partitioning strategy is implemented to prevent data leakage. Additionally, class weights are applied to correct the imbalanced class distribution in the dataset. Results demonstrate that FCNs effectively localize fractures near image centers, though performance drops for off-center fractures. Object detection networks maintain high accuracy across all positions. CNNs outperform traditional methods, achieving higher AUC scores, particularly when trained with 512x512 pixel patches. Traditional statistical methods, such as logistic regression, lag behind deep learning approaches, indicating the latter’s superior capability in modeling complex radiographic data.

  • 24.
    Asokan, Mowniesh
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    A study of forecasts in Financial Time Series using Machine Learning methods2022Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Forecasting financial time series is one of the most challenging problems in economics and business. Markets are highly complex due to non-linear factors in data and uncertainty, and they move up and down without any apparent pattern. Based on historical univariate close prices from the S&P 500, SSE, and FTSE 100 indexes, this thesis forecasts future values using two different approaches: one using classical methods, namely a Seasonal ARIMA (SARIMA) model and a hybrid ARIMA-GARCH model, and the other using an LSTM neural network. Each method is evaluated at different forecast horizons. Experimental results show that the LSTM and hybrid ARIMA-GARCH models perform better than the SARIMA model. To measure the model performance we used the Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).

  • 25.
    Baier, Friederike
    et al.
    Leuphana University of Lüneburg, Lüneburg, Germany.
    Mair, Sebastian
    Uppsala University, Uppsala, Sweden.
    Fadel, Samuel G.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Self-Supervised Siamese Autoencoders2024In: Advances in Intelligent Data Analysis XXII: Part I / [ed] Ioanna Miliou, Nico Piatkowski, Panagiotis Papapetrou, Springer, 2024, Vol. 14641, p. 117-128Conference paper (Refereed)
    Abstract [en]

    In contrast to fully-supervised models, self-supervised representation learning only needs a fraction of data to be labeled and often achieves the same or even higher downstream performance. The goal is to pre-train deep neural networks on a self-supervised task, making them able to extract meaningful features from raw input data afterwards. Previously, autoencoders and Siamese networks have been successfully employed as feature extractors for tasks such as image classification. However, both have their individual shortcomings and benefits. In this paper, we combine their complementary strengths by proposing a new method called SidAE (Siamese denoising autoencoder). Using an image classification downstream task, we show that our model outperforms two self-supervised baselines across multiple data sets and scenarios. Crucially, this includes conditions in which only a small amount of labeled data is available. Empirically, the Siamese component has more impact, but the denoising autoencoder is nevertheless necessary to improve performance.

  • 26.
    Balgi, Sourabh
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Daoud, Adel
    Linköping University, Department of Management and Engineering, The Institute for Analytical Sociology, IAS. Linköping University, Faculty of Arts and Sciences.
    Peña, Jose M.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Wodtke, Geoffrey
    Department of Sociology, University of Chicago, Chicago, IL, USA.
    Zhou, Jesse
    Department of Sociology, University of Chicago, Chicago, IL, USA.
    Deep Learning With DAGsManuscript (preprint) (Other academic)
    Abstract [en]

    Social science theories often postulate causal relationships among a set of variables or events. Although directed acyclic graphs (DAGs) are increasingly used to represent these theories, their full potential has not yet been realized in practice. As non-parametric causal models, DAGs require no assumptions about the functional form of the hypothesized relationships. Nevertheless, to simplify the task of empirical evaluation, researchers tend to invoke such assumptions anyway, even though they are typically arbitrary and do not reflect any theoretical content or prior knowledge. Moreover, functional form assumptions can engender bias, whenever they fail to accurately capture the complexity of the causal system under investigation. In this article, we introduce causal-graphical normalizing flows (cGNFs), a novel approach to causal inference that leverages deep neural networks to empirically evaluate theories represented as DAGs. Unlike conventional approaches, cGNFs model the full joint distribution of the data according to a DAG supplied by the analyst, without relying on stringent assumptions about functional form. In this way, the method allows for flexible, semi-parametric estimation of any causal estimand that can be identified from the DAG, including total effects, conditional effects, direct and indirect effects, and path-specific effects. We illustrate the method with a reanalysis of Blau and Duncan’s (1967) model of status attainment and Zhou’s (2019) model of conditional versus controlled mobility. To facilitate adoption, we provide open-source software together with a series of online tutorials for implementing cGNFs. The article concludes with a discussion of current limitations and directions for future development.

  • 27.
    Balgi, Sourabh
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Peña, Jose M.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Daoud, Adel
    Linköping University, Department of Management and Engineering, The Institute for Analytical Sociology, IAS. Linköping University, Faculty of Arts and Sciences.
    Counterfactual Analysis of the Impact of the IMF Program on Child Poverty in the Global-South Region using Causal-Graphical Normalizing FlowsManuscript (preprint) (Other academic)
    Abstract [en]

    This work demonstrates the application of a particular branch of causal inference and deep learning models: causal-Graphical Normalizing Flows (c-GNFs). In a recent contribution, scholars showed that normalizing flows carry certain properties, making them particularly suitable for causal and counterfactual analysis. However, c-GNFs have only been tested in a simulated data setting, and no contribution to date has evaluated the application of c-GNFs on large-scale real-world data. Focusing on AI for social good, our study provides a counterfactual analysis of the impact of the International Monetary Fund (IMF) program on child poverty using c-GNFs. The analysis relies on large-scale real-world observational data: 1,941,734 children under the age of 18, cared for by 567,344 families residing in 67 countries of the Global South. While the primary objective of the IMF is to support governments in achieving economic stability, our results find that an IMF program reduces child poverty as a positive side-effect by about 1.2±0.24 degrees ('0' equals no poverty and '7' is maximum poverty). Thus, our article shows how c-GNFs further the use of deep learning and causal inference in AI for social good. It shows how learning algorithms can be used for addressing the untapped potential for a significant social impact through counterfactual inference at population level (ACE), sub-population level (CACE), and individual level (ICE). In contrast to most works that model ACE or CACE but not ICE, c-GNFs enable personalization using 'The First Law of Causal Inference'.

  • 28.
    Balgi, Sourabh
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Peña, Jose M.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Daoud, Adel
    Linköping University, Department of Management and Engineering, The Institute for Analytical Sociology, IAS.
    Counterfactually-Equivalent Structural Causal Modelling Using Causal Graphical Normalizing Flows2024In: 12th International Conference on Probabilistic Graphical Models, Nijmegen, September 11 - 13, 2024. PMLR 246:164-181 / [ed] Johan Kwisthout, Silja Renooij, 2024, Vol. 246, p. 164-181Conference paper (Refereed)
    Abstract [en]

    Recent research has highlighted the properties that deep-learning inspired causal models such as Deep-Structural Causal Model (Deep-SCM), Causal Autoregressive Flow (CAREFL) and Causal-Graphical Normalizing Flow (c-GNF) should exhibit to guarantee observational and interventional distribution equivalence with the true underlying causal data generating process (DGP), making them suitable for estimating average causal effect (ACE) or conditional ACE (CACE). However, for accurate individual-level causal effect (ICE) estimation and personalized treatment/public-policy formulation, it is crucial to ensure counterfactual equivalence between these models and the DGP. Firstly, we demonstrate that c-GNFs provide counterfactual equivalence under a certain monotonicity assumption on the DGP, enabling precise ICE estimation and personalized treatment/public-policy analysis. Secondly, using this counterfactual equivalence of c-GNFs, we perform a counterfactual analysis and personalized public-policy analysis of the impact of International Monetary Fund (IMF) programs on child poverty using large-scale real-world observational data. Our results indicate a reduction in child poverty due to the IMF program at different personalization granularities. Our study also performs sensitivity analyses to assess potential threats to the unconfoundedness assumption and estimates ACE bounds and the E-value. This illustrates the potential of c-GNFs for causal and counterfactual inference in fields such as social, natural, and medical sciences.

  • 29.
    Balgi, Sourabh
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Peña, Jose M.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Daoud, Adel
    Linköping University, Department of Management and Engineering, The Institute for Analytical Sociology, IAS. Linköping University, Faculty of Arts and Sciences. Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden.
    Personalized Public Policy Analysis in Social Sciences Using Causal-Graphical Normalizing Flows2022In: Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence: AAAI Special Track on AI for Social Impact, Palo Alto, California USA: AAAI Press, 2022, Vol. 36, no 11, p. 11810-11818, article id 21437Conference paper (Refereed)
    Abstract [en]

    Structural Equation/Causal Models (SEMs/SCMs) are widely used in epidemiology and social sciences to identify and analyze the average causal effect (ACE) and conditional ACE (CACE). Traditional causal effect estimation methods such as Inverse Probability Weighting (IPW) and more recently Regression-With-Residuals (RWR) are widely used - as they avoid the challenging task of identifying the SCM parameters - to estimate ACE and CACE. However, much work remains before traditional estimation methods can be used for counterfactual inference, and for the benefit of Personalized Public Policy Analysis (P3A) in the social sciences. While doctors rely on personalized medicine to tailor treatments to patients in laboratory settings (relatively closed systems), P3A draws inspiration from such tailoring but adapts it for open social systems. In this article, we develop a method for counterfactual inference that we name causal-Graphical Normalizing Flow (c-GNF), facilitating P3A. A major advantage of c-GNF is that it suits the open system in which P3A is conducted. First, we show how c-GNF captures the underlying SCM without making any assumption about functional forms. This capturing capability is enabled by the deep neural networks that model the underlying SCM via observational data likelihood maximization using gradient descent. Second, we propose a novel dequantization trick to deal with discrete variables, which is a limitation of normalizing flows in general. Third, we demonstrate in experiments that c-GNF performs on-par with IPW and RWR in terms of bias and variance for estimating the ACE, when the true functional forms are known, and better when they are unknown. Fourth and most importantly, we conduct counterfactual inference with c-GNFs, demonstrating promising empirical performance. 
    Because IPW and RWR, like other traditional methods, lack the capability of counterfactual inference, c-GNFs will likely play a major role in tailoring personalized treatment, facilitating P3A, and optimizing social interventions, in contrast to the current 'one-size-fits-all' approach of existing methods.

  • 30.
    Balgi, Sourabh
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Peña, Jose M.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    Daoud, Adel
    Linköping University, Department of Management and Engineering, The Institute for Analytical Sociology, IAS.
    ρ-GNF: A Copula-based Sensitivity Analysis to Unobserved Confounding Using Normalizing Flows2024In: 12th International Conference on Probabilistic Graphical Models, Nijmegen, September 11 - 13, 2024. PMLR 246:20-37 / [ed] Johan Kwisthout, Silja Renooij, 2024, Vol. 246, p. 20-37Conference paper (Refereed)
    Abstract [en]

    We propose a novel sensitivity analysis to unobserved confounding in observational studies using copulas and normalizing flows. Using the idea of interventional equivalence of structural causal models, we develop ρ-GNF (ρ-graphical normalizing flow), where ρ∈[-1,+1] is a bounded sensitivity parameter. This parameter represents the back-door non-causal association due to unobserved confounding, which is encoded with a Gaussian copula. In other words, the ρ-GNF enables scholars to estimate the average causal effect (ACE) as a function of ρ, while accounting for various assumed strengths of the unobserved confounding. The output of the ρ-GNF is what we denote as the ρ_curve, which provides the bounds for the ACE given an interval of assumed ρ values. In particular, the ρ_curve enables scholars to identify the confounding strength required to nullify the ACE, similar to other sensitivity analysis methods (e.g., the E-value). Drawing on experiments with simulated and real-world data, we show the benefits of ρ-GNF. One benefit is that the ρ-GNF uses a Gaussian copula to encode the distribution of the unobserved causes, which is commonly used in many applied settings. This distributional assumption produces narrower ACE bounds compared to other popular sensitivity analysis methods.

  • 31.
    Balgi, Sourabh
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Peña, Jose M.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Science & Engineering.
    Daoud, Adel
    Linköping University, Department of Management and Engineering, The Institute for Analytical Sociology, IAS. Linköping University, Faculty of Arts and Sciences.
    ρ-GNF: A Novel Sensitivity Analysis Approach Under Unobserved ConfoundersManuscript (preprint) (Other academic)
    Abstract [en]

    We propose a new sensitivity analysis model that combines copulas and normalizing flows for causal inference under unobserved confounding. We refer to the new model as ρ-GNF (ρ-Graphical Normalizing Flow), where ρ∈[−1,+1] is a bounded sensitivity parameter representing the backdoor non-causal association due to unobserved confounding, modeled using the well-studied and widely used Gaussian copula. Specifically, ρ-GNF enables us to estimate and analyse the frontdoor causal effect or average causal effect (ACE) as a function of ρ. We call this the ρ_curve. The ρ_curve enables us to specify the confounding strength required to nullify the ACE. We call this the ρ_value. Further, the ρ_curve also enables us to provide bounds for the ACE given an interval of ρ values. We illustrate the benefits of ρ-GNF with experiments on simulated and real-world data in terms of our empirical ACE bounds being narrower than other popular ACE bounds.

  • 32.
    Balter, Katarina
    et al.
    Malardalen Univ, Sweden; Karolinska Inst, Sweden.
    King, Abby C.
    Stanford Univ, CA USA.
    Fritz, Johanna
    Malardalen Univ, Sweden.
    Tillander, Annika
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Ullberg, Oskar Halling
    Malardalen Univ, Sweden.
    Sustainable Lifestyle Among Office Workers (the SOFIA Study): Protocol for a Cluster Randomized Controlled Trial2024In: JMIR Research Protocols, E-ISSN 1929-0748, Vol. 13, article id e57777Article in journal (Refereed)
    Abstract [en]

    Background: Society is facing multiple challenges, including lifestyle- and age-related diseases of major public health relevance, and this is of particular importance when the general population, as well as the workforce, is getting older. In addition, we are facing global climate change due to extensive emissions of greenhouse gases and negative environmental effects. A lifestyle that promotes healthy life choices as well as climate and environmentally friendly decisions is considered a sustainable lifestyle. Objective: This study aims to evaluate if providing information about a sustainable lifestyle encourages individuals to adopt more nutritious dietary habits and increase physical activity, as compared to receiving information solely centered around health-related recommendations for dietary intake and physical activity by the Nordic Nutrition Recommendations and the World Health Organization. Novel features of this study include the use of the workplace as an arena for health promotion, particularly among office workers, a group known to be often sedentary at work and making up 60% of all employees in Sweden. Methods: The Sustainable Office Intervention (SOFIA) study is a 2-arm, participant-blinded, cluster randomized controlled trial that includes a multilevel sustainable lifestyle arm (intervention arm, n=19) and a healthy lifestyle arm (control arm, n=14). The eligibility criteria were being aged 18-65 years and doing office work ≥20 hours per week. Both intervention arms are embedded in the theoretically based behavioral change wheel method. The intervention study runs for approximately 8 weeks and contains 6 workshops. The study focuses on individual behavior change as well as environmental and policy features at an organizational level to facilitate or hinder a sustainable lifestyle at work.
Through implementing a citizen science methodology within the trial, the participants (citizen scientists) collect data using the Stanford Our Voice Discovery Tool app and are involved in analyzing the data, formulating a list of potential actions to bring about feasible changes in the workplace. Results: Participant recruitment and data collection began in August 2022. As of June 2024, a total of 37 participants have been recruited. The results of the pilot phase are expected to be published in 2024 or 2025. Conclusions: Given the ongoing climate change, negative environmental effects, and the global epidemic of metabolic diseases, a sustainable lifestyle among office workers holds important potential to help in counteracting this trend. Thus, there is an urgent unmet need to test the impact of a sustainable lifestyle on food intake, physical activity, and environmental and climate impacts in a worksite-based randomized controlled trial. This study protocol responds to a societal need by addressing multilevel aspects, including individual behavior changes as well as environmental and organizational changes of importance for the successful implementation of sustainable lifestyle habits in an office setting

  • 33.
    Barakat, Arian
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
    What makes an (audio)book popular?2018Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
    Abstract [en]

    Audiobook reading has traditionally been used for educational purposes but has in recent times grown into a popular alternative to the more traditional means of consuming literature. In order to differentiate themselves from other players in the market, but also provide their users enjoyable literature, several audiobook companies have lately directed their efforts toward producing their own content. Creating highly rated content is, however, no easy task and one recurring challenge is how to make a bestselling story. In an attempt to identify latent features shared by successful audiobooks and evaluate proposed methods for literary quantification, this thesis employs an array of frameworks from the fields of Statistics, Machine Learning and Natural Language Processing on data and literature provided by Storytel - Sweden’s largest audiobook company.

    We analyze and identify important features from a collection of 3077 Swedish books concerning their promotional and literary success. By considering features from the aspects Metadata, Theme, Plot, Style and Readability, we found that popular books are typically published as a book series, cover 1-3 central topics, write about, e.g., daughter-mother relationships and human closeness but that they also hold, on average, a higher proportion of verbs and a lower degree of short words. Despite successfully identifying these, but also other factors, we recognized that none of our models predicted “bestseller” adequately and that future work may desire to study additional factors, employ other models or even use different metrics to define and measure popularity.

    From our evaluation of the literary quantification methods, namely topic modeling and narrative approximation, we found that these methods are, in general, suitable for Swedish texts but that they require further improvement and experimentation to be successfully deployed for Swedish literature. For topic modeling, we recognized that the sole use of nouns provided more interpretable topics and that the inclusion of character names tended to pollute the topics. We also identified and discussed the possible problem of word inflections when modeling topics for more morphologically complex languages, and that additional preprocessing treatments such as word lemmatization or post-training text normalization may improve the quality and interpretability of topics. For the narrative approximation, we discovered that the method currently suffers from three shortcomings: (1) unreliable sentence segmentation, (2) unsatisfactory dictionary-based sentiment analysis and (3) the possible loss of sentiment information induced by translations. Despite only examining a handful of literary works, we further found that books written initially in Swedish had narratives that were more consistent across languages than books written in English and then translated to Swedish.

  • 34.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    A Central Limit Theorem for punctuated equilibrium2020In: Stochastic Models, ISSN 1532-6349, E-ISSN 1532-4214, Vol. 36, no 3, p. 473-517Article in journal (Refereed)
    Abstract [en]

    Current evolutionary biology models usually assume that a phenotype undergoes gradual change. This is in stark contrast to biological intuition, which indicates that change can also be punctuated: the phenotype can jump. Such a jump could especially occur at speciation, i.e., dramatic change occurs that drives the species apart. Here we derive a Central Limit Theorem for punctuated equilibrium. We show that, if adaptation is fast, for weak convergence to normality to hold, the variability in the occurrence of change has to disappear with time.

  • 35.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Closed and asymptotic formulæ for harmonic and quadratic harmonic sumsManuscript (preprint) (Other academic)
    Abstract [en]

    We present here a large collection of harmonic and quadratic harmonic sums that can be useful in applied questions, e.g., probabilistic ones. We find closed-form formulae that we were not able to locate in the literature.

  • 36.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Exact and approximate limit behaviour of the Yule tree's cophenetic index2018In: Mathematical Biosciences, ISSN 0025-5564, E-ISSN 1879-3134, Vol. 303, p. 26-45Article in journal (Refereed)
    Abstract [en]

    In this work we study the limit distribution of an appropriately normalized cophenetic index of the pure-birth tree conditioned on n contemporary tips. We show that this normalized phylogenetic balance index is a sub-martingale that converges almost surely and in $L^2$. We link our work with studies on trees without branch lengths and show that in this case the limit distribution is a contraction-type distribution, similar to the Quicksort limit distribution. In the continuous branch case we suggest approximations to the limit distribution. We propose heuristic methods of simulating from these distributions and it may be observed that these algorithms result in reasonable tails. Therefore, we propose a way based on the quantiles of the derived distributions for hypothesis testing, whether an observed phylogenetic tree is consistent with the pure-birth process. Simulating a sample by the proposed heuristics is rapid, while exact simulation (simulating the tree and then calculating the index) is a time-consuming procedure. We conduct a power study to investigate how well the cophenetic indices detect deviations from the Yule tree and apply the methodology to empirical phylogenies.

  • 37.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Limit distribution of the quartet balance index for Aldous’s $(\beta \ge 0)$-model2020In: Applicationes Mathematicae, ISSN 1233-7234, E-ISSN 1730-6280, Vol. 6, p. 29-44Article in journal (Refereed)
    Abstract [en]

    This paper builds on T. Martínez-Coronado, A. Mir, F. Rosselló and G. Valiente’s 2018 work, introducing a new balance index for trees. We show that this balance index, in the case of Aldous’s $(\beta \ge 0)$-model, converges weakly to a distribution that can be characterized as the fixed point of a contraction operator on a class of distributions.

  • 38.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Revisiting the Nowosiółka skull with RMaCzek2023In: Mathematica Applicanda, ISSN 1730-2668, Vol. 50, no 2, p. 255-266Article in journal (Refereed)
    Abstract [en]

    One of the first fully quantitative distance matrix visualization methods was proposed by Jan Czekanowski at the beginning of the previous century. Recently, a software package, RMaCzek, was made available that allows for producing such diagrams in R. Here we reanalyze the original data that Czekanowski used for introducing his method, and in the accompanying code show how the user can specify their own custom distance functions in the package.

  • 39.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Simulating an infinite mean waiting time2019In: Mathematica Applicanda, ISSN 1730-2668, Vol. 47, no 1, p. 93-102Article in journal (Refereed)
    Abstract [en]

    We consider a hybrid method to simulate the return time to the initial state in a critical-case birth-death process. The expected value of this return time is infinite, but its distribution asymptotically follows a power-law. Hence, the simulation approach is to directly simulate the process, unless the simulated time exceeds some threshold and if it does, draw the return time from the tail of the power law.
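
    The hybrid scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code: as a stand-in for the critical birth-death process it uses a symmetric +/-1 random walk, whose return time to 0 also has an infinite mean and a tail decaying like t^(-1/2). The function name, the threshold and the tail exponent alpha are assumptions chosen for the example.

```python
import random


def hybrid_return_time(threshold=10_000, alpha=0.5, rng=random):
    """Return time to 0 of a symmetric +/-1 walk, with a power-law shortcut.

    Assumption: beyond `threshold` the return time is approximated by a
    Pareto tail, P(T > t | T > threshold) = (t / threshold) ** (-alpha).
    """
    position, steps = 0, 0
    while steps < threshold:
        # Direct simulation: one up or down step with equal probability.
        position += 1 if rng.random() < 0.5 else -1
        steps += 1
        if position == 0:
            return float(steps)  # returned before the threshold
    # Threshold exceeded: draw from the power-law tail by inversion.
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return threshold * u ** (-1.0 / alpha)
```

    The threshold trades accuracy for speed: a larger value means more of each sample comes from the exact dynamics, at the cost of longer direct simulation for the rare long excursions.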

  • 40.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    The phylogenetic effective sample size and jumps2018In: MATHEMATICA APPLICANDA (MATEMATYKA STOSOWANA), ISSN 1730-2668, Vol. 46, no 1, p. 25-33Article in journal (Refereed)
    Abstract [en]

    The phylogenetic effective sample size is a parameter that has as its goal the quantification of the amount of independent signal in a phylogenetically correlated sample. It was studied for Brownian motion and Ornstein-Uhlenbeck models of trait evolution. Here, we study this composite parameter when the trait is allowed to jump at speciation points of the phylogeny. Our numerical study indicates that there is a non-trivial limit as the effect of jumps grows. The limit depends on the value of the drift parameter of the Ornstein-Uhlenbeck process.

  • 41.
    Bartoszek, Krzysztof
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Trait evolution with jumps: illusionary normality2017In: Proceedings of the XXIII National Conference on Applications of Mathematics in Biology and Medicine, 2017, p. 23-28Conference paper (Refereed)
    Abstract [en]

    Phylogenetic comparative methods for real-valued traits usually make use of stochastic processes whose trajectories are continuous. This is despite biological intuition that evolution is punctuated rather than gradual. On the other hand, there have been a number of recent proposals of evolutionary models with jump components. However, as we are only beginning to understand the behaviour of branching Ornstein-Uhlenbeck (OU) processes, the asymptotics of branching OU processes with jumps is an even greater unknown. In this work we build upon a previous study concerning OU with jumps evolution on a pure birth tree. We introduce an extinction component and explore, via simulations, its effects on the weak convergence of such a process. Furthermore, we use this work to illustrate the simulation and graphic generation possibilities of the mvSLOUCH package.

  • 42.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Bartoszek, Wojciech
    Gdansk Univ Technol, Poland.
    Krzeminski, Michal
    Gdansk Univ Technol, Poland.
    Simple SIR models with Markovian control2021In: Japanese Journal of Statistics and Data Science, ISSN 2520-8756, Vol. 4, no 1, p. 731-762Article in journal (Refereed)
    Abstract [en]

    We consider a random dynamical system, where the deterministic dynamics are driven by a finite-state space Markov chain. We provide a comprehensive introduction to the required mathematical apparatus and then turn to a special focus on the susceptible-infected-recovered epidemiological model with random steering. Through simulations we visualize the behaviour of the system and the effect of the high-frequency limit of the driving Markov chain. We formulate some questions and conjectures of a purely theoretical nature.
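
    A rough sketch of such a system is given below, assuming (this is not stated in the abstract) that the Markov chain has two states and steers the infection rate. The Euler time stepping, the parameter values and the function name are illustrative assumptions, not the authors' setup.

```python
import random


def simulate_switched_sir(betas=(0.6, 0.1), gamma=0.2, switch_rate=0.5,
                          s0=0.99, i0=0.01, t_end=100.0, dt=0.01, seed=1):
    """Euler scheme for an SIR model whose infection rate follows a
    two-state Markov chain (all numerical values are hypothetical)."""
    rng = random.Random(seed)
    s, i, r = s0, i0, 1.0 - s0 - i0
    state = 0
    path = [(0.0, s, i, r, state)]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        # Driving chain: leave the current state with probability rate * dt.
        if rng.random() < switch_rate * dt:
            state = 1 - state
        beta = betas[state]
        # Deterministic SIR dynamics between switches.
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        path.append((k * dt, s, i, r, state))
    return path
```

    Increasing switch_rate in this sketch is one way to look at the high-frequency regime mentioned in the abstract, where the chain switches many times before the epidemic state changes appreciably.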

  • 43.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Coronado, Tomas M.
    Univ Balearic Isl, Spain; Balearic Isl Hlth Res Inst IdISBa, Spain.
    Mir, Arnau
    Univ Balearic Isl, Spain; Balearic Isl Hlth Res Inst IdISBa, Spain.
    Rossello, Francesc
    Univ Balearic Isl, Spain; Balearic Isl Hlth Res Inst IdISBa, Spain.
    Squaring within the Colless index yields a better balance index2021In: Mathematical Biosciences, ISSN 0025-5564, E-ISSN 1879-3134, Vol. 331, article id 108503Article in journal (Refereed)
    Abstract [en]

    The Colless index for bifurcating phylogenetic trees, introduced by Colless (1982), is defined as the sum, over all internal nodes v of the tree, of the absolute value of the difference of the sizes of the clades defined by the children of v. It is one of the most popular phylogenetic balance indices, because, in addition to measuring the balance of a tree in a very simple and intuitive way, it turns out to be one of the most powerful and discriminating phylogenetic shape indices. But it has some drawbacks. On the one hand, although its minimum value is reached at the so-called maximally balanced trees, it is almost always reached also at trees that are not maximally balanced. On the other hand, its definition as a sum of absolute values of differences makes it difficult to study analytically its distribution under probabilistic models of bifurcating phylogenetic trees. In this paper we show that if we replace in its definition the absolute values of the differences of clade sizes by the squares of these differences, all these drawbacks are overcome and the resulting index is still more powerful and discriminating than the original Colless index.
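
    The two definitions in this abstract can be illustrated with a short sketch. The nested-tuple tree encoding and the function name are assumptions made for the example; they are not taken from the paper.

```python
def balance_indices(tree):
    """Return (leaf count, Colless index, squared-difference variant).

    A tree is either a leaf (any non-tuple value) or a pair
    (left_subtree, right_subtree); this encoding is hypothetical.
    """
    if not isinstance(tree, tuple):
        return 1, 0, 0
    left, right = tree
    n_left, colless_left, quad_left = balance_indices(left)
    n_right, colless_right, quad_right = balance_indices(right)
    diff = abs(n_left - n_right)  # difference of the two clade sizes
    return (n_left + n_right,
            colless_left + colless_right + diff,
            quad_left + quad_right + diff ** 2)


# Two six-leaf trees: a 3|3 split at the root versus a 4|2 split.
split_3_3 = ((("a", "b"), "c"), (("d", "e"), "f"))
split_4_2 = ((("a", "b"), ("c", "d")), ("e", "f"))
print(balance_indices(split_3_3))  # (6, 2, 2)
print(balance_indices(split_4_2))  # (6, 2, 4)
```

    Both trees have the same Colless value, while the squared version separates them, which is the kind of tie the abstract refers to.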

  • 44.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Matematiska institutionen, Analys och sannolikhetsteori.
    Domsta, Joachim
    State Univ Appl Sci Elblag, Krzysztof Brzeski Inst Appl Informat, Ul Wojska Polskiego 1, PL-82300 Elblag, Poland.
    Pulka, Malgorzata
    Gdansk Univ Technol, Dept Probabil & Biomath, Ul Narutowicza 11-12, PL-80233 Gdansk, Poland.
    Weak Stability of Centred Quadratic Stochastic Operators2019In: BULLETIN OF THE MALAYSIAN MATHEMATICAL SCIENCES SOCIETY, ISSN 0126-6705, Vol. 42, no 4, p. 1813-1830Article in journal (Refereed)
    Abstract [en]

    We consider the weak convergence of iterates of so-called centred quadratic stochastic operators. These iterations allow us to study the discrete time evolution of probability distributions of vector-valued traits in populations of inbreeding or hermaphroditic species, whenever the offspring's trait is equal to an additively perturbed arithmetic mean of the parents' traits. It is shown that for the existence of a weak limit, it is sufficient that the distributions of the trait and the perturbation have a finite variance or have tails controlled by a suitable power function. In particular, probability distributions from the domain of attraction of stable distributions have found an application, although in general the limit is not stable.

  • 45.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Erhardsson, Torkel
    Linköping University, Department of Mathematics, Applied Mathematics. Linköping University, Faculty of Science & Engineering.
    NORMAL APPROXIMATION FOR MIXTURES OF NORMAL DISTRIBUTIONS AND THE EVOLUTION OF PHENOTYPIC TRAITS2021In: Advances in Applied Probability, ISSN 0001-8678, E-ISSN 1475-6064, Vol. 53, no 1, p. 162-188Article in journal (Refereed)
    Abstract [en]

    Explicit bounds are given for the Kolmogorov and Wasserstein distances between a mixture of normal distributions, by which we mean that the conditional distribution given some sigma-algebra is normal, and a normal distribution with properly chosen parameter values. The bounds depend only on the first two moments of the first two conditional moments given the sigma-algebra. The proof is based on Stein's method. As an application, we consider the Yule-Ornstein-Uhlenbeck model, used in the field of phylogenetic comparative methods. We obtain bounds for both distances between the distribution of the average value of a phenotypic trait over n related species, and a normal distribution. The bounds imply and extend earlier limit theorems by Bartoszek and Sagitov.

  • 46.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Fuentes Gonzalez, Jesualdo
    Florida International University, Miami, USA.
    Mitov, Venelin
    IntiQuan GmbH, Basel, Switzerland.
    Pienaar, Jason
    Florida International University, Miami, USA.
    Piwczyński, Marcin
    Nicolaus Copernicus University, Toruń, Poland.
    Puchałka, Radosław
    Nicolaus Copernicus University, Toruń, Poland.
    Spalik, Krzysztof
    University of Warsaw, Warszawa, Poland.
    Voje, Kjetil
    University of Oslo, Oslo, Norway.
    Fast mvSLOUCH: Model comparison for multivariate Ornstein--Uhlenbeck-based models of trait evolution on large phylogenies2023Data set
    Abstract [en]

    These are the Supplementary Material, R scripts and numerical results accompanying Bartoszek, Fuentes Gonzalez, Mitov, Pienaar, Piwczyński, Puchałka, Spalik and Voje "Model Selection Performance in Phylogenetic Comparative Methods under multivariate Ornstein–Uhlenbeck Models of Trait Evolution".

    The four data files concern two datasets. Ungulates: measurements of muzzle width, unworn lower third molar crown height, unworn lower third molar crown width and feeding style and their phylogeny; Ferula: measurements of ratio of canals, periderm thickness, wing area, wing thickness,  and fruit mass, and their phylogeny.

    Methods

    Ungulates

    The compiled ungulate dataset involves two key components: phenotypic data (Data.csv) and phylogenetic tree (Tree.tre), which consist of the following (full references for the citations presented below are provided in the paper linked to this repository, which also provides further details on the compiled dataset):

    The phenotypic data includes three continuous variables and one categorical variable. The continuous variables (MZW: muzzle width; HM3: unworn lower third molar crown height; WM3: unworn lower third molar crown width), measured in cm, come from Mendoza et al. (2002; J. Zool.). The categorical variable (FS, i.e. feeding style: B=browsers, G=grazers, M=mixed feeders) is based on Pérez–Barbería and Gordon (2001; Proc. R. Soc. B: Biol. Sci.). Taxonomic mismatches between these two sources were resolved based on Wilson and Reeder (2005; Johns Hopkins University Press). Only taxa with full entries for all these variables were included (i.e. no missing data allowed).

    The phylogenetic tree is pruned from the unsmoothed mammalian timetree of Hedges et al. (2015; MBE) to only include the 104 ungulate species for which there is complete phenotypic data available. Wilson and Reeder (2005; Johns Hopkins University Press) was used again to resolve taxonomic mismatches with the phenotypic data. The branch lengths of the tree are scaled to unit height and thus informative of relative time.

    Ferula

    1) The phenotypic data are divided into two data sets for 75 species of Ferula and three species of Leutea: the first contains five continuous variables (no_ME) measured on mericarps (the dispersal unit of fruit in Apiaceae), whereas the second has the same variables together with measurement error (ME; see paper for computational details). Three continuous variables were measured on anatomical cross sections (ratio_canals_ln – the proportion of oil ducts covering the space between median and lateral ribs [dimensionless], mean_gr_peri_ln_um – periderm (fruit wall) thickness [μm], thick_wings_ln_um – wing thickness [μm]); the remaining two on whole mericarps (Wings_area_ln_mm – wings area [mm2], Seed_mass_ln_mg – seed mass [mg]).

    2) The phylogenetic tree was pruned from the tree obtained from the recent taxonomic revision of the genus (Panahi et al. 2018) to only include the 78 species for which the phenotypic data were generated. This tree and the associated alignment, composed of one nuclear and three plastid markers (Panahi et al. 2018), were used as input to the mcmctree software (Yang 2007) to obtain a dated tree using a secondary calibration point for the root based on Banasiak et al.’s (2013) work. The branch lengths of the final tree (Ferula_fruits_tree.txt) were scaled to unit height and are thus informative of relative time.

    The R setup for the manuscript was as follows:

    R version 3.6.1 (2019-09-12) Platform: x86_64-pc-linux-gnu (64-bit) Running under: openSUSE Leap 42.3

    The exact output can depend on the random seed. However, the scripts offer the option of rerunning the analyses exactly as in the manuscript, i.e. the random seeds that were used to generate the results are saved, included and can be read in.

    The code is divided into several directories with scripts, random seeds and result files.

    1) LikelihoodTesting

    Directory contains the script test_rotation_invariance_mvSLOUCH.R that demonstrates that mvSLOUCH's likelihood calculations are rotation invariant.

    2) Carnivorans

    Directory contains files connected to the Carnivorans' vignette in mvSLOUCH.

    2.1) Carnivora_mvSLOUCH_objects_Full.RData

    Full output of running the R code in the vignette. Included with mvSLOUCH is a bare-minimum subset of this file that allows for the creation of the vignette.

    2.2) Carnivora_mvSLOUCH_objects.RData

    Reduced objects from Carnivora_mvSLOUCH_objects_Full.RData that are included with mvSLOUCH's vignette.                            

    2.3) Carnivora_mvSLOUCH_objects_remove_script.R               

    R script to reduce Carnivora_mvSLOUCH_objects_Full.RData to Carnivora_mvSLOUCH_objects.RData.     

    2.4) mvSLOUCH_Carnivorans.Rmd               

    The vignette itself.           

    2.5) refs_mvSLOUCH.bib               

    Bib file for the vignette.           

    2.6) ScaledTree.png, ScaledTree2.png, ScaledTree3.png, ScaledTree4.png   

    Plots of phylogenetic trees for vignette.

    3) SimulationStudy

    Directory contains all the output of the simulation study presented in the manuscript, scripts that allow for replication (the random number generator seeds are also provided) or for running one's own simulation study, and scripts to generate graphs and the model comparison summary. This study was done using version 2.6.2 of mvSLOUCH. If one reruns using mvSLOUCH >= 2.7, then one will obtain different (corrected) values of R2 and an additional R2 version.

    4) Ungulates

    Directory contains files connected to the "Feeding styles and oral morphology in ungulates" analyses performed for the manuscript.       

    4.1) Data.csv       

    The phenotypic data includes three continuous variables and one categorical variable. Continuous variables (MZW: muzzle width; HM3: unworn lower third molar crown height; WM3: unworn lower third molar crown width) from Mendoza et al. (2002), measured in cm. Categorical variable (FS, i.e. feeding style: B=browsers, G=grazers, M=mixed feeders) based on Pérez–Barbería and Gordon (2001). Phylogeny pruned from Hedges et al. (2015).

    Taxonomic mismatches among these sources were resolved based on Wilson and Reeder (2005).

    Hedges, S. B., J. Marin, M. Suleski, M. Paymer, and S. Kumar. 2015. Tree of life reveals clock-like speciation and diversification. Molecular Biology and Evolution 32:835-845.
    Mendoza, M., C. M. Janis, and P. Palmqvist. 2002. Characterizing complex craniodental patterns related to feeding behaviour in ungulates: a multivariate approach. Journal of Zoology 258:223-246.
    Pérez–Barbería, F. J., and I. J. Gordon. 2001. Relationships between oral morphology and feeding style in the Ungulata: a phylogenetically controlled evaluation. Proceedings of the Royal Society of London. Series B: Biological Sciences 268:1023-1032.
    Wilson, D. E., and D. M. Reeder. 2005. Mammal species of the world: A taxonomic and geographic reference. Johns Hopkins University Press, Baltimore, Maryland.

    4.2) Tree.tre       

    Ungulates' phylogeny, extracted from the mammalian phylogeny of Hedges, S. B., J. Marin, M. Suleski, M. Paymer, and S. Kumar. 2015. Tree of life reveals clock–like speciation and diversification. Mol. Biol. Evol. 32:835–845.           

    4.3) OUB.R, OUF.R, OUG.R       

    R scripts for the analyses performed in the manuscript. Different files correspond to different regime setups of the feeding style variable.           

    4.4) OU1.txt, OUB.txt, OUF.txt, OUG.txt       

    Outputs of the model comparison conducted under the R scripts presented above (4.3). Different files correspond to different regime setups of the feeding style variable.        

    5) Ferula analyses

    The models_ME directory contains input and output files from the mvSLOUCH analyses of the Ferula data with measurement error included, while the models_no_ME directory contains the analyses of the data without measurement error. In each directory, one can find the following files:

    - input files: Data_ME.csv (with measurement error) or Data_no_ME.csv (without measurement error) and tree file in Newick format (Ferula_fruits_tree.txt); the trait names in data files are abbreviated as follows: ration_canals – the proportion of oil ducts covering the space between median and lateral ribs, mean_gr_peri – periderm thickness, wings_area – wing area, thick_wings – wing thickness and seed_mass – seed mass,

    - the results for 8 analyzed models (see Fig. 2 in the main text), each in a separate directory named model1, model2 and so on,

    - each model directory comprises the following files: two R scripts (for analyses with diagonal and with upper triangular matrix Σyy; each model was run 1000 times), two csv files with information such as the repetition number (i), the seed for the preliminary analyses generating the starting point (seed_start_point), the seed for the main analyses (seed) and AIC, AICc, SIC, BIC, R2 and loglik for each model run (these csv files are sorted according to AICc values), two directories containing results for the 1000 analyses, and two files extracted from these directories showing parameter estimates for the best models (with UpperTri and Diagonal matrix Σyy).
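
    As a hedged illustration of how such per-run summaries can be ranked, the sketch below computes the standard small-sample corrected AIC, AICc = -2 loglik + 2k + 2k(k+1)/(n-k-1), and picks the run with the lowest value. The toy table, its values, the parameter count k and the observation count n are made up for the example; only the column names i, seed and loglik follow the description above, and the helper names are hypothetical.

```python
import csv
import io


def aicc(loglik, k, n):
    """Small-sample corrected AIC for k parameters and n observations."""
    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)


# Made-up stand-in for one of the per-model csv files.
toy_csv = """i,seed,loglik
1,101,-250.0
2,102,-245.5
3,103,-247.2
"""


def best_run(csv_text, k, n):
    """Return the row (as a dict) with the lowest AICc."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["AICc"] = aicc(float(row["loglik"]), k, n)
    rows.sort(key=lambda row: row["AICc"])  # same ordering idea as the csv files
    return rows[0]


print(best_run(toy_csv, k=5, n=100)["i"])  # 2
```

    With k and n fixed across repeated runs of one model, ranking by AICc is the same as ranking by log-likelihood; the correction matters when comparing models with different numbers of parameters.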

  • 47.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Fuentes-Gonzalez, Jesualdo
    Florida Int Univ, FL USA; Florida Int Univ, FL USA.
    Mitov, Venelin
    IntiQuan GmbH, Switzerland.
    Pienaar, Jason
    Florida Int Univ, FL USA; Florida Int Univ, FL USA.
    Piwczynski, Marcin
    Nicolaus Copernicus Univ Torun, Poland.
    Puchalka, Radoslaw
    Nicolaus Copernicus Univ Torun, Poland.
    Spalik, Krzysztof
    Univ Warsaw, Poland.
    Voje, Kjetil Lysne
    Univ Oslo, Norway.
    Analytical advances alleviate model misspecification in non-Brownian multivariate comparative methods2024In: Evolution, ISSN 0014-3820, E-ISSN 1558-5646, Vol. 78, no 3, p. 389-400Article in journal (Refereed)
    Abstract [en]

    Adams and Collyer argue that contemporary multivariate (Gaussian) phylogenetic comparative methods are prone to favouring more complex models of evolution and sometimes rotation invariance can be an issue. Here we dissect the concept of rotation invariance and point out that, depending on the understanding, this can be an issue with any method that relies on numerical instead of analytical estimation approaches. We relate this to the ongoing discussion concerning phylogenetic principal component analysis. Contrary to what Adams and Collyer found, we do not observe a bias against the simpler Brownian motion process in simulations when we use the new, improved, likelihood evaluation algorithm employed by mvSLOUCH, which allows for studying much larger phylogenies and more complex model setups.

  • 48.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences. Uppsala University, Sweden.
    Glemin, Sylvain
    Uppsala University, Sweden; CNRS University of Montpellier IRD EPHE, France.
    Kaj, Ingemar
    Uppsala University, Sweden.
    Lascoux, Martin
    Uppsala University, Sweden.
    Using the Ornstein-Uhlenbeck process to model the evolution of interacting populations2017In: Journal of Theoretical Biology, ISSN 0022-5193, E-ISSN 1095-8541, Vol. 429, p. 35-45Article in journal (Refereed)
    Abstract [en]

    The Ornstein-Uhlenbeck (OU) process plays a major role in the analysis of the evolution of phenotypic traits along phylogenies. The standard OU process includes random perturbations and stabilizing selection and assumes that species evolve independently. However, evolving species may interact through various ecological processes and also exchange genes, especially in plants. This is particularly true if we want to study phenotypic evolution among diverging populations within species. In this work we present a straightforward statistical approach with analytical solutions that allows for the inclusion of adaptation and migration in a common phylogenetic framework, which can also be useful for studying local adaptation among populations within the same species. We furthermore present a detailed simulation study that clearly indicates the adverse effects of ignoring migration. Similarity between species due to migration could be misinterpreted as very strong convergent evolution without proper correction for these additional dependencies. Finally, we show that our model can be interpreted in terms of ecological interactions between species, providing a general framework for the evolution of traits between "interacting" species or populations.

  • 49.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Gonzalez, Jesualdo Fuentes
    Department of Biological Sciences, Florida International University, Miami, Fl 33199, USA.
    Mitov, Venelin
    IntiQuan GmbH, Basel, Switzerland.
    Pienaar, Jason
    Department of Biological Sciences and the Institute of Environment, Florida International University, Miami, Fl 33199, USA.
    Piwczyński, Marcin
    Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Toruń, Poland.
    Puchałka, Radosław
    Department of Ecology and Biogeography, Nicolaus Copernicus University in Toruń, Toruń, Poland.
    Spalik, Krzysztof
    Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warszawa, Poland.
    Voje, Kjetil Lysne
    Natural History Museum, University of Oslo, Oslo, Norway.
    Model Selection Performance in Phylogenetic Comparative Methods Under Multivariate Ornstein–Uhlenbeck Models of Trait Evolution. 2023. In: Systematic Biology, ISSN 1063-5157, E-ISSN 1076-836X, Vol. 72, no 2, p. 275-293. Article in journal (Refereed)
    Abstract [en]

    The advent of fast computational algorithms for phylogenetic comparative methods allows for considering multiple hypotheses concerning the co-adaptation of traits and also for studying whether it is possible to distinguish between such models based on contemporary species measurements. Here we demonstrate how one can perform a study with multiple competing hypotheses using mvSLOUCH by analyzing two data sets, one concerning feeding styles and oral morphology in ungulates, and the other concerning fruit evolution in Ferula (Apiaceae). We also perform simulations to determine if it is possible to distinguish between various adaptive hypotheses. We find that Akaike's information criterion corrected for small sample size has the ability to distinguish between most pairs of considered models. However, in some cases there seems to be bias towards Brownian motion or simpler Ornstein-Uhlenbeck models. We also find that measurement error and forcing the sign of the diagonal of the drift matrix for an Ornstein-Uhlenbeck process influence identifiability capabilities. It is a cliche that some models, despite being imperfect, are more useful than others. Nonetheless, having a much larger repertoire of models will surely lead to a better understanding of the natural world, as it will allow for dissecting in what ways they are wrong. [Adaptation; AICc; model selection; multivariate Ornstein-Uhlenbeck process; multivariate phylogenetic comparative methods; mvSLOUCH.]

  • 50.
    Bartoszek, Krzysztof
    et al.
    Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning. Linköping University, Faculty of Arts and Sciences.
    Guidotti, Emanuele
    Univ Neuchatel, Switzerland.
    Iacus, Stefano Maria
    European Commiss, Italy.
    Okroj, Marcin
    Univ Gdansk, Poland; Med Univ Gdansk, Poland.
    Are official confirmed cases and fatalities counts good enough to study the COVID-19 pandemic dynamics? A critical assessment through the case of Italy. 2020. In: Nonlinear dynamics, ISSN 0924-090X, E-ISSN 1573-269X, Vol. 101, p. 1951-1979. Article in journal (Refereed)
    Abstract [en]

    As the COVID-19 outbreak is developing, the two most frequently reported statistics seem to be the raw confirmed case and case fatality counts. Focusing on Italy, one of the hardest hit countries, we look at how these two values could be put in perspective to reflect the dynamics of the virus spread. In particular, we find that merely considering the confirmed case counts would be very misleading. The number of daily tests grows, while the daily fraction of confirmed cases to total tests has a change point. Depending on the region, it generally increases with strong fluctuations until around 15-22 March and then decreases linearly. Combined with the increasing trend of daily performed tests, the raw confirmed case counts are not representative of the situation and are confounded with the sampling effort. We observe this when regressing on time the logged fraction of positive tests and, for comparison, the logged raw confirmed count. Hence, calibrating model parameters for this virus's dynamics should not be based only on confirmed case counts (without rescaling by the number of tests), but should also take fatality and hospitalization counts into consideration as variables not prone to distortion by testing efforts. Furthermore, reporting statistics on the national level does not say much about the dynamics of the disease, which are taking place at the regional level. These findings are based on the official data of total death counts up to 15 April 2020 released by ISTAT and up to 10 May 2020 for the number of cases. In this work, we do not fit models but rather investigate whether this task is possible at all. This work also informs about a new tool to collect and harmonize official statistics coming from different sources in the form of a package for the R statistical environment and presents the "COVID-19 Data Hub."
