Publications (10 of 77)
Ekström Kelvinius, F. & Lindsten, F. (2024). Discriminator Guidance for Autoregressive Diffusion Models. In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics. Paper presented at International Conference on Artificial Intelligence and Statistics, 2-4 May 2024, Palau de Congressos, Valencia, Spain (pp. 3403-3411). PMLR, 238
Discriminator Guidance for Autoregressive Diffusion Models
2024 (English). In: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR, 2024, Vol. 238, pp. 3403-3411. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce discriminator guidance in the setting of autoregressive diffusion models. A discriminator has previously been used to guide the diffusion process in continuous diffusion models, and in this work we derive ways of using a discriminator together with a pretrained generative model in the discrete case. First, we show that using an optimal discriminator will correct the pretrained model and enable exact sampling from the underlying data distribution. Second, to account for the realistic scenario of using a sub-optimal discriminator, we derive a sequential Monte Carlo algorithm which iteratively takes the predictions from the discriminator into account during the generation process. We test these approaches on the task of generating molecular graphs and show how the discriminator improves the generative performance over using only the pretrained model.
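As an illustration of the sequential Monte Carlo variant, the sketch below shows one common way discriminator predictions can reweight particles during generation. It is a minimal sketch under assumptions, not the paper's exact algorithm: sample_next and discriminator are hypothetical stand-ins for the pretrained autoregressive model and for a classifier returning the probability that a partial sequence is real, and the ratio d/(1-d) is the usual density-ratio heuristic.

```python
import numpy as np

# Illustrative sketch only (not the paper's exact algorithm): sequential Monte Carlo
# where particles are propagated by a pretrained autoregressive model and reweighted
# by a (possibly sub-optimal) discriminator before resampling.
def smc_discriminator_guidance(sample_next, discriminator, n_particles=64, seq_len=20, rng=None):
    rng = rng or np.random.default_rng()
    particles = [[] for _ in range(n_particles)]
    for _ in range(seq_len):
        # Propose the next token for every particle from the pretrained model.
        particles = [p + [sample_next(p)] for p in particles]
        # d/(1-d) approximates the data/model density ratio for the partial sequences.
        d = np.array([discriminator(p) for p in particles])
        w = d / np.clip(1.0 - d, 1e-6, None)
        w = w / w.sum()
        # Resample particles in proportion to the discriminator-based weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = [list(particles[i]) for i in idx]
    return particles
```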

Place, publisher, year, edition, pages
PMLR, 2024
Series
Proceedings of Machine Learning Research, ISSN 2640-3498
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:liu:diva-203716 (URN)
Conference
International Conference on Artificial Intelligence and Statistics, 2-4 May 2024, Palau de Congressos, Valencia, Spain
Note

This article has a CC BY licence.

Available from: 2024-05-27. Created: 2024-05-27. Last updated: 2024-05-31. Bibliographically approved.
Olmin, A., Lindqvist, J., Svensson, L. & Lindsten, F. (2024). On the connection between Noise-Contrastive Estimation and Contrastive Divergence. Paper presented at International Conference on Artificial Intelligence and Statistics, 2-4 May 2024, Palau de Congressos, Valencia, Spain (pp. 3016-3024). Vol. 238
On the connection between Noise-Contrastive Estimation and Contrastive Divergence
2024 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Noise-contrastive estimation (NCE) is a popular method for estimating unnormalised probabilistic models, such as energy-based models, which are effective for modelling complex data distributions. Unlike classical maximum likelihood (ML) estimation that relies on importance sampling (resulting in ML-IS) or MCMC (resulting in contrastive divergence, CD), NCE uses a proxy criterion to avoid the need for evaluating an often intractable normalisation constant. Despite apparent conceptual differences, we show that two NCE criteria, ranking NCE (RNCE) and conditional NCE (CNCE), can be viewed as ML estimation methods. Specifically, RNCE is equivalent to ML estimation combined with conditional importance sampling, and both RNCE and CNCE are special cases of CD. These findings bridge the gap between the two method classes and allow us to apply techniques from the ML-IS and CD literature to NCE, offering several advantageous extensions.
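For reference, the ranking-NCE criterion and its gradient can be written as below. This is a sketch from standard definitions, with x_0 a data point, x_1, ..., x_K noise samples from q, and the unnormalised model written as p-tilde; the notation may differ from the paper.

```latex
% Ranking NCE: identify which of the K+1 samples is the data point.
J_{\mathrm{RNCE}}(\theta)
  = \mathbb{E}\!\left[\log
      \frac{\tilde p_\theta(x_0)/q(x_0)}
           {\sum_{k=0}^{K} \tilde p_\theta(x_k)/q(x_k)}\right],
\qquad
\nabla_\theta J_{\mathrm{RNCE}}(\theta)
  = \mathbb{E}\!\left[\nabla_\theta \log \tilde p_\theta(x_0)
      - \sum_{k=0}^{K} w_k\, \nabla_\theta \log \tilde p_\theta(x_k)\right],
\quad
w_k = \frac{\tilde p_\theta(x_k)/q(x_k)}{\sum_{j=0}^{K} \tilde p_\theta(x_j)/q(x_j)}.
```

The weighted sum is a self-normalised importance-sampling estimate of the intractable term in the ML gradient, computed with a proposal set that includes the data point itself, which is one way to read the conditional-importance-sampling equivalence stated in the abstract.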

Series
Proceedings of Machine Learning Research, ISSN 2640-3498
Keywords
Unnormalised models, noise-contrastive estimation, contrastive divergence
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:liu:diva-204020 (URN)
Conference
International Conference on Artificial Intelligence and Statistics, 2-4 May 2024, Palau de Congressos, Valencia, Spain
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

This article has a CC BY licence.

Available from: 2024-05-31. Created: 2024-05-31. Last updated: 2024-06-05. Bibliographically approved.
Ahmadian, A., Ding, Y., Eilertsen, G. & Lindsten, F. (2024). Unsupervised Novelty Detection in Pretrained Representation Space with Locally Adapted Likelihood Ratio. In: International Conference on Artificial Intelligence and Statistics 2024, Proceedings of Machine Learning Research. Paper presented at The 27th International Conference on Artificial Intelligence and Statistics (AISTATS).
Unsupervised Novelty Detection in Pretrained Representation Space with Locally Adapted Likelihood Ratio
2024 (English). In: International Conference on Artificial Intelligence and Statistics 2024, Proceedings of Machine Learning Research, 2024. Conference paper, Published paper (Refereed)
National Category
Computer Sciences; Computer Vision and Robotics (Autonomous Systems); Signal Processing
Identifiers
urn:nbn:se:liu:diva-203391 (URN)
Conference
The 27th International Conference on Artificial Intelligence and Statistics (AISTATS)
Available from: 2024-05-08 Created: 2024-05-08 Last updated: 2024-05-16
Zimmermann, H., Lindsten, F., Meent, J.-W. v. & Naesseth, C. A. (2023). A Variational Perspective on Generative Flow Networks. Transactions on Machine Learning Research
A Variational Perspective on Generative Flow Networks
2023 (English). In: Transactions on Machine Learning Research, E-ISSN 2835-8856. Article in journal (Refereed), Published
National Category
Probability Theory and Statistics; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-204028 (URN)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications; Wallenberg AI, Autonomous Systems and Software Program (WASP); Swedish Research Council, 2020-04122
Available from: 2024-06-01 Created: 2024-06-01 Last updated: 2024-06-01
Olmin, A., Lindqvist, J., Svensson, L. & Lindsten, F. (2023). Active Learning with Weak Supervision for Gaussian Processes. In: M. Tanveer et al. (Ed.), Neural Information Processing 29th International Conference, ICONIP 2022, Virtual Event, November 22–26, 2022, Proceedings, Part V. Paper presented at 29th International Conference on Neural Information Processing, ICONIP 2022, Virtual Event, November 22–26, 2022 (pp. 195-204). Singapore: Springer Nature
Active Learning with Weak Supervision for Gaussian Processes
2023 (English). In: Neural Information Processing 29th International Conference, ICONIP 2022, Virtual Event, November 22–26, 2022, Proceedings, Part V / [ed] M. Tanveer et al., Singapore: Springer Nature, 2023, pp. 195-204. Conference paper, Published paper (Refereed)
Abstract [en]

Annotating data for supervised learning can be costly. When the annotation budget is limited, active learning can be used to select and annotate those observations that are likely to give the most gain in model performance. We propose an active learning algorithm that, in addition to selecting which observation to annotate, selects the precision of the annotation that is acquired. Assuming that annotations with low precision are cheaper to obtain, this allows the model to explore a larger part of the input space, with the same annotation budget. We build our acquisition function on the previously proposed BALD objective for Gaussian Processes, and empirically demonstrate the gains of being able to adjust the annotation precision in the active learning loop.
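To make the trade-off concrete, the sketch below computes a BALD-style information gain for Gaussian-process regression where the chosen annotation precision sets the Gaussian noise variance, and divides the gain by a per-precision cost. The noise levels, cost model, and gain-per-cost selection rule are assumptions for illustration and not necessarily the paper's acquisition function.

```python
import numpy as np

# Hedged sketch: BALD-style acquisition for GP regression with selectable annotation
# precision. For Gaussian noise, I(y; f | x) = 0.5 * log(1 + Var[f(x)] / noise_var).
def bald_gain(posterior_var_f, noise_var):
    return 0.5 * np.log1p(posterior_var_f / noise_var)

def select_query(candidate_vars, noise_levels, costs):
    """Pick the (candidate index, noise variance) pair with the largest gain per unit cost."""
    best, best_score = None, -np.inf
    for i, var_f in enumerate(candidate_vars):
        for noise_var, cost in zip(noise_levels, costs):
            score = bald_gain(var_f, noise_var) / cost
            if score > best_score:
                best, best_score = (i, noise_var), score
    return best, best_score

# Example: three candidates, two annotation precisions (cheap/noisy vs expensive/precise).
print(select_query(candidate_vars=[0.5, 2.0, 1.0], noise_levels=[4.0, 0.5], costs=[1.0, 5.0]))
```

Lowering the precision (a larger noise variance) reduces the information gain per point but also its cost, which is the trade-off the abstract exploits to cover a larger part of the input space for the same budget.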

Place, publisher, year, edition, pages
Singapore: Springer Nature, 2023
Series
Communications in Computer and Information Science, ISSN 1865-0929, E-ISSN 1865-0937 ; 1792
Keywords
Machine learning, Active learning, Weak supervision
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-195039 (URN); 10.1007/978-981-99-1642-9_17 (DOI); 978-981-99-1641-2 (ISBN); 978-981-99-1642-9 (ISBN)
Conference
29th International Conference on Neural Information Processing, ICONIP 2022, Virtual Event, November 22–26, 2022
Available from: 2023-06-14 Created: 2023-06-14 Last updated: 2023-06-15
Govindarajan, H., Sidén, P., Roll, J. & Lindsten, F. (2023). DINO as a von Mises-Fisher mixture model. In: The Eleventh International Conference on Learning Representations. Paper presented at The Eleventh International Conference on Learning Representations, ICLR 2023.
DINO as a von Mises-Fisher mixture model
2023 (English). In: The Eleventh International Conference on Learning Representations, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Self-distillation methods using Siamese networks are popular for self-supervised pre-training. DINO is one such method, based on a cross-entropy loss between K-dimensional probability vectors obtained by applying a softmax function to the dot product between representations and learnt prototypes. Since the learned representations are L2-normalized, we show that DINO and its derivatives, such as iBOT, can be interpreted as a mixture model of von Mises-Fisher components. With this interpretation, DINO assumes equal precision for all components when the prototypes are also L2-normalized. Using this insight we propose DINO-vMF, which adds appropriate normalization constants when computing the cluster assignment probabilities. Unlike DINO, DINO-vMF is stable also for the larger ViT-Base model with unnormalized prototypes. We show that the added flexibility of the mixture model is beneficial in terms of better image representations. The DINO-vMF pre-trained model consistently performs better than DINO on a range of downstream tasks. We obtain similar improvements for iBOT-vMF vs. iBOT and thereby show the relevance of our proposed modification also for other methods derived from DINO.
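The sketch below illustrates the core idea of adding von Mises-Fisher normalisation constants to the cluster-assignment logits. The exact parameterisation, for instance how the concentration relates to the prototype norm and the softmax temperature, is an assumption here and may differ from the paper.

```python
import numpy as np
from scipy.special import ive  # exponentially scaled modified Bessel function of the first kind

def log_vmf_normaliser(kappa, d):
    """log C_d(kappa) for a vMF distribution on the unit sphere in R^d."""
    order = d / 2.0 - 1.0
    log_iv = np.log(ive(order, kappa)) + kappa  # log I_order(kappa), numerically stable
    return order * np.log(kappa) - (d / 2.0) * np.log(2.0 * np.pi) - log_iv

def vmf_assignment_probs(z, prototypes, temperature=0.1):
    """z: L2-normalised representation (d,); prototypes: (K, d), possibly unnormalised."""
    norms = np.linalg.norm(prototypes, axis=1)
    kappa = norms / temperature           # assumed relation between concentration and norm
    mu = prototypes / norms[:, None]      # unit mean directions
    # DINO-style logits plus the per-component log normalisation constant.
    logits = kappa * (mu @ z) + log_vmf_normaliser(kappa, z.shape[0])
    logits -= logits.max()
    p = np.exp(logits)
    return p / p.sum()
```

With L2-normalised prototypes all concentrations are equal, the normalisers cancel in the softmax, and the expression reduces to the original DINO assignment, consistent with the equal-precision interpretation above.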

Keywords
self-supervised learning, vision transformers, mixture models
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-203630 (URN)
Conference
The Eleventh International Conference on Learning Representations, ICLR 2023
Funder
Swedish Research Council, 2020-04122; Wallenberg AI, Autonomous Systems and Software Program (WASP); ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications
Available from: 2024-05-21 Created: 2024-05-21 Last updated: 2024-05-31
Ahmadian, A. & Lindsten, F. (2023). Enhancing Representation Learning with Deep Classifiers in Presence of Shortcut. In: Proceedings of IEEE ICASSP 2023. Paper presented at 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Enhancing Representation Learning with Deep Classifiers in Presence of Shortcut
2023 (English). In: Proceedings of IEEE ICASSP 2023, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

A deep neural classifier trained on an upstream task can be leveraged to boost the performance of another classifier in a related downstream task through the representations learned in hidden layers. However, the presence of shortcuts (easy-to-learn features) in the upstream task can considerably impair the versatility of intermediate representations and, in turn, the downstream performance. In this paper, we propose a method to improve the representations learned by deep neural image classifiers in spite of a shortcut in the upstream data. In our method, the upstream classification objective is augmented with a type of adversarial training in which an auxiliary network, a so-called lens, fools the classifier by exploiting the shortcut when reconstructing images. Empirical comparisons in self-supervised and transfer learning problems with three shortcut-biased datasets suggest the advantages of our method in terms of downstream performance and/or training time.

Keywords
Deep Representation Learning, Shortcut Learning, Transfer Learning, Adversarial Methods, Computer Vision
National Category
Computer Sciences; Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-198763 (URN); 10.1109/ICASSP49357.2023.10096346 (DOI)
Conference
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Available from: 2023-10-26 Created: 2023-10-26 Last updated: 2023-11-01
Glaser, P., Widmann, D., Lindsten, F. & Gretton, A. (2023). Fast and Scalable Score-Based Kernel Calibration Tests. In: Thirty-Ninth Conference on Uncertainty in Artificial Intelligence: PMLR 216. Paper presented at Uncertainty in Artificial Intelligence.
Fast and Scalable Score-Based Kernel Calibration Tests
2023 (English). In: Thirty-Ninth Conference on Uncertainty in Artificial Intelligence: PMLR 216, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce the Kernel Calibration Conditional Stein Discrepancy test (KCCSD test), a non-parametric, kernel-based test for assessing the calibration of probabilistic models with well-defined scores. In contrast to previous methods, our test avoids the need for possibly expensive expectation approximations while providing control over its type-I error. We achieve these improvements by using a new family of kernels for score-based probabilities that can be estimated without probability density samples, and by using a conditional goodness-of-fit criterion for the KCCSD test’s U-statistic. The tractability of the KCCSD test widens the surface area of calibration measures to new promising use-cases, such as regularization during model training. We demonstrate the properties of our test on various synthetic settings.

National Category
Probability Theory and Statistics; Computer Sciences
Identifiers
urn:nbn:se:liu:diva-204029 (URN)
Conference
Uncertainty in Artificial Intelligence
Available from: 2024-06-01. Created: 2024-06-01. Last updated: 2024-06-27. Bibliographically approved.
Lindqvist, J., Olmin, A., Svensson, L. & Lindsten, F. (2023). Generalised Active Learning With Annotation Quality Selection. In: IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP). Paper presented at IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), Rome, Italy, 17-20 September, 2023.
Generalised Active Learning With Annotation Quality Selection
2023 (English). In: IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), 2023. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we promote a general formulation of active learning (AL), wherein the typically binary decision to annotate a point or not is extended to selecting the qualities with which the points should be annotated. By linking the annotation quality to the cost of acquiring the label, we can trade a lower quality for a larger set of training samples, which may improve learning for the same annotation cost. To investigate this AL formulation, we introduce a concrete criterion, based on the mutual information (MI) between model parameters and noisy labels, for selecting annotation qualities for the entire dataset, before any labels are acquired. We illustrate the usefulness of our formulation with examples for both classification and regression and find that MI is a good candidate for a criterion, but its complexity limits its usefulness.
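As one concrete instance of such an MI criterion, the sketch below computes I(theta; y) for a Bayesian linear model in which each point can be annotated with its own noise variance (its quality). The model and the budget comparison are hypothetical and simpler than the paper's setting.

```python
import numpy as np

def mi_params_labels(Phi, prior_cov, noise_vars):
    """I(theta; y) = 0.5 * [logdet(Phi^T D^-1 Phi + S^-1) - logdet(S^-1)], D = diag(noise_vars),
    for y = Phi @ theta + noise with prior theta ~ N(0, S)."""
    S_inv = np.linalg.inv(prior_cov)
    A = Phi.T @ (Phi / np.asarray(noise_vars)[:, None]) + S_inv
    return 0.5 * (np.linalg.slogdet(A)[1] - np.linalg.slogdet(S_inv)[1])

# Example: many cheap, noisy labels vs fewer precise ones for the same hypothetical budget.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(10, 3))
prior_cov = np.eye(3)
print(mi_params_labels(Phi, prior_cov, noise_vars=[4.0] * 10))     # 10 low-quality annotations
print(mi_params_labels(Phi[:5], prior_cov, noise_vars=[1.0] * 5))  # 5 high-quality annotations
```

The log-determinant grows with the number of annotated points and shrinks with the noise variance, so the criterion can favour either option depending on the features and the assumed cost of each quality level.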

Keywords
Training; Costs; Annotations; Conferences; Machine learning; Signal processing; Complexity theory; Active learning; noisy labels; mutual information
National Category
Computer Sciences; Probability Theory and Statistics
Identifiers
urn:nbn:se:liu:diva-204030 (URN); 10.1109/MLSP55844.2023.10285931 (DOI)
Conference
IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP), Rome, Italy, 17-20 September, 2023.
Available from: 2024-06-01. Created: 2024-06-01. Last updated: 2024-06-27. Bibliographically approved.
Varga, J., Karlsson, E., Raidl, G. R., Rönnberg, E., Lindsten, F. & Rodemann, T. (2023). Speeding Up Logic-Based Benders Decomposition by Strengthening Cuts with Graph Neural Networks. In: Giuseppe Nicosia, Varun Ojha, Emanuele La Malfa, Gabriele La Malfa, Panos M. Pardalos, Renato Umeton (Ed.), Machine Learning, Optimization, and Data Science. Paper presented at 9th International Conference, LOD 2023, Grasmere, UK, September 22–26, 2023 (pp. 24-38). Cham
Open this publication in new window or tab >>Speeding Up Logic-Based Benders Decomposition by Strengthening Cuts with Graph Neural Networks
2023 (English). In: Machine Learning, Optimization, and Data Science / [ed] Giuseppe Nicosia, Varun Ojha, Emanuele La Malfa, Gabriele La Malfa, Panos M. Pardalos, Renato Umeton, Cham, 2023, pp. 24-38. Conference paper, Published paper (Refereed)
Abstract [en]

Logic-based Benders decomposition is a technique for solving optimization problems to optimality. It works by splitting the problem into a master problem, which neglects some aspects of the problem, and a subproblem, which is used to iteratively produce cuts for the master problem that account for those aspects. It is critical for computational performance that these cuts are strengthened, but strengthening cuts comes at the cost of solving additional subproblems. In this work we apply a graph neural network in an autoregressive fashion to approximate the compilation of an irreducible cut, which then requires only a few postprocessing steps to ensure its validity. We test the approach on a job scheduling problem with a single machine and multiple time windows per job and compare to approaches from the literature. Results show that our approach considerably reduces the number of subproblems that need to be solved and hence the total computational effort.
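For intuition, cut strengthening is often implemented as a deletion filter: elements are tentatively removed from a candidate cut and a subproblem check confirms that the reduced cut is still valid. The sketch below is a generic version of that idea in which a learned scorer, such as a graph neural network, orders the removal attempts; it is an assumption-laden illustration, not the paper's algorithm.

```python
# Generic greedy deletion-filter sketch for strengthening a Benders cut.
# `importance` is a hypothetical learned scorer (e.g. a GNN head) estimating how important
# each element is to the cut; `subproblem_infeasible(subset)` re-checks the subproblem on
# the remaining elements, which guarantees the reduced cut stays valid.
def strengthen_cut(elements, importance, subproblem_infeasible):
    cut = list(elements)
    # Try to drop the elements the scorer deems least important first,
    # so fewer expensive subproblem checks are wasted on hopeless removals.
    for e in sorted(elements, key=importance):
        trial = [x for x in cut if x != e]
        if trial and subproblem_infeasible(trial):
            cut = trial  # the removal kept the cut valid, so accept it
    return cut
```

Per the abstract, the network goes further and directly proposes a near-irreducible cut so that only a few validity checks remain; the sketch above only uses learned scores to order removal attempts.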

Place, publisher, year, edition, pages
Cham, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14505
Keywords
Logic-based Benders Decomposition; Cut Strengthening; Graph Neural Networks; Job Scheduling
National Category
Computational Mathematics
Identifiers
urn:nbn:se:liu:diva-200891 (URN); 10.1007/978-3-031-53969-5_3 (DOI); 001217088300003 (); 9783031539688 (ISBN); 9783031539695 (ISBN)
Conference
9th International Conference, LOD 2023, Grasmere, UK, September 22–26, 2023
Note

Funding agencies: Honda Research Institute Europe

Available from: 2024-02-15. Created: 2024-02-15. Last updated: 2024-06-12. Bibliographically approved.
Projects
Sequential Monte Carlo Workshop [2017-00515_VR]; Uppsala University
Future Leaders Program of STS Forum [2018-03088_VINNOVA]; Uppsala University
Identifiers
ORCID iD: orcid.org/0000-0003-3749-5820
