Data-Driven Robot Perception in the Wild
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

As technology continues to advance, interest increases in relieving humans of tedious or dangerous tasks through automation. Some of the tasks that have received increasing attention are autonomous driving, disaster relief, and forestry inspection. Developing and deploying an autonomous robotic system safely in this type of unconstrained environment is highly challenging. The system requires precise control and high-level decision making, both of which require a robust and reliable perception system to understand the surroundings correctly.

The main purpose of perception is to extract meaningful information from the environment, be it in the form of 3D maps, dense classification of object and surface types, or high-level information about the position and direction of moving objects. Depending on the limitations and application of the system, various types of sensors can be used: lidars, to collect sparse depth information; cameras, to collect dense information for different parts of the visual spectrum, often the red-green-blue (RGB) bands; Inertial Measurement Units (IMUs), to estimate the ego-motion; microphones, to interact with and respond to humans; and GPS receivers, to get global position information; to mention just a few.

This thesis investigates some of the necessities for approaching the requirements of this type of system. Specifically, it focuses on data-driven approaches, that is, machine learning, which in recent years has repeatedly proven to be the leading contender for high-performance perception tasks. Although precision requirements may be high in industrial production plants, the environment there is relatively controlled and the task is fixed. Instead, this thesis studies some of the aspects necessary for complex, unconstrained environments, primarily outdoors and potentially near humans or other systems. The term in the wild refers precisely to the unconstrained nature of these environments, where the system can easily encounter something previously unseen and might interact with unknowing humans. Some examples of such environments are: city traffic, disaster relief scenarios, and dense forests.

This thesis will mainly focus on the following three key aspects necessary to handle the types of tasks and situations that could occur in the wild: 1) generalizing to a new environment, 2) adapting to new tasks and requirements, and 3) modeling uncertainty in the perception system. 

First, a robotic system should be able to generalize to new environments and still function reliably. Papers B and G address this by using an intermediate representation that allows the system to handle far more diverse environments than would otherwise be possible. Paper B also investigates how robust the proposed autonomous driving system is to incorrect predictions, which are a likely consequence of a changed environment.

Second, a robot should be sufficiently adaptive to learn new tasks without forgetting previous ones. Paper E proposes a way to incrementally add new semantic classes to a trained model without access to the previous training data. The approach utilizes the uncertainty in the predictions to model the unknown classes, marked as background.

Finally, the perception system will always be partially flawed, either because of limited modeling capability or because of ambiguities in the sensor data. To properly take this into account, it is fundamental that the system can estimate the certainty of its predictions. Paper F proposes a method for predicting the uncertainty in the model predictions when interpolating sparse data. Paper G addresses the ambiguities that exist when estimating the 3D pose of a human from a single camera image.

Abstract [sv]

As technology advances, interest grows in easing the burden on humans by automating certain dangerous or strenuous tasks. Some of the areas with potential for automation are: transport, through self-driving cars; rescue work in connection with disasters; and inventory of forests and the like. These kinds of complicated and potentially dangerous environments require advanced decision-making systems as well as precise control systems. Both of these parts require robust and reliable perception of the surroundings.

The main purpose of perception is to extract meaningful information from the surroundings that can support the planning and execution of different types of tasks. The information can take the form of 3D maps, detailed information about the type of surface, and information about individual objects in terms of their position and motion. An autonomous system can be constructed in several ways, but some of the common sensors used are: lidar, to collect sparse 3D measurements of surfaces and obstacles; cameras, to collect color or temperature information from objects in the surroundings; IMUs, to estimate how the system moves; and GPS, to position the system outdoors in a global coordinate system.

This thesis investigates some of the components required to meet the existing perception requirements. The focus of the thesis is on machine learning, which has been shown to handle many advanced tasks robustly. The thesis does not focus on the high-precision requirements of industrial manufacturing; instead, the focus is on handling the complicated and challenging environments classified as in the wild. Some examples of this type of environment are: city traffic, disaster areas, and dense forests.

Three aspects of the problem are treated in this thesis: 1) generalization to other environments, 2) adaptation to new tasks and environments, and 3) modeling of possible uncertainties.

An autonomous system should preferably not be limited to one type of environment; for example, a self-driving car should not only be able to handle bright sunshine on motorways in good condition. Papers B and G address this in part by separating the task into two subproblems, where the first generates the input data for the second. Training data for the first subproblem is easier to collect from varied environments, which makes the system more general than if only training data for the complete problem were available. Paper B also investigates how error sources in this representation affect the system as a whole.

An autonomous system should also be designed so that it can be adapted to new tasks efficiently. Paper E investigated this problem from the perspective of extending the set of classes known to the system, without retraining it completely.

Finally, one must accept that perception will never be perfect in all types of environments; there will always be some uncertainty. This uncertainty can come from the model itself, but it is also possible that the sensor data is insufficient to determine which of several possibilities is the true one. Paper F designed a system to estimate the uncertainty in its estimates, while Paper G focuses on handling the uncertainty about a person's pose when part of the body is occluded.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2023, p. 45
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2293
National Category
Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-192087
DOI: 10.3384/9789180750677
ISBN: 9789180750660 (print)
ISBN: 9789180750677 (electronic)
OAI: oai:DiVA.org:liu-192087
DiVA, id: diva2:1740415
Public defence
2023-03-31, Ada Lovelace, B-building and online via: https://liu-se.zoom.us/j/63470801417, Campus Valla, Linköping, 09:15 (English)
Opponent
Supervisors
Note

Funding agencies: the European Union's Horizon 2020 Program; Sweden's Innovation Agency (Vinnova); the Swedish Research Council (VR); and the Swedish Foundation for Strategic Research (SSF).

Available from: 2023-03-01 Created: 2023-03-01 Last updated: 2025-02-07. Bibliographically approved
List of papers
1. Computing a Collision-Free Path using the monogenic scale space
2018 (English). In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2018, p. 8097-8102. Conference paper, Published paper (Refereed)
Abstract [en]

Mobile robots are used for a variety of purposes that require them to move freely in environments containing both static and dynamic obstacles. One of the most important capabilities when navigating a mobile robot in such an environment is finding a safe path to a goal position. This paper shows that there exists an accurate solution to the Laplace equation that allows a collision-free path to be found, and that it can be computed efficiently for a rectangular bounded domain, such as a map represented as an image. This is accomplished using the monogenic scale space, resulting in a vector field that describes the attracting and repelling forces from the obstacles and the goal. The method is shown to work in reasonably convex domains, and, through tessellation of the environment map, in non-convex environments.
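The core idea, a collision-free path from a solution to the Laplace equation, can be sketched without the monogenic scale space machinery: hold obstacles at high potential and the goal at low potential, relax the interior to a harmonic field, and follow the steepest descent. The function names and the Jacobi solver below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def harmonic_potential(occupancy, goal, iters=5000):
    """Relax phi to a harmonic function: obstacles held at 1.0, goal at 0.0.

    A harmonic potential has no local minima in the free space, so greedy
    descent from any free cell leads to the goal (Jacobi iteration sketch).
    """
    phi = np.ones(occupancy.shape, dtype=float)
    phi[goal] = 0.0
    free = occupancy == 0
    free[goal] = False  # keep the goal pinned at potential 0
    for _ in range(iters):
        # each free cell becomes the average of its four neighbours
        avg = 0.25 * (np.roll(phi, 1, 0) + np.roll(phi, -1, 0)
                      + np.roll(phi, 1, 1) + np.roll(phi, -1, 1))
        phi = np.where(free, avg, phi)
    return phi

def descend(phi, start, max_steps=1000):
    """Follow the steepest-descent neighbour until the goal (phi == 0)."""
    path, (r, c) = [start], start
    for _ in range(max_steps):
        if phi[r, c] <= 0.0:
            break
        r, c = min(((r + dr, c + dc)
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))),
                   key=lambda p: phi[p])
        path.append((r, c))
    return path
```

Because the relaxed field is harmonic, the greedy descent cannot get stuck in a local minimum, which is the property the paper exploits.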

Place, publisher, year, edition, pages
IEEE, 2018
Series
International Conference on Intelligent Robots and Systems (IROS), ISSN 2153-0858
National Category
Computer graphics and computer vision; Engineering and Technology
Identifiers
urn:nbn:se:liu:diva-152713 (URN)
10.1109/IROS.2018.8593583 (DOI)
000458872707044 ()
978-1-5386-8094-0 (ISBN)
978-1-5386-8095-7 (ISBN)
978-1-5386-8093-3 (ISBN)
Conference
IROS 2018, Madrid, Spain, October 1-5, 2018
Note

Funding agencies: This work was funded by the European Union's Horizon 2020 Programme under grant agreement 644839 (CENTAURO).

Available from: 2018-11-16 Created: 2018-11-16 Last updated: 2025-02-01
2. A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control
2021 (English). In: 2020 25th International Conference on Pattern Recognition (ICPR), IEEE Computer Society, 2021, p. 3947-3954. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we present a state-of-the-art reinforcement learning method for autonomous driving. Our approach employs temporal-difference learning in a Bayesian framework to learn vehicle control signals from sensor data. The agent has access to images from a forward-facing camera, which are pre-processed to generate semantic segmentation maps. We trained our system using both ground-truth and estimated semantic segmentation input. Based on our observations from a large set of experiments, we conclude that training the system on ground-truth input data leads to better performance than training on estimated input, even if estimated input is used for evaluation. The system is trained and evaluated in a realistic simulated urban environment using the CARLA simulator, which also contains a benchmark that allows comparison with other systems and methods. The required training time of the system is shown to be lower, and the performance on the benchmark superior, compared to competing approaches.
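The temporal-difference learning at the core of this approach can be illustrated in its simplest tabular form. The sketch below is a minimal TD(0) value-learning loop on a toy chain MDP, not the paper's Bayesian, image-based agent; the environment and function name are invented for illustration.

```python
def td0_chain(n_states=5, episodes=2000, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a deterministic chain: state s steps to s+1,
    with reward 1.0 on entering the terminal state n_states-1.

    Update rule: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
    """
    V = [0.0] * n_states  # terminal state keeps value 0 by convention
    for _ in range(episodes):
        for s in range(n_states - 1):
            s_next = s + 1
            terminal = s_next == n_states - 1
            r = 1.0 if terminal else 0.0
            target = r + (0.0 if terminal else gamma * V[s_next])
            V[s] += alpha * (target - V[s])  # TD(0) update
        # values converge toward gamma-discounted distance to the reward
    return V
```

On this chain the learned values approach 1.0, 0.9, 0.81, ... as the state moves away from the reward, showing how the TD target propagates value backwards through the state space.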

Place, publisher, year, edition, pages
IEEE Computer Society, 2021
Series
International Conference on Pattern Recognition, ISSN 1051-4651
Keywords
Reinforcement Learning; Semantic Segmentation; Autonomous Driving; Bayesian method
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-178788 (URN)
10.1109/ICPR48806.2021.9412200 (DOI)
000678409204009 ()
978-1-7281-8808-9 (ISBN)
Conference
25th International Conference on Pattern Recognition (ICPR), online, January 10-15, 2021
Note

Funding agencies: SSF project [RIT15-0097]; Wallenberg AI, Autonomous Systems and Software Program (WASP), funded by the Knut and Alice Wallenberg Foundation

Available from: 2021-09-01 Created: 2021-09-01 Last updated: 2025-02-07
3. Flexible Disaster Response of Tomorrow: Final Presentation and Evaluation of the CENTAURO System
2019 (English). In: IEEE Robotics & Automation Magazine, ISSN 1070-9932, E-ISSN 1558-223X, Vol. 26, no. 4, p. 59-72. Article in journal (Refereed), Published
Abstract [en]

Mobile manipulation robots have great potential for roles in support of rescuers on disaster-response missions. Robots can operate in places too dangerous for humans and therefore can assist in accomplishing hazardous tasks while their human operators work at a safe distance. We developed a disaster-response system that consists of the highly flexible Centauro robot and suitable control interfaces, including an immersive telepresence suit and support-operator controls offering different levels of autonomy.

Place, publisher, year, edition, pages
IEEE, 2019
Keywords
Robot sensing systems; Task analysis; Hardware; Batteries; Legged locomotion
National Category
Robotics and automation
Identifiers
urn:nbn:se:liu:diva-162953 (URN)
10.1109/MRA.2019.2941248 (DOI)
000502779800009 ()
Note

Funding agencies: European Union (EU) [644839]

Available from: 2020-01-02 Created: 2020-01-02 Last updated: 2025-02-09
4. Class-Incremental Learning for Semantic Segmentation - A study
2021 (English). In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, p. 25-28. Conference paper, Published paper (Refereed)
Abstract [en]

One of the main challenges of applying deep learning in robotics is the difficulty of efficiently adapting to new tasks while maintaining the same performance on previous tasks. Incrementally learning new tasks commonly suffers from catastrophic forgetting, in which previous knowledge is lost. Class-incremental learning for semantic segmentation addresses this problem: we want to learn new semantic classes without access to labeled data for previously learned classes. This is a problem in industry, where few pre-trained models and open datasets match the requirements exactly. In these cases it is both expensive and labour-intensive to collect an entirely new fully-labeled dataset. Instead, collecting a smaller dataset and labeling only the new classes is much more efficient in terms of data collection. In this paper we present the class-incremental learning problem for semantic segmentation, discuss related work in terms of the more thoroughly studied classification task, and experimentally validate the current state of the art for semantic segmentation. This lays the foundation as we discuss some of the problems that still need to be investigated and improved upon in order to reach a new state of the art for class-incremental semantic segmentation.
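The incremental setting described, adding classes without retraining from scratch, can be sketched at the level of the classifier head. The helper below is hypothetical and purely illustrative (the paper studies full class-incremental segmentation methods, not this simplification): old class weights are carried over unchanged, and only rows for the new classes are added.

```python
import numpy as np

def extend_head(W_old, b_old, n_new, seed=0):
    """Append n_new class rows to a (num_classes, feat_dim) linear head.

    Old class weights are copied unchanged; in an incremental step they
    would typically be frozen or regularized (e.g. via distillation) while
    the new rows are trained on data labeled only for the new classes.
    """
    rng = np.random.default_rng(seed)
    W_new = rng.normal(scale=0.01, size=(n_new, W_old.shape[1]))  # small init
    return np.vstack([W_old, W_new]), np.concatenate([b_old, np.zeros(n_new)])
```

Keeping the old rows intact is what makes forgetting a training-dynamics problem rather than an architectural one: the capacity for the old classes is preserved, but gradient updates on new data can still degrade it.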

Place, publisher, year, edition, pages
IEEE, 2021
Keywords
Industries, Deep learning, Conferences, Semantics, Labeling, Task analysis, Artificial intelligence
National Category
Computer Sciences
Identifiers
urn:nbn:se:liu:diva-189039 (URN)
10.1109/sais53221.2021.9483955 (DOI)
000855522600007 ()
9781665442367 (ISBN)
9781665442374 (ISBN)
Conference
2021 Swedish Artificial Intelligence Society Workshop (SAIS), 14-15 June 2021, Sweden
Funder
Vinnova
Note

Funding agencies: Vinnova [2020-02838]

Available from: 2022-10-08 Created: 2022-10-08 Last updated: 2023-03-01. Bibliographically approved
5. Evidential Deep Learning for Class-Incremental Semantic Segmentation
2023 (English). In: Image Analysis. SCIA 2023 / [ed] Rikke Gade, Michael Felsberg, Joni-Kristian Kämäräinen, Springer, 2023, p. 32-48. Conference paper, Published paper (Refereed)
Abstract [en]

Class-incremental learning is a challenging problem in machine learning that aims to extend previously trained neural networks with new classes. This is especially useful if the system must be able to classify new objects even though the original training data is unavailable. Although the semantic segmentation problem has received less attention than classification, it poses distinct problems and challenges, since previous and future target classes can be unlabeled in the images of a single increment. In this case, the background, past, and future classes are correlated, and a background shift occurs.

In this paper, we address the problem of how to model unlabeled classes while avoiding spurious feature clustering of future uncorrelated classes. We propose to use Evidential Deep Learning to model the evidence of the classes as a Dirichlet distribution. Our method factorizes the problem into a separate foreground class probability, calculated by the expected value of the Dirichlet distribution, and an unknown-class (background) probability corresponding to the uncertainty of the estimate. In our novel formulation, the background probability is implicitly modeled, avoiding the feature-space clustering that comes from forcing the model to output a high background score for pixels that are not labeled as objects. Experiments on the incremental Pascal VOC and ADE20k benchmarks show that our method is superior to the state of the art, especially when repeatedly learning new classes with an increasing number of increments.
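The Dirichlet factorization described can be sketched numerically. The function below is an illustrative reduction of the idea (non-negative evidence gives Dirichlet parameters, expected class probabilities come from alpha / S, and the Dirichlet uncertainty K / S acts as the implicit background score); it is not the paper's full model or training objective, and the evidence function is an assumption.

```python
import numpy as np

def dirichlet_factorization(logits):
    """Split per-pixel class scores into foreground class probabilities
    and an implicit background score via a Dirichlet over probabilities."""
    evidence = np.exp(np.clip(logits, -10.0, 10.0))  # e_k >= 0
    alpha = evidence + 1.0                           # Dirichlet parameters
    S = alpha.sum(axis=-1, keepdims=True)            # Dirichlet strength
    p_fg = alpha / S                  # expected foreground class probability
    u = alpha.shape[-1] / S[..., 0]   # uncertainty, used as background score
    return p_fg, u
```

A pixel with strong evidence for one class gets a low uncertainty, while a pixel with uniform, weak evidence gets a high one; the background never needs its own output channel, which is what avoids the spurious feature clustering discussed above.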

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13886
Keywords
Class-incremental learning, Continual-learning, Semantic Segmentation
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-193265 (URN)
10.1007/978-3-031-31438-4_3 (DOI)
001592157300003 ()
2-s2.0-85161371821 (Scopus ID)
9783031314377 (ISBN)
9783031314384 (ISBN)
Conference
SCIA 2023, 23rd Scandinavian Conference on Image Analysis. Sirkka, Finland, April 18–21, 2023
Note

Funding agencies: Sweden's Innovation Agency (Vinnova)

Available from: 2023-04-26 Created: 2023-04-26 Last updated: 2026-02-05. Bibliographically approved
6. Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End
2020 (English). In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2020, p. 12011-12020. Conference paper, Published paper (Refereed)
Abstract [en]

The focus in deep learning research has mostly been on pushing the limits of prediction accuracy. However, this has often been achieved at the cost of increased complexity, raising concerns about the interpretability and reliability of deep networks. Recently, increasing attention has been given to untangling the complexity of deep networks and quantifying their uncertainty for different computer vision tasks. In contrast, the task of depth completion has not received enough attention despite the inherently noisy nature of depth sensors. In this work, we thus focus on modeling the uncertainty of depth data in depth completion, from the sparse noisy input all the way to the final prediction. We propose a novel approach to identify disturbed measurements in the input by learning an input confidence estimator in a self-supervised manner based on normalized convolutional neural networks (NCNNs). Further, we propose a probabilistic version of NCNNs that produces a statistically meaningful uncertainty measure for the final prediction. When we evaluate our approach on the KITTI dataset for depth completion, we outperform all existing Bayesian deep learning approaches in terms of prediction accuracy, quality of the uncertainty measure, and computational efficiency. Moreover, our small network with 670k parameters performs on par with conventional approaches with millions of parameters. These results give strong evidence that separating the network into parallel uncertainty and prediction streams leads to state-of-the-art performance with accurate uncertainty estimates.
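The basic operation behind NCNNs, normalized convolution, can be sketched directly: filter the confidence-weighted depth, divide by the filtered confidence, and propagate the confidence alongside the signal. Below is a minimal box-filter version of this idea, an illustration only, not the paper's learned network or its probabilistic extension.

```python
import numpy as np

def normalized_convolution(depth, conf, ksize=3):
    """One step of confidence-weighted (normalized) averaging: missing
    pixels (conf == 0) are filled in from their observed neighbours,
    and a propagated confidence map is returned alongside the output.
    """
    pad = ksize // 2
    d = np.pad(depth * conf, pad)  # confidence-weighted signal
    c = np.pad(conf, pad)
    num = np.zeros_like(depth, dtype=float)
    den = np.zeros_like(depth, dtype=float)
    for dy in range(ksize):        # box-filter sum over the window
        for dx in range(ksize):
            num += d[dy:dy + depth.shape[0], dx:dx + depth.shape[1]]
            den += c[dy:dy + depth.shape[0], dx:dx + depth.shape[1]]
    out = np.where(den > 0, num / np.maximum(den, 1e-9), 0.0)
    return out, den / (ksize * ksize)  # filled depth, propagated confidence
```

Because the filter normalizes by the summed confidence rather than the window size, unobserved pixels contribute nothing to the average, which is what lets sparse lidar-style input be interpolated without biasing toward zero.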

Place, publisher, year, edition, pages
IEEE, 2020
Series
Conference on Computer Vision and Pattern Recognition (CVPR), ISSN 1063-6919, E-ISSN 2575-7075
Keywords
Uncertainty, Task analysis, Probabilistic logic, Measurement uncertainty, Noise measurement, Convolution, Computer vision
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:liu:diva-169106 (URN)
10.1109/CVPR42600.2020.01203 (DOI)
001309199904086 ()
978-1-7281-7168-5 (ISBN)
978-1-7281-7169-2 (ISBN)
Conference
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Available from: 2020-09-09 Created: 2020-09-09 Last updated: 2025-02-07

Authority records

Holmquist, Karl
