Class-Incremental Learning for Semantic Segmentation - A study
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. Office of the National Police Commissioner, The Swedish Police Authority.
Linköping University, Department of Electrical Engineering, Computer Vision. Linköping University, Faculty of Science & Engineering. University of KwaZulu-Natal, Durban, South Africa. ORCID iD: 0000-0002-6096-3648
2021 (English). In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, p. 25-28. Conference paper, Published paper (Refereed)
Abstract [en]

One of the main challenges of applying deep learning in robotics is the difficulty of efficiently adapting to new tasks while maintaining the same performance on previous tasks. Incrementally learning new tasks commonly suffers from catastrophic forgetting, in which previously acquired knowledge is lost.

Class-incremental learning for semantic segmentation addresses this problem in the setting where we want to learn new semantic classes without having access to labeled data for previously learned classes. This is a problem in industry, where few pre-trained models and open datasets match the requirements exactly. In these cases it is both expensive and labour-intensive to collect an entirely new, fully labeled dataset. Instead, collecting a smaller dataset and labeling only the new classes is much more efficient in terms of data collection.

In this paper we present the class-incremental learning problem for semantic segmentation, discuss related work in terms of the more thoroughly studied classification task, and experimentally validate the current state of the art for semantic segmentation. This lays the foundation for our discussion of some of the problems that still need to be investigated and improved upon in order to reach a new state of the art for class-incremental semantic segmentation.

Place, publisher, year, edition, pages
IEEE, 2021. p. 25-28
Keywords [en]
Industries, Deep learning, Conferences, Semantics, Labeling, Task analysis, Artificial intelligence
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-189039
DOI: 10.1109/sais53221.2021.9483955
ISI: 000855522600007
ISBN: 9781665442367 (electronic)
ISBN: 9781665442374 (print)
OAI: oai:DiVA.org:liu-189039
DiVA, id: diva2:1701982
Conference
2021 Swedish Artificial Intelligence Society Workshop (SAIS), 14-15 June 2021, Sweden
Funder
Vinnova
Note

Funding agencies: Vinnova [2020-02838]

Available from: 2022-10-08. Created: 2022-10-08. Last updated: 2023-03-01. Bibliographically approved
In thesis
1. Data-Driven Robot Perception in the Wild
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

As technology continues to advance, interest in relieving humans of tedious or dangerous tasks through automation increases. Some of the tasks that have received increasing attention are autonomous driving, disaster relief, and forestry inspection. Developing and safely deploying an autonomous robotic system in this type of unconstrained environment is highly challenging. The system requires precise control and high-level decision making, both of which require a robust and reliable perception system to understand the surroundings correctly.

The main purpose of perception is to extract meaningful information from the environment, be it in the form of 3D maps, dense classification of object and surface types, or high-level information about the position and direction of moving objects. Depending on the limitations and application of the system, various types of sensors can be used: lidars, to collect sparse depth information; cameras, to collect dense information for different parts of the visual spectrum, often the red-green-blue (RGB) bands; Inertial Measurement Units (IMUs), to estimate the ego motion; microphones, to interact with and respond to humans; and GPS receivers, to get global position information, to mention just a few.

This thesis investigates some of the necessities for approaching the requirements of this type of system. Specifically, it focuses on data-driven approaches, that is, machine learning, which has been shown time and again to be the main contender for high-performance perception tasks in recent years. Although precision requirements might be high in industrial production plants, the environment there is relatively controlled and the task is fixed. Instead, this thesis studies some of the aspects necessary for complex, unconstrained environments, primarily outdoors and potentially near humans or other systems. The term "in the wild" refers exactly to the unconstrained nature of these environments, where the system can easily encounter something previously unseen and where the system might interact with unknowing humans. Some examples of such environments are city traffic, disaster relief scenarios, and dense forests.

This thesis will mainly focus on the following three key aspects necessary to handle the types of tasks and situations that could occur in the wild: 1) generalizing to a new environment, 2) adapting to new tasks and requirements, and 3) modeling uncertainty in the perception system. 

First, a robotic system should be able to generalize to new environments and still function reliably. Papers B and G address this by using an intermediate representation to allow the system to handle much more diverse types of environment than otherwise possible. Paper B also investigates how robust the proposed autonomous driving system was to incorrect predictions, which is one of the likely results of changing the environment. 

Second, a robot should be sufficiently adaptive to allow it to learn new tasks without forgetting the previous ones. Paper E proposed a way to allow incrementally adding new semantic classes to a trained model without access to the previous training data. The approach is based on utilizing the uncertainty in the predictions to model the unknown classes, marked as background. 

Finally, the perception system will always be partially flawed, either because of the lack of modeling capabilities or because of ambiguities in the sensor data. To properly take this into account, it is fundamental that the system has the ability to estimate the certainty in the predictions. Paper F proposed a method for predicting the uncertainty in the model predictions when interpolating sparse data. Paper G addresses the ambiguities that exist when estimating the 3D pose of a human from a single camera image. 

Abstract [sv] (English translation)

As technology advances, interest grows in making life easier for people by automating certain dangerous or strenuous tasks. Some of the areas with potential for automation are: transport, through self-driving cars; rescue work in connection with disasters; and the inventory of forests and similar tasks. These complicated and potentially dangerous environments require advanced decision-making systems and precise control systems. Both require robust and reliable perception of the surroundings.

The main purpose of perception is to extract meaningful information from the surroundings that can facilitate the planning and execution of various types of tasks. This information can take the form of 3D maps, detailed information about the type of terrain, and information about individual objects in the form of their position and motion. An autonomous system can be constructed in several ways, but some of the commonly used sensors are: lidar, to collect sparse 3D measurements of terrain and obstacles; cameras, to collect color or temperature information from objects in the surroundings; IMUs, to estimate how the system moves; and GPS, to position the system outdoors in a global coordinate system.

This thesis investigates some of the components needed to meet the existing requirements on perception. The focus of the thesis is on machine learning, which has been shown to handle many advanced tasks robustly. The thesis does not focus on the high-precision requirements found in industrial manufacturing; instead, the focus is on handling the complicated and challenging environments classed as in the wild. Some examples of such environments are city traffic, disaster areas, and dense forests.

Three aspects of the problem are treated in this thesis: 1) generalization to other environments, 2) adaptation to new tasks and environments, and 3) modeling possible uncertainties.

An autonomous system should preferably not be limited to one type of environment; for example, a self-driving car should not only be able to handle bright sunshine on motorways in good condition. Papers B and G address this to some extent by separating the task into two subproblems, where the first generates input data for the second. Training data for the first subproblem is easier to collect from varied environments, which makes it more general than if only training data for the whole problem were available. Paper B also investigates how sources of error in this representation affect the system as a whole.

An autonomous system should also be designed so that it can be adapted to new tasks efficiently. Paper E investigated this problem from the perspective of extending the set of classes known to the system without retraining it completely.

Finally, one has to accept that perception will never be perfect in all types of environments; there will always be some uncertainty. This uncertainty can come partly from the model itself, but it is also possible that the sensor data is insufficient to determine which of several possibilities is the true one. Paper F designed a system that can estimate the uncertainty in its estimates, while Paper G focuses on how to handle the uncertainty about a person's pose when part of the body is occluded.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2023. p. 45
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2293
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:liu:diva-192087 (URN)
10.3384/9789180750677 (DOI)
9789180750660 (ISBN)
9789180750677 (ISBN)
Public defence
2023-03-31, Ada Lovelace, B-building and online via: https://liu-se.zoom.us/j/63470801417, Campus Valla, Linköping, 09:15 (English)
Note

Funding agencies: the European Union's Horizon 2020 Program; Sweden's Innovation Agency (Vinnova); the Swedish Research Council (VR); and the Swedish Foundation for Strategic Research (SSF).

Available from: 2023-03-01. Created: 2023-03-01. Last updated: 2023-04-26. Bibliographically approved

Authority records

Holmquist, Karl; Klasén, Lena; Felsberg, Michael