With the rise of deep reinforcement learning, there has also been a string of successes on continuous control problems using physics simulators. This has lead to some optimism regarding use in autonomous robots and vehicles. However, to successful apply such techniques to the real world requires a firm grasp of their limitations. As recent work has raised questions of how diverse these simulation benchmarks really are, we here instead analyze a popular deep RL approach on toy examples from robot obstacle avoidance. We find that these converge very slowly, if at all, to safe policies. We identify convergence issues on stochastic environments and local minima as problems that warrant more attention for safety-critical control applications.
A research arena (WARA-PS) for sensing, data fusion, user interaction, planning and control of collaborative autonomous aerial and surface vehicles in public safety applications is presented. The objective is to demonstrate scientific discoveries and to generate new directions for future research on autonomous systems for societal challenges. The enabler is a computational infrastructure with a core system architecture for industrial and academic collaboration. This includes a control and command system together with a framework for planning and executing tasks for unmanned surface vehicles and aerial vehicles. The motivating application for the demonstration is marine search and rescue operations. A state-of-art delegation framework for the mission planning together with three specific applications is also presented. The first one concerns model predictive control for cooperative rendezvous of autonomous unmanned aerial and surface vehicles. The second project is about learning to make safe real-time decisions under uncertainty for autonomous vehicles, and the third one is on robust terrain-aided navigation through sensor fusion and virtual reality tele-operation to support a GPS-free positioning system in marine environments. The research results have been experimentally evaluated and demonstrated to industry and public sector audiences at a marine test facility. It would be most difficult to do experiments on this large scale without the WARA-PS research arena. Furthermore, these demonstrator activities have resulted in effective research dissemination with high public visibility, business impact and new research collaborations between academia and industry.
Reinforcement learning for robot control tasks in continuous environments is a challenging problem due to the dimensionality of the state and action spaces, time and resource costs for learning with a real robot as well as constraints imposed for its safe operation. In this paper we propose a model-based reinforcement learning approach for continuous environments with constraints. The approach combines model-based reinforcement learning with recent advances in approximate optimal control. This results in a bounded-rationality agent that makes decisions in real-time by efficiently solving a sequence of constrained optimization problems on learned sparse Gaussian process models. Such a combination has several advantages. No high-dimensional policy needs to be computed or stored while the learning problem often reduces to a set of lower-dimensional models of the dynamics. In addition, hard constraints can easily be included and objectives can also be changed in real-time to allow for multiple or dynamic tasks. The efficacy of the approach is demonstrated on both an extended cart pole domain and a challenging quadcopter navigation task using real data.
Aerial robots hold great potential for aiding Search and Rescue (SAR) efforts over large areas, such as during natural disasters. Traditional approaches typically search an area exhaustively, thereby ignoring that the density of victims varies based on predictable factors, such as the terrain, population density and the type of disaster. We present a probabilistic model to automate SAR planning, with explicit minimization of the expected time to discovery. The proposed model is a spatial point process with three interacting spatial fields for i) the point patterns of persons in the area, ii) the probability of detecting persons and iii) the probability of injury. This structure allows inclusion of informative priors from e.g. geographic or cell phone traffic data, while falling back to latent Gaussian processes when priors are missing or inaccurate. To solve this problem in real-time, we propose a combination of fast approximate inference using Integrated Nested Laplace Approximation (INLA), and a novel Monte Carlo tree search tailored to the problem. Experiments using data simulated from real world Geographic Information System (GIS) maps show that the framework outperforms competing approaches, finding many more injured in the crucial first hours.
Modern optimization-based approaches to control increasingly allow automatic generation of complex behavior from only a model and an objective. Recent years has seen growing interest in fast solvers to also allow real-time operation on robots, but the computational cost of such trajectory optimization remains prohibitive for many applications. In this paper we examine a novel deep neural network approximation and validate it on a safe navigation problem with a real nano-quadcopter. As the risk of costly failures is a major concern with real robots, we propose a risk-aware resampling technique. Contrary to prior work this active learning approach is easy to use with existing solvers for trajectory optimization, as well as deep learning. We demonstrate the efficacy of the approach on a difficult collision avoidance problem with non-cooperative moving obstacles. Our findings indicate that the resulting neural network approximations are least 50 times faster than the trajectory optimizer while still satisfying the safety requirements. We demonstrate the potential of the approach by implementing a synthesized deep neural network policy on the nano-quadcopter microcontroller.
Robots are increasingly expected to move out of the controlled environment of research labs and into populated streets and workplaces. Collision avoidance in such cluttered and dynamic environments is of increasing importance as robots gain more autonomy. However, efficient avoidance is fundamentally difficult since computing safe trajectories may require considering both dynamics and uncertainty. While heuristics are often used in practice, we take a holistic stochastic trajectory optimization perspective that merges both collision avoidance and control. We examine dynamic obstacles moving without prior coordination, like pedestrians or vehicles. We find that common stochastic simplifications lead to poor approximations when obstacle behavior is difficult to predict. We instead compute efficient approximations by drawing upon techniques from machine learning. We propose to combine policy search with model-predictive control. This allows us to use recent fast constrained model-predictive control solvers, while gaining the stochastic properties of policy-based methods. We exploit recent advances in Bayesian optimization to efficiently solve the resulting probabilistically-constrained policy optimization problems. Finally, we present a real-time implementation of an obstacle avoiding controller for a quadcopter. We demonstrate the results in simulation as well as with real flight experiments.
Recently substantial research has been devoted to Unmanned Aerial Vehicles (UAVs). One of a UAV's most demanding subsystem is vision. The vision subsystem must dynamically combine different algorithms as the UAVs goal and surrounding change. To fully utilize the available hardware, a run time system must be able to vary the quality and the size of regions the algorithms are applied to, as the number of image processing tasks changes. To allow this the run time system and the underlying computational model must be integrated. In this paper we present a computational model suitable for integration with a run time system. The computational model is called Image Processing Data Flow Graph (IP-DFG). IP-DFG has been developed for modeling of complex image processing algorithms. IP-DFG is based on data flow graphs, but has been extended with hierarchy and new rules for token consumption, which makes the computational model more flexible and more suitable for human interaction. In this paper we also show that IP-DFGs are suitable for modelling expressions, including data dependent decisions and iterations, which are common in complex image processing algorithms.
The Association for the Advancement of Artificial Intelligence presented the 2011 Spring Symposium Series Monday through Wednesday, March 21-23, 2011, at Stanford University. This report summarizes the eight symposia.
When unmanned aerial vehicles (UAVs) are used to survey distant targets, it is important to transmit sensor information back to a base station. As this communication often requires high uninterrupted bandwidth, the surveying UAV often needs afree line-of-sight to the base station, which can be problematic in urban or mountainous areas. Communication ranges may also belimited, especially for smaller UAVs. Though both problems can be solved through the use of relay chains consisting of one or more intermediate relay UAVs, this leads to a new problem: Where should relays be placed for optimum performance? We present two new algorithms capable of generating such relay chains, one being a dual ascent algorithm and the other a modification of the Bellman-Ford algorithm. As the priorities between the numberof hops in the relay chain and the cost of the chain may vary, wecalculate chains of different lengths and costs and let the ground operator choose between them. Several different formulations for edge costs are presented. In our test cases, both algorithms are substantially faster than an optimized version of the original Bellman-Ford algorithm, which is used for comparison.
When unmanned aerial vehicles (UAVs) are used for surveillance, information must often be transmitted to a base station in real time. However, limited communication ranges and the common requirement of free line of sight may make direct transmissions from distant targets impossible. This problem can be solved using relay chains consisting of one or more intermediate relay UAVs. This leads to the problem of positioning such relays given known obstacles, while taking into account a possibly mission-specific quality measure. The maximum quality of a chain may depend strongly on the number of UAVs allocated. Therefore, it is desirable to either generate a chain of maximum quality given the available UAVs or allow a choice from a spectrum of Pareto-optimal chains corresponding to different trade-offs between the number of UAVs used and the resulting quality. In this article, we define several problem variations in a continuous three-dimensional setting. We show how sets of Pareto-optimal chains can be generated using graph search and present a new label-correcting algorithm generating such chains significantly more efficiently than the best-known algorithms in the literature. Finally, we present a new dual ascent algorithm with better performance for certain tasks and situations.
We consider a constrained optimization problem with mixed integer and real variables. It models optimal placement of communications relay nodes in the presence of obstacles. This problem is widely encountered, for instance, in robotics, where it is required to survey some target located in one point and convey the gathered information back to a base station located in another point. One or more unmanned aerial or ground vehicles (UAVs or UGVs) can be used for this purpose as communications relays. The decision variables are the number of unmanned vehicles (UVs) and the UV positions. The objective function is assumed to access the placement quality. We suggest one instance of such a function which is more suitable for accessing UAV placement. The constraints are determined by, firstly, a free line of sight requirement for every consecutive pair in the chain and, secondly, a limited communication range. Because of these requirements, our constrained optimization problem is a difficult multi-extremal problem for any fixed number of UVs. Moreover, the feasible set of real variables is typically disjoint. We present an approach that allows us to efficiently find a practically acceptable approximation to a global minimum in the problem of optimal placement of communications relay nodes. It is based on a spatial discretization with a subsequent reduction to a shortest path problem. The case of a restricted number of available UVs is also considered here. We introduce two label correcting algorithms which are able to take advantage of using some peculiarities of the resulting restricted shortest path problem. The algorithms produce a Pareto solution to the two-objective problem of minimizing the path cost and the number of hops. We justify their correctness. The presented results of numerical 3D experiments show that our algorithms are superior to the conventional Bellman-Ford algorithm tailored to solving this problem.
We consider the directed Steiner tree problem (DSTP) with a constraint on the total number of arcs (hops) in the tree. This problem is known to be NP-hard, and therefore, only heuristics can be applied in the case of its large-scale instances. For the hop-constrained DSTP, we propose local search strategies aimed at improving any heuristically produced initial Steiner tree. They are based on solving a sequence of hop-constrained shortest path problems for which we have recently developed ecient label correcting algorithms. The presented approach is applied to nding suitable 3D locations where unmanned aerial vehicles (UAVs) can be placed to relay information gathered in multi-target monitoring and surveillance. The eciency of our algorithms is illustrated by results of numerical experiments involving problem instances with up to 40 000 nodes and up to 20 million arcs.
We consider the directed Steiner tree problem (DSTP) with a constraint on the total number of arcs (hops) in the tree. This problem is known to be NP-hard, and therefore, only heuristics can be applied in the case of its large-scale instances.For the hop-constrained DSTP, we propose local search strategies aimed at improving any heuristically produced initial Steiner tree. They are based on solving a sequence of hop-constrained shortest path problems for which we have recently developed efficient label correcting algorithms.The presented approach is applied to finding suitable 3D locations where unmanned aerial vehicles (UAVs) can be placed to relay information gathered in multi-target monitoring and surveillance. The efficiency of our algorithms is illustrated by results of numerical experiments involving problem instances with up to 40 000 nodes and up to 20 million arcs.
Guarding the perimeter of an area in order to detect potential intruders is an important task in a variety of security-related applications. This task can in many circumstances be performed by a set of camera-equipped unmanned aerial vehicles (UAVs). Such UAVs will occasionally require refueling or recharging, in which case they must temporarily be replaced by other UAVs in order to maintain complete surveillance of the perimeter. In this paper we consider the problem of scheduling such replacements. We present optimal replacement strategies and justify their optimality.
The aim of this paper is to explore the possibility of using geo-referenced satellite or aerial images to augment an Unmanned Aerial Vehicle (UAV) navigation system in case of GPS failure. A vision based navigation system which combines inertial sensors, visual odometer and registration of a UAV on-board video to a given geo-referenced aerial image has been developed and tested on real flight-test data. The experimental results show that it is possible to extract useful position information from aerial imagery even when the UAV is flying at low altitude. It is shown that such information can be used in an automated way to compensate the drift of the UAV state estimation which occurs when only inertial sensors and visual odometer are used.
This paper investigates the possibility of augmenting an Unmanned Aerial Vehicle (UAV) navigation system with a passive video camera in order to cope with long-term GPS outages. The paper proposes a vision-based navigation architecture which combines inertial sensors, visual odometry, and registration of the on-board video to a geo-referenced aerial image. The vision-aided navigation system developed is capable of providing high-rate and drift-free state estimation for UAV autonomous navigation without the GPS system. Due to the use of image-to-map registration for absolute position calculation, drift-free position performance depends on the structural characteristics of the terrain. Experimental evaluation of the approach based on offline flight data is provided. In addition the architecture proposed has been implemented on-board an experimental UAV helicopter platform and tested during vision-based autonomous flights.
This paper presents a method for high accuracy ground target localization using a Micro Aerial Vehicle (MAV) equipped with a video camera sensor. The proposed method is based on a satellite or aerial image registration technique. The target geo-location is calculated by registering the ground target image taken from an on-board video camera with a geo- referenced satellite image. This method does not require accurate knowledge of the aircraft position and attitude, therefore it is especially suitable for MAV platforms which do not have the capability to carry accurate sensors due to their limited payload weight and power resources. The paper presents results of a ground target geo-location experiment based on an image registration technique. The platform used is a MAV prototype which won the 3rd US-European Micro Aerial Vehicle Competition (MAV07). In the experiment a ground object was localized with an accuracy of 2.3 meters from a ight altitude of 70 meters.
The paper presents a light-weight and low-cost airborne terrain mapping system. The developed Airborne LiDAR Scanner (ALS) sys- tem consists of a high-precision GNSS receiver, an inertial measurement unit and a magnetic compass which are used to complement a LiDAR sensor in order to compute the terrain model. Evaluation of the accuracy of the generated 3D model is presented. Additionally, a comparison is provided between the terrain model generated from the developed ALS system and a model generated using a commer- cial photogrammetric software. Finally, the multi-echo capability of the used LiDAR sensor is evaluated in areas covered with dense vegetation. The ALS system and camera systems were mounted on-board an industrial unmanned helicopter of around 100 kilograms maximum take-off weight. Presented results are based on real flight-test data.
This paper presents a comparison of two light-weight and low-cost airborne mapping systems. One is based on a lidar technology and the other on a video camera. The airborne lidar system consists of a high-precision global navigation satellite system (GNSS) receiver, a microelectromechanical system (MEMS) inertial measurement unit, a magnetic compass and a low-cost lidar scanner. The vision system is based on a consumer grade video camera. A commercial photogrammetric software package is used to process the acquired images and generate a digital surface model. The two systems are described and compared in terms of hardware requirements and data processing. The systems are also tested and compared with respect to their application on board of an unmanned aerial vehicle (UAV). An evaluation of the accuracy of the two systems is presented. Additionally, the multi echo capability of the lidar sensor is evaluated in a test site covered with dense vegetation. The lidar and the camera systems were mounted and tested on-board an industrial unmanned helicopter with maximum take-off weight of around 100 kilograms. The presented results are based on real flight-test data.
Micro unmanned aerial vehicles are becoming increasingly interesting for aiding and collaborating with human agents in myriads of applications, but in particular they are useful for monitoring inaccessible or dangerous areas. In order to interact with and monitor humans, these systems need robust and real-time computer vision subsystems that allow to detect and follow persons.
In this work, we propose a low-level active vision framework to accomplish these challenging tasks. Based on the LinkQuad platform, we present a system study that implements the detection and tracking of people under fully autonomous flight conditions, keeping the vehicle within a certain distance of a person. The framework integrates state-of-the-art methods from visual detection and tracking, Bayesian filtering, and AI-based control. The results from our experiments clearly suggest that the proposed framework performs real-time detection and tracking of persons in complex scenarios
The subject of this thesis is the formalization of a type of non-monotonic reasoning using a three-valued logic based on the strong definitions of Kleene. Non-monotonic reasoning is the rule rather than the exception when agents, human or machine, must act where information about the environment is uncertain or incomplete. Information about the environment is subject to change due to external causes, or may simply become outdated. This implies that inferences previously made may no longer hold and in turn must be retracted along with the revision of other information dependent on the retractions. This is the variety of reasoning we would like to find formal models for.
We start by extending Kleene-s three-valued logic with an "external negation" connective where ~ a is true when a is false or unknown. In addition, a default operator D is added where D a is interpreted as "a is true by default. The addition of the default operator increases the expressivity of the language, where statements such as "a is not a default" are directly representable. The logic has an intuitive model theoretic semantics without any appeal to the use of a fixpoint semantics for the default operator. The semantics is based on the notion of preferential entailment, where a set of sentences G preferentially entails a sentence a, if and only if a preferred set of the models of G are models of a. We also show that one version of the logic belongs to the class of cumulative non-monotonic formalisms which are a subject of current interest.
A decision procedure for the propositional case, based on the semantic tableaux proof method is described and serves as a basis for a QA-system where it can be determined if a sentence a is preferentially entailed by a set of premises G. The procedure is implemented.
The emerging area of intelligent unmanned aerial vehicle (UAV) research has shown rapid development in recent years and offers a great number of research challenges for artificial intelligence and knowledge representation. For both military and civilian applications, there is a desire to develop more sophisticated UAV platforms where the emphasis is placed on intelligent capabilities and their integration in complex distributed software architectures. Such architectures should support the integration of deliberative, reactive and control functionalities in addition to the UAV’s integration with larger network centric systems. In my talk I will present some of the research and results from a long term basic research project with UAVs currently being pursued at Linköping University, Sweden. The talk will focus on knowledge representation techniques used in the project and the support for these techniques provided by the software architecture developed for our UAV platform, a Yamaha RMAX helicopter. Additional focus will be placed on some of the planning and execution monitoring functionality developed for our applications in the areas of traffic monitoring, surveying and photogrammetry and emergency services assistance.
Knowledge representation technologies play a fundamental role in any autonomous system that includes deliberative capability and that internalizes models of its internal and external environments. Integrating both high- and low-end autonomous functionality seamlessly in autonomous architectures is currently one of the major open problems in robotics research. UAVs offer especially difficult challenges in comparison with ground robotic systems due to the often tight time constraints and safety considerations that must be taken into account. This article provides an overview of some of the knowledge representation technologies and deliberative capabilities developed for a fully deployed autonomous unmanned aerial vehicle system to meet some of these challenges.
In this paper, we propose a formalization of non-monotonic reasoning using a three-valued logic based on the strong definitions of Kleene. We start by extending Kleene's three-valued logic with an "external negation" connective where ~ alpha is true when alpha is false or unknown. In addition, a default operator D is added where D alpha is interpreted as "alpha is true by default". The addition of the default operator increases the expressivity of the language, where statements such as "alpha is not a default" are directly representable. The logic has an intuitive model theoretic semantics without any appeal to the use of a fixpoint semantics for the default operator. The semantics is based on the notion of preferential entailment, where a set of sentences Gamma preferentially entails a sentence alpha, if and only if a preferred set of the models of Gamma are models of alpha. We also show that the logic belongs to the class of cumulative non-monotonic formalisms which are a subject of current interest.
The thesis is a study of a particular approach to defeasible reasoning based on the notion of an information state consisting of a set of partial interpretations constrained by an information ordering. The formalism proposed, called NML3, is a non-monotonic logic with explicit defaults and is characterized by the following features: (1) The use of the strong Kleene three-valued logic as a basis. (2) The addition of an explicit default operator which enables distinguishing tentative conclusions from ordinary conclusions in the object language. (3) The use of the technique of preferential entailment to generate non-monotonic behavior. The central feature of the formalism, the use of an explicit default operator with a model theoretic semantics based on the notion of a partial interpretation, distinguishes NML3 from the existing formalisms. By capitalizing on the distinction between tentative and ordinary conclusions, NML3 provides increased expressibility in comparison to many of the standard non-monotonic formalisms and greater flexibility in the representation of subtle aspects of default reasoning.
In addition to NML3, a novel extension of the tableau-based proof technique is presented where a signed formula is tagged with a set of truth values rather than a single truth value. This is useful if the tableau-based proof technique is to be generalized to apply to the class of multi-valued logics. A refutation proof procedure may then be used to check logical consequence for the base logic used in NML3 and to provide a decision procedure for the propositional case of NML3.
A survey of a number of non-standard logics used in knowledge representation is also provided. Various formalisms are analyzed in terms of persistence properties of formulas and their use of information structures.
This report describes the current state of work with PMON, a logic for reasoning about action and change, and its extensions. PMON has been assessed correct for the K-IA class using Sandewall's Features and Fluents framework which provides tools for assessing the correctness of logics of action and change. A syntactic characterization of PMON has previously been provided in terms of a circumscription axiom which is shown to be reducible to a first-order formula. This report introduces a number of new extensions which are also reducible and deal with ramification. The report is intended to provide a formal specification for the PMON family of logics and the surface language L(SD) used to represent action scenario descriptions. It should be considered a working draft. The title of the report has a version number because both the languages and logics used are continually evolving. Since this document is intended as a formal specification which is used by our group as a reference for research and implementation, it is understandably brief as regards intuitions and applications of the languages and logics defined. We do provide a set of benchmarks and comments concerning these which can serve as a means of comparing this formalism with others. The set of benchmarks is not complete and is only intended to provide representative examples of the expressivity and use of this particular family of logics. We describe its features and limitations in other publications by our group which can normally be found at "http://www.ida.liu.se/labs/kplab/".
This report describes the current state of work with PMON, a logic for reasoning about actions and change, and its extensions. PMON has been assessed correct for the K-IA class using Sandewall's Features and Fluents framework which provides tools for assessing the correctness of logics of action and change. A syntactic characterization of PMON has previously been provided in terms of a circumscription axiom which is shown to be reducible to a first-order formula. This report introduces a number of new extensions which are also reducible and deal with ramification. This report is intended to provide a formal specification of the PMON family of logics and the surface language L(SD) used to represent action scenario descriptions. It should be considered a working draft. The title of the report has a version number because both the languages and logics are continually evolving. Since this document is intended as a formal specification which is used by our group as a reference for research and implementation, it is understandably brief as regards intuitions and applications of the language and logics defined. We do provide a set of benchmarks and comments concerning these which can serve as a means of comparing this formalism with others. The set of benchmarks is not complete and is only intended to provide representative examples of the expressivity and use of this particular family of logics. We describe its features and limitations in other publications by our group which can normally be found at http://www.ida.liu.se/labs/kplab/.
The American Association for Artificial Intelligence, in cooperation with Stanford University’s Department of Computer Science, presented the 2003 Spring Symposium Series, Monday through Wednesday, 24–26 March 2003, at Stanford University. The titles of the eight symposia were Agent-Mediated Knowledge Management, Computational Synthesis: From Basic Building Blocks to High- Level Functions, Foundations and Applications of Spatiotemporal Reasoning (FASTR), Human Interaction with Autonomous Systems in Complex Environments, Intelligent Multimedia Knowledge Management, Logical Formalization of Commonsense Reasoning, Natural Language Generation in Spoken and Written Dialogue, and New Directions in Question-Answering Motivation.
In the context of collaborative robotics, distributed situation awareness is essential for supporting collective intelligence in teams of robots and human agents where it can be used for both individual and collective decision support. This is particularly important in applications pertaining to emergency rescue and crisis management. During operational missions, data and knowledge are gathered incrementally and in different ways by heterogeneous robots and humans. We describe this as the creation of Hastily Formed Knowledge Networks (HFKNs). The focus of this paper is the specification and prototyping of a general distributed system architecture that supports the creation of HFKNs by teams of robots and humans. The information collected ranges from low-level sensor data to high-level semantic knowledge, the latter represented in part as RDF Graphs. The framework includes a synchronization protocol and associated algorithms that allow for the automatic distribution and sharing of data and knowledge between agents. This is done through the distributed synchronization of RDF Graphs shared between agents. High-level semantic queries specified in SPARQL can be used by robots and humans alike to acquire both knowledge and data content from team members. The system is empirically validated and complexity results of the proposed algorithms are provided. Additionally, a field robotics case study is described, where a 3D mapping mission has been executed using several UAVs in a collaborative emergency rescue scenario while using the full HFKN Framework.
We consider the possibility of generalizing the notion of a fuzzy If-Then rule to take into account its context dependent nature. We interpret fuzzy rules as modeling a forward directed causal relationship between the antecedent and the conclusion, which applies in most contexts, but on occasion breaks down in exceptional contexts. The default nature of the rule is modeled by augmenting the original If-Then rule with an exception part. We then consider the proper semantic correlate to such an addition and propose a ternary relation which satisfies a number of intuitive constraints described in terms of a number of inference rules. In the rest of the paper, we consider implementational issues arising from the unless extension and propose the use of reason maintenance systems, in particular TMS's, where a fuzzy If-Then-Unless rule is encoded into a dependency net. We verify that the net satisfies the constraints stated in the inference schemes and conclude with a discussion concerning the integration of qualitative IN-OUT labelings of the TMS with quantitative degree of membership labelings for the variables in question.