Regressor selection can be viewed as the first step in the system identification process. The benefits of finding good regressors before estimating complex models are especially clear for nonlinear systems, where the class of possible models is huge. In this article, a structured way of using analysis of variance (ANOVA) is presented and applied to NARX model (nonlinear autoregressive model with exogenous input) identification with many candidate regressors.
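As a minimal illustration of the idea behind ANOVA-based regressor selection (a pure-Python sketch; the data and function name are invented for this example, not taken from the article), a one-way ANOVA F statistic can score how strongly a candidate regressor's level affects the output:

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of sample groups.

    A large F suggests the grouping variable (here: a candidate
    regressor's level) affects the output mean, i.e. the regressor
    is informative.
    """
    k = len(groups)                              # number of levels
    n = sum(len(g) for g in groups)              # total sample count
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # between-group and within-group sums of squares
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ssb / (k - 1)) / (ssw / (n - k))

# Toy check: in the first data set the output clearly depends on the level,
# in the second it does not.
informative = anova_f([[1.0, 1.1, 0.9], [3.0, 3.2, 2.9]])
uninformative = anova_f([[1.0, 1.2, 0.8], [1.1, 0.9, 1.0]])
```

Comparing F values (or the corresponding p-values) across candidate regressors gives a structure decision that is decoupled from fitting any particular nonlinear model.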
Identification of non-linear dynamical models of a black box nature involves both structure decisions, i.e., which regressors to use, the selection of a regressor function, and the estimation of the parameters involved. The typical approach in system identification seems to be to mix all these steps, which for example means that the selection of regressors is based on the fit that is achieved for different choices. Alternatively, one could then interpret the regressor selection as based on hypothesis tests (F-tests) at a certain confidence level that depends on the data. It would in many cases be desirable to decide which regressors to use independently of the other steps. In this paper we investigate what the well-known method of analysis of variance (ANOVA) can offer for this problem. System identification applications violate many of the ideal conditions for which ANOVA was designed, and we study how the method performs under such non-ideal conditions. ANOVA is much faster than a typical parametric estimation method, using e.g. neural networks. It is actually also more reliable, in our tests, in picking the correct structure even under non-ideal conditions. One reason for this may be that ANOVA requires the data set to be balanced, that is, all parts of the regressor space are weighted equally. Just applying tests of fit to the recorded data may, for structure identification, give improper weight to areas with many, or few, samples.
We consider the situation where a non-linear physical system is identified from input-output data. In case no specific physical structural knowledge about the system is available, parameterized grey-box models cannot be used. Identification in black-box type of model structures is then the only alternative, and general approaches like neural nets, neuro-fuzzy models, etc., have to be applied. However, certain non-structural knowledge about the system is sometimes available. It could be known, e.g., that the step response is monotonic, or that the steady-state gain curve is monotonic. The main question is then how to utilize and maintain such information in an otherwise black-box framework. In this paper we show how this can be done, by applying a specific fuzzy model structure, with strict parametric constraints. The usefulness of the approach is illustrated by experiments on real-world data.
We present a novel method for Wiener system identification. The method relies on a semiparametric, i.e. a mixed parametric/nonparametric, model of a Wiener system. We use a state-space model for the linear dynamical system and a nonparametric Gaussian process model for the static nonlinearity. We avoid making strong assumptions, such as monotonicity, on the nonlinear mapping. Stochastic disturbances, entering both as measurement noise and as process noise, are handled in a systematic manner. The nonparametric nature of the Gaussian process allows us to handle a wide range of nonlinearities without making problem-specific parameterizations. We also consider sparsity-promoting priors, based on generalized hyperbolic distributions, to automatically infer the order of the underlying dynamical system. We derive an inference algorithm based on an efficient particle Markov chain Monte Carlo method, referred to as particle Gibbs with ancestor sampling. The method is profiled on two challenging identification problems with good results. Blind Wiener system identification is handled as a special case.
A general class of algorithms for recursive identification of (stochastic) dynamical systems is studied. In this class, the discrepancy between the measured output and the output, predicted from previous data according to a candidate model (‘the prediction error’) is minimized over the model set using a stochastic approximation approach. It is proved that this class of methods has the same convergence properties as its off-line counterparts under mild and general assumptions. These assumptions do not, for example, include stationary conditions or conditions that the true system can be exactly represented within the model set.
The considered class of methods contains as special cases several well-known algorithms. Other common recursive identification methods, like the extended least squares method and the extended Kalman filter, can be interpreted as approximate prediction error methods with simplified gradient calculations. Therefore, the approach taken here may also serve as a basis for a unified description of many recursive identification methods.
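As a hedged illustration of one special case mentioned above, recursive least squares can be seen as a prediction-error recursion for the scalar linear regression y_t = θu_t + e_t (the model, data, and constants below are invented for this sketch, not taken from the papers):

```python
import random

random.seed(0)
theta_true = 2.0
theta, p = 0.0, 100.0                   # initial estimate and "covariance"

for _ in range(200):
    u = random.choice([-1.0, 1.0])      # persistently exciting input
    y = theta_true * u + random.gauss(0.0, 0.05)
    eps = y - theta * u                 # prediction error
    k = p * u / (1.0 + p * u * u)       # gain
    theta += k * eps                    # parameter update driven by eps
    p -= k * u * p                      # "covariance" update
```

The recursion minimizes the accumulated squared prediction error; replacing the exact gain `k` by simplified approximations gives schemes of the extended-least-squares type.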
Members of IFAC's Technical Committee on Theory prepared this review of papers published during 1984–1986 for the participants of the 10th IFAC World Congress in Munich. The review is limited to the scope of the Theory Committee. The idea is to give a status report on Control Theory and to give some guidance into the extensive literature on this topic.
This paper treats the close conceptual relationships between basic approaches to the estimation of transfer functions of linear systems. The classical methods of frequency and spectral analysis are shown to be related to the well-known time domain methods of prediction error type via a common “empirical transfer function estimate”. Asymptotic properties of the estimates obtained by the respective methods are also described and discussed. An important feature that is displayed by this treatment is a frequency domain weighting function that determines the distribution of bias in case the true system cannot be exactly described within the chosen model set. The choice of this weighting function is made in terms of noise models for time-domain methods. The noise model thus has a dual character from the system approximation point of view.
Using data from extensive vibrational tests of the new Saab 2000 aircraft, a combined method for vibration analysis is studied. The method is based on a realization algorithm followed by standard prediction error methods (PEM). We find that the realization algorithm gives good initial model parameter estimates that can be further improved by the use of PEM. We use the method to get insights into the vibrational eigenmodes.
A new identification algorithm which identifies low complexity models of infinite-dimensional systems from equidistant frequency-response data is presented. The new algorithm is a combination of the Fourier transform technique with the recent subspace techniques. Given noise-free data, finite-dimensional systems are exactly retrieved by the algorithm. When noise is present, it is shown that identified models strongly converge to the balanced truncation of the identified system if the measurement errors are covariance bounded. Several conditions are derived on consistency, illustrating the trade-offs in the selection of certain parameters of the algorithm. Two examples are presented which clearly illustrate the good performance of the algorithm.
Identification of systems operating in closed loop has long been of prime interest in industrial applications. The problem offers many possibilities, and also some fallacies, and a wide variety of approaches have been suggested, many quite recently. The purpose of the current contribution is to place most of these approaches in a coherent framework, thereby showing their connections and displaying similarities and differences in the asymptotic properties of the resulting estimates. The common framework is created by the basic prediction error method, and it is shown that most of the common methods correspond to different parameterizations of the dynamics and noise models. The so-called indirect methods, e.g., are indeed “direct” methods employing noise models that contain the regulator. The asymptotic properties of the estimates then follow from the general theory and take different forms as they are translated to the particular parameterizations. We also study a new projection approach to closed-loop identification with the advantage of allowing approximation of the open-loop dynamics in a given, user-chosen frequency-domain norm, even in the case of an unknown, nonlinear regulator.
It is a fundamental problem of identification to be able—even before the data have been analyzed—to decide if all the free parameters of a model structure can be uniquely recovered from data. This is the issue of global identifiability. In this contribution we show how global identifiability for an arbitrary model structure (basically with analytic non-linearities) can be analyzed using concepts and algorithms from differential algebra. It is shown how the question of global structural identifiability is reduced to the question of whether the given model structure can be rearranged as a linear regression. An explicit algorithm to test this is also given. Furthermore, the question of ‘persistent excitation’ for the input can also be tested explicitly in a similar fashion. The algorithms involved are very well suited for implementation in computer algebra. One such implementation is also described.
The basic techniques of time domain and frequency domain identification, including the maximum entropy methods, are outlined. Then connections and distinctions between the methods are explored. This includes the derivation of some analytic relationships together with a discussion of the restrictions inherent in choosing certain methods, and their ease of use in different experimental conditions. It is concluded that these are complementary rather than competing techniques.
To track the time-varying dynamics of a system or the time-varying properties of a signal is a fundamental problem in control and signal processing. Many approaches to derive such adaptation algorithms and to analyse their behaviour have been taken. This article gives a survey of basic techniques to derive and analyse algorithms for tracking time-varying systems. Special attention is paid to the study of how different assumptions about the true system's variations affect the algorithm. Several explicit and semi-explicit expressions for the mean square error are derived, which clearly demonstrate the character of the trade-off between tracking ability and noise rejection.
The standard continuous time state space model with stochastic disturbances contains the mathematical abstraction of continuous time white noise. To work with well defined, discrete time observations, it is necessary to sample the model with care. The basic issues are well known, and have been discussed in the literature. However, the consequences have not quite penetrated the practice of estimation and identification. One example is that the standard model of an observation, being a snapshot of the current state plus noise independent of the state, cannot be reconciled with this picture. Another is that estimation and identification of continuous-time models require a more careful treatment of the sampling formulas. We discuss and illustrate these issues in the current contribution. An application of particular practical importance is the estimation of models based on irregularly sampled observations.
The numerical properties of implementations of the recursive least-squares identification algorithm are of great importance for their continued use in various adaptive schemes. Here we investigate how an error that is introduced at an arbitrary point in the algorithm propagates. It is shown that conventional LS algorithms, including Bierman's UD-factorization algorithm, are exponentially stable with respect to such errors, i.e. the effect of the error decays exponentially. The base of the decay is equal to the forgetting factor. The same is true for fast lattice algorithms. The fast least-squares algorithm, sometimes known as the ‘fast Kalman algorithm’, is however shown to be unstable with respect to such errors.
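The claimed decay can be illustrated on a scalar RLS recursion with forgetting factor `lam` (a simplified sketch with invented data, not the algorithms analyzed in the paper): two runs on identical data, one with an error injected into the parameter estimate, and the gap between them shrinks geometrically with base approximately equal to the forgetting factor.

```python
import random

random.seed(1)
lam = 0.95                                    # forgetting factor
data = []
for _ in range(150):
    u = random.choice([-1.0, 1.0])            # persistently exciting input
    data.append((u, 2.0 * u + random.gauss(0.0, 0.05)))

def rls(perturb_at=None):
    """Scalar RLS with forgetting; optionally inject an error into theta."""
    theta, p = 0.0, 100.0
    traj = []
    for t, (u, y) in enumerate(data):
        if t == perturb_at:
            theta += 1.0                      # injected numerical error
        k = p * u / (lam + p * u * u)
        theta += k * (y - theta * u)
        p = (p - k * u * p) / lam
        traj.append(theta)
    return traj

# gap between the clean run and the perturbed run, step by step
gap = [abs(a - b) for a, b in zip(rls(), rls(perturb_at=100))]
```

Since both runs see the same data, the gap evolves deterministically and contracts by roughly the factor `lam` per step once the gain has reached steady state.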
The framework of differential algebra, especially Ritt's algorithm, has turned out to be a useful tool when analyzing the identifiability of certain nonlinear continuous-time model structures. This framework provides conceptually interesting means to analyze complex nonlinear model structures via the much simpler linear regression models. One difficulty when working with continuous-time signals is dealing with white noise in nonlinear systems. In this paper, difference algebraic techniques, which mimic the differential-algebraic techniques, are presented. Besides making it possible to analyze discrete-time model structures, this opens up the possibility of dealing with noise. Unfortunately, the corresponding discrete-time identifiability results are not as conclusive as in continuous time. In addition, an alternative elimination scheme to Ritt's algorithm will be formalized, and the resulting algorithm is analyzed when applied to a special form of the NFIR model structure.
In Ding et al. [A unified approach for circularity and spatial straightness evaluation using semi-definite programming, International Journal of Machine Tools & Manufacture 47(10) (2007) 1646–1650], the authors advocate semidefinite programming-based relaxations of quadratic optimization problems as a vehicle to solve two circularity and straightness evaluation problems. The purpose of this comment is to point out that the use of semidefinite relaxations for the problems at hand is redundant, since the problems either are convex or can be converted into convex problems. We also take the opportunity to clarify some properties of the semidefinite relaxation, were it to be used for an actual nonconvex problem in this area.
One of the most fundamental problems in model predictive control (MPC) is the lack of guaranteed stability and feasibility. It is shown how Farkas' lemma in combination with bilevel programming and disjoint bilinear programming can be used to search for problematic initial states which lack recursive feasibility, thus invalidating a particular MPC controller. Alternatively, the method can be used to derive a certificate that the problem is recursively feasible. The results are initially derived for nominal linear MPC, and thereafter extended to the additive disturbance case.
In this paper we introduce a new parametrization for state-space systems: data driven local coordinates (DDLC). The parametrization is obtained by restricting the full state-space parametrization, where all matrix entries are considered to be free, to an affine plane containing a given nominal state-space realization. This affine plane is chosen to be perpendicular to the tangent space to the manifold of observationally equivalent state-space systems at the nominal realization. The application of the parametrization to prediction error identification is exemplified. Simulations indicate that the proposed parametrization has numerical advantages as compared to e.g. the more commonly used observable canonical form.
Input design is an important issue for classical system identification methods but has not been investigated for the kernel-based regularization method (KRM) until very recently. In this paper, we consider the input design problem of KRMs for LTI system identification. In contrast to the recent result, we adopt a Bayesian perspective and in particular make use of scalar measures (e.g., the A-optimality, D-optimality, and E-optimality) of the Bayesian mean square error matrix as the design criteria subject to a power constraint on the input. Instead of solving the optimization problem directly, we propose a two-step procedure. In the first step, by making suitable assumptions on the unknown input, we construct a quadratic map (transformation) of the input such that the transformed input design problems are convex, and the global minima of the transformed input design problem can thus be found efficiently by applying well-developed convex optimization software packages. In the second step, we derive the characterization of the optimal input based on the global minima found in the first step by solving the inverse image of the quadratic map. In addition, we derive analytic results for some special types of kernels, which provide insights into the input design and its dependence on the kernel structure.
The kernel-based regularization method has two core issues: kernel design and hyperparameter estimation. In this paper, we focus on the second issue and study the properties of several hyperparameter estimators including the empirical Bayes (EB) estimator, two Stein's unbiased risk estimators (SURE) (one related to impulse response reconstruction and the other related to output prediction) and their corresponding Oracle counterparts, with an emphasis on the asymptotic properties of these hyperparameter estimators. To this end, we first derive and then rewrite the first order optimality conditions of these hyperparameter estimators, leading to several insights on these hyperparameter estimators. Then we show that as the number of data goes to infinity, the two SUREs converge to the best hyperparameter minimizing the corresponding mean square error, respectively, while the more widely used EB estimator converges to another best hyperparameter minimizing the expectation of the EB estimation criterion. This indicates that the two SUREs are asymptotically optimal in the corresponding MSE senses but the EB estimator is not. Surprisingly, the convergence rate of the two SUREs is slower than that of the EB estimator, and moreover, unlike the two SUREs, the EB estimator is independent of the convergence rate of ΦᵀΦ/N to its limit, where Φ is the regression matrix and N is the number of data. A Monte Carlo simulation is provided to demonstrate the theoretical results.
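To fix ideas, a hedged sketch in the standard notation of the kernel-based regularization literature (the symbols Y, Φ, P(η), σ² are assumptions introduced here, not taken from the abstract): the regularized estimate and the EB hyperparameter criterion can be written as

```latex
\hat{\theta}(\eta) = \arg\min_{\theta}\; \|Y - \Phi\theta\|^{2}
      + \sigma^{2}\,\theta^{\top}P(\eta)^{-1}\theta ,
\qquad
\Sigma(\eta) = \Phi P(\eta)\Phi^{\top} + \sigma^{2} I ,
\\[4pt]
\hat{\eta}_{\mathrm{EB}} = \arg\min_{\eta}\; Y^{\top}\Sigma(\eta)^{-1}Y
      + \log\det\Sigma(\eta) .
```

The SUREs instead minimize unbiased estimates of the corresponding mean square errors, which is why they, unlike EB, are asymptotically optimal in those MSE senses.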
The sequence of estimates formed by the LMS algorithm for a standard linear regression estimation problem is considered. It is known from earlier work that smoothing these estimates by simple averaging leads, asymptotically, to the recursive least-squares algorithm. In this paper, it is first shown that smoothing the LMS estimates using a matrix updating leads to smoothed estimates with optimal tracking properties, also when the true parameters are slowly changing as a random walk. The choice of smoothing matrix should be tailored to the properties of the random walk. Second, it is shown that the same accuracy can be obtained with a modified algorithm, SLAMS, which is based on averages and requires much less computation.
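The earlier-known result mentioned in the abstract, that simple averaging of LMS estimates sharpens them, can be sketched on a scalar regression (invented data and constants; this illustrates plain averaging, not the matrix smoothing or SLAMS of the paper):

```python
import random

random.seed(2)
theta_true, mu = 1.5, 0.05          # true parameter and LMS step size
theta, traj = 0.0, []
for _ in range(2000):
    u = random.choice([-1.0, 1.0])
    y = theta_true * u + random.gauss(0.0, 0.5)
    theta += mu * u * (y - theta * u)   # LMS update
    traj.append(theta)

burn = 100                           # discard the initial transient
theta_avg = sum(traj[burn:]) / len(traj[burn:])
```

The raw LMS estimate keeps fluctuating at a level set by the step size, while the time average `theta_avg` is much closer to the true parameter.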
This paper presents a disturbance decoupled fault reconstruction (DDFR) scheme using cascaded sliding mode observers (SMOs). The processed signals from an SMO are found to be the output of a fictitious system which treats the faults and disturbances as inputs; the "outputs" are then fed into the next SMO. This process is repeated until the attainment of a fictitious system which satisfies the conditions that guarantee DDFR. It is found that this scheme is less restrictive and enables DDFR for a wider class of systems compared to previous work, where only one or two SMOs were used. This paper also presents a systematic routine to check for the feasibility of the scheme, to calculate the required number of SMOs from the outset, and to design the DDFR scheme. A design example verifies its effectiveness.
We consider probabilistic methods for detecting conflicts as a function of predicted trajectory. A conflict is an event representing collision or imminent collision between vehicles or objects. The computations use state estimate and covariance from a target tracking filter based on sensor readings. Existing work is primarily concerned with risk estimation at a certain time instant, while the focus here is to compute the integrated risk over the critical time horizon. This novel formulation leads to evaluating the probability for level-crossing. The analytic expression involves a multi-dimensional integral which is hardly tractable in practice. Further, a huge number of Monte Carlo simulations would be needed to get sufficient reliability for the small risks that the applications often require. Instead, we propose a sound numerical approximation that leads to evaluating a one-dimensional integral which is suitable for real-time implementations.
The convergence properties of causal and current iteration tracking error (CITE) discrete time iterative learning control (ILC) algorithms are studied using time and frequency domain convergence criteria. Of particular interest are conditions for monotone convergence, and these are evaluated using a discrete-time version of Bode's integral theorem.
Anomaly detection in large populations is a challenging but highly relevant problem. It is essentially a multi-hypothesis problem, with a hypothesis for every division of the systems into normal and anomalous systems. The number of hypotheses grows rapidly with the number of systems, and approximate solutions become a necessity for any problem of practical interest. In this paper we take an optimization approach to this multi-hypothesis problem. It is first shown to be equivalent to a non-convex combinatorial optimization problem and then relaxed to a convex optimization problem that can be solved distributively on the systems and that stays computationally tractable as the number of systems increases. An interesting property of the proposed method is that it can, under certain conditions, be shown to give exactly the same result as the combinatorial multi-hypothesis problem; the relaxation is hence tight.
Abrupt changes, such as impulsive and load disturbances, commonly occur in applications and make the state estimation problem considerably more difficult than in the standard setting with Gaussian process disturbance. Abrupt changes often introduce a jump in the state, and the problem is therefore readily and often treated by change detection techniques. In this paper, we take a different approach. The state smoothing problem for linear state space models is here formulated as a constrained least-squares problem with sum-of-norms regularization, a generalization of ℓ1-regularization. This novel formulation can be seen as a convex relaxation of the well-known generalized likelihood ratio method by Willsky and Jones. Another nice property of the suggested formulation is that it has only one tuning parameter, the regularization constant, which is used to trade off fit against the number of jumps. Good practical choices of this parameter along with an extension to nonlinear state space models are given.
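Schematically, and as a hedged sketch with notation assumed here rather than taken from the abstract, the sum-of-norms smoothing criterion for a linear state space model with output matrix C, dynamics A, and jump inputs v_t can be written as

```latex
\min_{x_{1:N},\; v_{1:N-1}} \;\sum_{t=1}^{N} \|y_t - C x_t\|_2^2
  \;+\; \lambda \sum_{t=1}^{N-1} \|v_t\|_2
\qquad \text{s.t.}\quad x_{t+1} = A x_t + v_t .
```

Because the group norms ‖v_t‖₂ enter unsquared, most v_t are driven exactly to zero at the optimum, so the single parameter λ directly trades fit against the number of jumps.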
This paper proposes a general convex framework for the identification of switched linear systems. The proposed framework uses over-parameterization to avoid solving the otherwise combinatorially forbidding identification problem, and takes the form of a least-squares problem with a sum-of-norms regularization, a generalization of the ℓ_{1}-regularization. The regularization constant regulates the complexity and is used to trade off the fit and the number of submodels.
Segmentation of time-varying systems and signals into models whose parameters are piecewise constant in time is an important and well studied problem. Here it is formulated as a least-squares problem with sum-of-norms regularization over the state parameter jumps, a generalization of ℓ1-regularization. A nice property of the suggested formulation is that it has only one tuning parameter, the regularization constant, which is used to trade off fit and the number of segments.
The problem of estimating continuous-time model parameters of linear dynamical systems using sampled time-domain input and output data has received considerable attention over the past decades and has been approached by various methods. The research topic also bears practical importance due to both its close relation to first principles modelling and equally to linear model-based control design techniques, most of which are carried out in continuous time. Nonetheless, as the performance of the existing algorithms for continuous-time model identification has seldom been assessed and has thus far not been considered in a comprehensive study, the practical potential of existing methods remains highly questionable. The goal of this brief paper is to bring forward a first study on this issue and to factually highlight the main aspects of interest. As such, an analysis is performed on a benchmark designed to be consistent both from a system identification viewpoint and from a control-theoretic one. It is concluded that robust initialization aspects require further research focus towards reliable algorithm development.
Inspired by ideas taken from the machine learning literature, new regularization techniques have been recently introduced in linear system identification. In particular, all the adopted estimators solve a regularized least squares problem, differing in the nature of the penalty term assigned to the impulse response. Popular choices include atomic and nuclear norms (applied to Hankel matrices) as well as norms induced by the so-called stable spline kernels. In this paper, a comparative study of estimators based on these different types of regularizers is reported. Our findings reveal that stable spline kernels outperform approaches based on atomic and nuclear norms since they suitably embed information on impulse response stability and smoothness. This point is illustrated using the Bayesian interpretation of regularization. We also design a new class of regularizers defined by "integral" versions of stable spline/TC kernels. Under quite realistic experimental conditions, the new estimators outperform classical prediction error methods also when the latter are equipped with an oracle for model order selection.
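A minimal numpy sketch of kernel-regularized FIR estimation with a TC ("tuned/correlated") kernel may help fix ideas (the system, constants, and hyperparameter values are invented for this example, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 20, 200                        # FIR order and number of data
g_true = 0.8 ** np.arange(n)          # a stable, smooth impulse response
u = rng.standard_normal(N)
# regression matrix of past inputs: y_t = sum_k g_k u_{t-k} + e_t
Phi = np.array([[u[t - k] if t - k >= 0 else 0.0 for k in range(n)]
                for t in range(N)])
y = Phi @ g_true + 0.1 * rng.standard_normal(N)

# TC kernel: P[i, j] = c * alpha**max(i, j) encodes decay and smoothness
c, alpha, sigma2 = 1.0, 0.8, 0.01
i = np.arange(n)
P = c * alpha ** np.maximum.outer(i, i)
# regularized least squares: (Phi'Phi + sigma2 * P^{-1}) g = Phi' y
g_hat = np.linalg.solve(Phi.T @ Phi + sigma2 * np.linalg.inv(P), Phi.T @ y)
```

The kernel plays the role of a Bayesian prior covariance on the impulse response, which is the interpretation the paper uses to compare regularizers.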
Most of the currently used techniques for linear system identification are based on classical estimation paradigms coming from mathematical statistics. In particular, maximum likelihood and prediction error methods represent the mainstream approaches to identification of linear dynamic systems, with a long history of theoretical and algorithmic contributions. Parallel to this, in the machine learning community alternative techniques have been developed. Until recently, there has been little contact between these two worlds. The first aim of this survey is to make accessible to the control community the key mathematical tools and concepts as well as the computational aspects underpinning these learning techniques. In particular, we focus on kernel-based regularization and its connections with reproducing kernel Hilbert spaces and Bayesian estimation of Gaussian processes. The second aim is to demonstrate that learning techniques tailored to the specific features of dynamic systems may outperform conventional parametric approaches for identification of stable linear systems.
Identification of nonlinear stochastic processes via differential neural networks is discussed. A new "dead-zone" type learning law for the weight dynamics is suggested. By a stochastic Lyapunov-like analysis the stability conditions for the identification error as well as for the neural network weights are established. The adaptive trajectory tracking using the obtained neural network model is realized for the subclass of stochastic completely controllable processes linearly dependent on control. The upper bounds for the identification and adaptive tracking errors are established.
Subspace identification methods (SIMs) for estimating state-space models have been proven to be very useful and numerically efficient. They exist in several variants, but all have one feature in common: as a first step, a collection of high-order ARX models is estimated from vectorized input-output data. In order not to obtain biased estimates, this step must include future outputs. However, all but one of the submodels include non-causal input terms. Their coefficients will be correctly estimated as zero as more data become available, but these extra model parameters give unnecessarily high variance and also cause bias for closed-loop data. In this paper, a new model formulation is suggested that circumvents the problem. Within the framework, the system matrices (A,B,C,D) and Markov parameters can be estimated separately. It is demonstrated through analysis that the new methods generally give smaller variance in the estimate of the observability matrix, and simulation studies support that this also gives lower variance for system invariants such as the poles.
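The effect of a redundant non-causal term can be seen in a small least-squares sketch (an invented first-order system with open-loop data, not the SIM formulation of the paper): the coefficient of the future input u_{t+1} is estimated close to zero, as the abstract states, yet it still occupies a parameter that adds variance.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 2000
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.5 * y[t - 1] + 1.0 * u[t - 1] + 0.1 * rng.standard_normal()

# ARX predictor augmented with one (redundant) non-causal input term u_{t+1}
rows = range(1, N - 1)
Phi = np.array([[y[t - 1], u[t - 1], u[t + 1]] for t in rows])
Y = y[1:N - 1]
a, b_causal, b_noncausal = np.linalg.lstsq(Phi, Y, rcond=None)[0]
```

For closed-loop data, u_{t+1} would correlate with past noise through the feedback, and the same term would instead introduce bias.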
Identification for robust control must deliver not only a nominal model, but also a reliable estimate of the uncertainty associated with the model. This paper addresses recent approaches to robust identification, that aim at dealing with contributions from the two main uncertainty sources: unmodeled dynamics and noise affecting the data. In particular, non-stationary Stochastic Embedding, Model Error Modeling based on prediction error methods and Set Membership Identification are considered. Moreover, we show how Set Membership Identification can be embedded into a Model Error Modeling framework. Model validation issues are easily addressed in the proposed framework. A discussion of asymptotic properties of all methods is presented. For all three methods, uncertainty is evaluated in terms of the frequency response, so that it can be handled by H∞ control techniques. An example, where a nontrivial undermodeling is ensured by the presence of a nonlinearity in the system generating the data, is presented to compare these methods.
We propose a nonparametric approach for the identification of Wiener systems. We model the impulse response of the linear block and the static nonlinearity using Gaussian processes. The hyperparameters of the Gaussian processes are estimated using an iterative algorithm based on stochastic approximation expectation-maximization. In the iterations, we use elliptical slice sampling to approximate the posterior distribution of the impulse response and update the hyperparameter estimates. The same sampling is finally used to sample the posterior distribution and to compute point estimates. We compare the proposed approach with a parametric approach and a semi-parametric approach. In particular, we show that the proposed method has an advantage when a parametric model for the system is not readily available.
Recently, pathfollowing algorithms for parametric optimization problems with piecewise linear solution paths have been developed within the field of regularized regression. This paper presents a generalization of these algorithms to a wider class of problems. It is shown that the approach can be applied to the nonparametric system identification method, Direct Weight Optimization (DWO), and be used to enhance the computational efficiency of this method. The most important design parameter in the DWO method is a parameter (λ) controlling the bias-variance trade-off, and the use of parametric optimization with piecewise linear solution paths means that the DWO estimates can be efficiently computed for all values of λ simultaneously. This allows for designing computationally attractive adaptive bandwidth selection algorithms. One such algorithm for DWO is proposed and demonstrated in two examples.
This paper addresses the problem of identification of hybrid dynamical systems, by focusing attention on hinging hyperplanes and Wiener piecewise affine autoregressive exogenous models, in which the regressor space is partitioned into polyhedra with affine submodels for each polyhedron. In particular, we provide algorithms based on mixed-integer linear or quadratic programming which are guaranteed to converge to a global optimum. For the special case where the estimation data only seldom switches between the different submodels, we also suggest a way of trading off between optimality and complexity by using a change detection approach.
A general framework for estimating nonlinear functions and systems is described and analyzed in this paper. Identification of a system is seen as estimation of a predictor function. The considered predictor function estimate at a particular point is defined to be affine in the observed outputs, and the estimate is defined by the weights in this expression. For each given point, the maximal mean-square error (or an upper bound) of the function estimate over a class of possible true functions is minimized with respect to the weights, which is a convex optimization problem. This gives different types of algorithms depending on the chosen function class. It is shown how the classical linear least squares is obtained as a special case and how unknown-but-bounded disturbances can be handled. Most of the paper deals with the method applied to locally smooth predictor functions. It is shown how this leads to local estimators with a finite bandwidth, meaning that only observations in a neighborhood of the target point will be used in the estimate. The size of this neighborhood (the bandwidth) is automatically computed and reflects the noise level in the data and the smoothness priors. The approach is applied to a number of dynamical systems to illustrate its potential.
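As a side note, the affine-in-the-outputs estimator form that the abstract above describes can be sketched quickly. The following is an illustration only: the weights here come from a hand-picked Gaussian kernel, not from the minimax convex optimization that DWO itself performs, and the data-generating function is invented.

```python
import numpy as np

# Illustrative sketch: an estimate fhat(x*) = sum_k w_k(x*) y_k that is affine
# (here linear) in the observed outputs. DWO would instead choose the weights
# w_k by minimizing the worst-case mean-square error over a function class;
# the simple kernel weights below merely exhibit the same estimator form.
rng = np.random.default_rng(5)
xs = np.sort(rng.uniform(-3, 3, 200))           # invented scalar regressors
ys = np.sin(xs) + 0.1 * rng.standard_normal(200)  # noisy observations

def fhat(x_star, h=0.3):
    k = np.exp(-0.5 * ((xs - x_star) / h) ** 2)  # only nearby samples get weight
    w = k / k.sum()                              # weights sum to one
    return w @ ys                                # linear in the outputs ys
```

Note that the "finite bandwidth" property mentioned in the abstract corresponds here to the kernel weights being effectively zero outside a neighborhood of the target point; in DWO that neighborhood size is computed automatically rather than fixed by a user-chosen h.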
This paper consists of two parts. In the first, more theoretic part, two Wiener systems driven by the same Gaussian noise excitation are considered. For each of these systems, the best linear approximation (BLA) of the output (in mean square sense) is calculated, and the residuals, defined as the difference between the actual output and the linearly simulated output, are considered for both outputs. The paper focuses on the study of the linear relations that exist between these residuals. Explicit expressions are given as a function of the dynamic blocks of both systems, generalizing earlier results obtained by Brillinger [Brillinger, D. R. (1977). The identification of a particular nonlinear time series system. Biometrika, 64(3), 509–515] and Billings and Fakhouri [Billings, S. A., & Fakhouri, S. Y. (1982). Identification of systems containing linear dynamic and static nonlinear elements. Automatica, 18(1), 15–26]. Compared to these earlier results, a much wider class of static nonlinear blocks is allowed, and the efficiency of the estimate of the linear approximation between the residuals is considerably improved. In the second, more practical, part of the paper, this new theoretical result is used to generate initial estimates for the transfer function of the dynamic blocks of a Wiener–Hammerstein system. This method is illustrated on experimental data.
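The BLA of a static nonlinear block under Gaussian excitation, the building block of the analysis above, is easy to illustrate numerically. The sketch below is our own toy example (a cubic nonlinearity, no dynamics), not the paper's generalized setting; it shows the mean-square-sense BLA gain and the residual that the paper studies.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000
u = rng.standard_normal(N)            # Gaussian excitation, unit variance
y = u**3                              # invented static nonlinear block f(u) = u^3

# BLA gain in the mean-square sense: k = E[u*y] / E[u^2].
# For f(u) = u^3 and unit-variance Gaussian u, E[u^4] = 3, so k -> 3.
k_hat = np.mean(u * y) / np.mean(u**2)

# The residual is the part of the output the BLA cannot explain; by
# construction it is uncorrelated with the input.
residual = y - k_hat * u
```

The paper's contribution concerns the linear relations between two such residual sequences when two Wiener systems share the same excitation; the single-system computation above is only the starting point.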
This paper is concerned with the parameter estimation of a general class of nonlinear dynamic systems in state-space form. More specifically, a Maximum Likelihood (ML) framework is employed and an Expectation Maximisation (EM) algorithm is derived to compute these ML estimates. The Expectation (E) step involves solving a nonlinear state estimation problem, where the smoothed estimates of the states are required. This problem lends itself perfectly to the particle smoother, which provides arbitrarily good estimates. The maximisation (M) step is solved using standard techniques from numerical optimisation theory. Simulation examples demonstrate the efficacy of our proposed solution.
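To give a feel for the E-step above, here is a deliberately crude sketch: a bootstrap particle filter with ancestor tracing, which yields (degenerate) smoothed state estimates for an invented scalar linear-Gaussian model. The paper uses a proper particle smoother with better variance properties; model, dimensions, and particle count below are all our own choices.

```python
import numpy as np

# Toy model (invented): x_{t+1} = a*x_t + v_t, y_t = x_t + e_t.
rng = np.random.default_rng(1)
a, q, r, N, M = 0.8, 0.1, 0.1, 100, 500   # dynamics, noise vars, horizon, particles

x = np.zeros(N)
for t in range(1, N):
    x[t] = a * x[t-1] + rng.normal(0, np.sqrt(q))
y = x + rng.normal(0, np.sqrt(r), N)

# Bootstrap particle filter with multinomial resampling at every step.
particles = rng.normal(0, 1, (N, M))
ancestors = np.zeros((N, M), dtype=int)
for t in range(N):
    if t > 0:
        particles[t] = a * particles[t-1, ancestors[t]] + rng.normal(0, np.sqrt(q), M)
    w = np.exp(-0.5 * (y[t] - particles[t])**2 / r)
    w /= w.sum()
    if t + 1 < N:
        ancestors[t+1] = rng.choice(M, size=M, p=w)

# Trace surviving particles back through their ancestry to get (degenerate)
# smoothed trajectories, then average: a rough approximation of E[x_t | y_1:N].
idx = rng.choice(M, size=M, p=w)
traj = np.zeros((N, M))
for t in range(N - 1, -1, -1):
    traj[t] = particles[t, idx]
    idx = ancestors[t, idx]
x_smooth = traj.mean(axis=1)
```

Ancestor tracing suffers from path degeneracy for early time points, which is exactly why dedicated particle smoothers (as used in the paper) are preferred for the E-step.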
A nonlinear black-box structure for a dynamical system is a model structure that is prepared to describe virtually any nonlinear dynamics. There has been considerable recent interest in this area, with structures based on neural networks, radial basis networks, wavelet networks and hinging hyperplanes, as well as wavelet-transform-based methods and models based on fuzzy sets and fuzzy rules. This paper describes all these approaches in a common framework, from a user's perspective. It focuses on what are the common features in the different approaches, the choices that have to be made and what considerations are relevant for a successful system-identification application of these techniques. It is pointed out that the nonlinear structures can be seen as a concatenation of a mapping form observed data to a regression vector and a nonlinear mapping from the regressor space to the output space. These mappings are discussed separately. The latter mapping is usually formed as a basis function expansion. The basis functions are typically formed from one simple scalar function, which is modified in terms of scale and location. The expansion from the scalar argument to the regressor space is achieved by a radial- or a ridge-type approach. Basic techniques for estimating the parameters in the structures are criterion minimization, as well as two-step procedures, where first the relevant basis functions are determined, using data, and then a linear least-squares step to determine the coordinates of the function approximation. A particular problem is to deal with the large number of potentially necessary parameters. This is handled by making the number of ‘used’ parameters considerably less than the number of ‘offered’ parameters, by regularization, shrinking, pruning or regressor selection.
The determination of the resolution parameter when estimating frequency functions of linear systems involves a trade-off between bias and variance. Traditional approaches, like 'window-closing', employ a global resolution parameter - the window width - that is tuned by ad hoc methods, usually visual inspection of the results. Here we explore more sophisticated estimation methods, based on local polynomial modeling, that tune such parameters by automatic procedures. A further benefit is that the tuning can be performed locally, i.e., that different resolutions can be used in different frequency bands. The approach is thus a conceptually simple and useful alternative to more established multi-resolution techniques like wavelets. The advantages of the method are illustrated in a numerical simulation.
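The basic local polynomial step can be sketched in a few lines. This is an illustration with invented data (a noisy magnitude estimate of a first-order system) and a fixed, global bandwidth h; the paper's point is precisely that h can be tuned automatically and locally.

```python
import numpy as np

rng = np.random.default_rng(2)
w = np.linspace(0, np.pi, 200)                       # frequency grid
g_true = 1.0 / np.abs(1 - 0.7 * np.exp(-1j * w))     # |G| of an invented system
g_noisy = g_true + rng.normal(0, 0.1, w.size)        # crude "raw" estimate

def local_poly(w0, h, deg=2):
    """Weighted LS fit of a degree-`deg` polynomial around w0; value at w0.

    The Gaussian kernel width h is the resolution parameter: small h gives
    low bias / high variance, large h the opposite.
    """
    k = np.exp(-0.5 * ((w - w0) / h) ** 2)           # kernel weights
    coeffs = np.polyfit(w - w0, g_noisy, deg, w=np.sqrt(k))
    return coeffs[-1]                                # polynomial value at w0

g_smooth = np.array([local_poly(w0, h=0.15) for w0 in w])
```

Replacing the fixed h=0.15 by a frequency-dependent h(w0) chosen from the data is what gives the locally adaptive resolution described in the abstract.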
In this paper five different recursive identification methods will be analyzed and compared, namely recursive versions of the least squares method, the instrumental variable method, the generalized least squares method, the extended least squares method and the maximum likelihood method. They are shown to be similar in structure and in their requirements on computer storage and time. Making use of recently developed theory for asymptotic analysis of recursive stochastic algorithms, these methods are examined from a theoretical viewpoint. Possible convergence points and their global and local convergence properties are studied. The theoretical analysis is illustrated and supplemented by simulations.
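The common structure of these methods is easiest to see in the simplest member of the family, recursive least squares. The sketch below is the textbook RLS update on invented data; the other four methods modify mainly how the regressor and prediction error are formed, not the gain and covariance recursions.

```python
import numpy as np

rng = np.random.default_rng(3)
theta_true = np.array([1.5, -0.7])             # invented true parameters
N = 2000

theta = np.zeros(2)                            # parameter estimate
P = 1e3 * np.eye(2)                            # "covariance" of the estimate
for phi in rng.standard_normal((N, 2)):        # regressor at each step
    yt = phi @ theta_true + 0.1 * rng.standard_normal()
    eps = yt - phi @ theta                     # prediction error
    K = P @ phi / (1.0 + phi @ P @ phi)        # gain vector
    theta = theta + K * eps                    # parameter update
    P = P - np.outer(K, phi @ P)               # covariance update
```

In the recursive IV, GLS, ELS and ML variants the same gain/covariance recursions reappear, with phi replaced by instruments, filtered regressors, or gradient approximations, which is the structural similarity the paper analyzes.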
Input saturation is inevitable in many engineering applications. Most existing iterative learning control (ILC) algorithms that can deal with input saturation require that the reference signal is realizable within the saturation bound. For engineering systems without precise models, it is hard to verify this requirement. In this note, a "reference governor" (RG) is introduced and is incorporated with the available ILC algorithms (primary ILC algorithms). The role of the RG is to re-design the reference signal so that the modified reference signal is realizable. Two types of the RG are proposed: one modifies the amplitude of the reference signal and the other modifies the frequency. Our main results provide design guidelines for two RGs. Moreover, a design trade-off between the convergence speed and tracking performance is also discussed. A simple simulation result verifies the effectiveness of the proposed methods.
In this contribution, variance properties of L2 model reduction are studied. That is, given an estimated model of high order we study the resulting variance of an L2 reduced approximation. The main result of the paper is showing that estimating a low-order output error (OE) model via L2 model reduction of a high-order model gives a smaller variance compared to estimating a low-order model directly from data in case of undermodeling. This has previously been shown to hold for Finite Impulse Response models, but is in this paper extended to general linear OE models.
In this contribution we examine certain variance properties of model reduction. The focus is on L2 model reduction, but some general results are also presented. These general results can be used to analyze various other model reduction schemes. The models we study are finite impulse response (FIR) and output error (OE) models. We compare the variance of two estimated models. The first one is estimated directly from data and the other one is computed by reducing a high order model, by L2 model reduction. In the FIR case we show that it is never better to estimate the model directly from data, compared to estimating it via L2 model reduction of a high order FIR model. For OE models we show that the reduced model has the same variance as the directly estimated one if the reduced model class used contains the true system.
A common approach to regulator design is to define an objective function, which is minimized with respect to adjustable regulator parameters. Here we discuss how such objective functions can be minimized on-line, thus providing adaptive control. Such an approach to adaptive control has its roots in early contributions to learning systems, and it is here further developed and discussed in the light of the recent development of the field. A general algorithm is given and special attention is paid to minimization of quadratic criteria. A key problem is to obtain information about the system dynamics in order to compute the derivatives of the criterion with respect to the regulator parameters. It is shown that the self-tuning regulator is obtained as a special case, corresponding to a particular quadratic criterion and a particular way of estimating the system dynamics. Explicit ways using instrumental variables techniques based on extra injected noise are also discussed. A specific feature of the algorithm is that it does not utilize specific knowledge about how to calculate the optimal regulator. The algorithm is the same for minimum phase as well as for non-minimum phase systems.
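The core idea above, adjusting regulator parameters by descending the gradient of an objective function, can be sketched on a toy problem. This is our own offline, noise-free illustration with a finite-difference gradient; the paper's algorithm works on-line and obtains the required derivative information from estimates of the system dynamics.

```python
# Toy sketch (invented plant and criterion): tune a scalar feedback gain theta
# by gradient descent on a quadratic cost, with the gradient approximated by
# central finite differences.
def cost(theta, rho=0.1, T=50):
    """Quadratic cost for the noise-free plant x+ = 0.9 x + u with u = -theta x."""
    x, J = 1.0, 0.0
    for _ in range(T):
        u = -theta * x
        J += x * x + rho * u * u                # quadratic criterion
        x = 0.9 * x + u                         # plant update
    return J

theta, step, d = 0.5, 0.5, 1e-2                 # initial gain, step size, FD width
for _ in range(40):
    grad = (cost(theta + d) - cost(theta - d)) / (2 * d)
    theta -= step * grad
# theta now approximates the optimal (LQ) feedback gain for this toy plant
```

The key difficulty the paper addresses is that, for a real system, the derivative of the criterion with respect to the regulator parameters is not computable like this; it must be estimated from operating data, e.g. via instrumental variables with extra injected noise.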
Using the parity-space approach, a residual is formed by applying a projection to a batch of observed data and this is a well established approach to fault detection. Based on a stochastic state space model, the parity-space residual can be put into a stochastic framework where conventional hypothesis tests apply. In an on-line application, the batch of data corresponds to a sliding window and in this contribution we develop an improved on-line algorithm that extends the parity-space approach by taking prior information from previous observations into account. For detection of faults, the Generalized Likelihood Ratio (GLR) test is used. This framework allows for including prior information about the initial state, yielding a test statistic with a significantly higher sensitivity to faults. Another key advantage with this approach is that it can be extended to nonlinear systems using an arbitrary nonlinear filter for state estimation, and a linearized model around a nominal state trajectory in the sliding window. We demonstrate the algorithm on data from an Inertial Measurement Unit (IMU), where small and incipient magnetic disturbances are detected using a nonlinear system model.
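The parity-space projection that the residual above builds on can be sketched for a small invented system. The example below is autonomous and noiseless for clarity (no input term, no GLR test): the residual is the stacked output batch projected onto the left null space of the extended observability matrix, so it vanishes for fault-free data and reacts to an additive sensor fault.

```python
import numpy as np

# Invented observable system x_{t+1} = A x_t, y_t = C x_t.
A = np.array([[0.9, 1.0], [0.0, 0.9]])
C = np.array([[1.0, 0.0]])
L = 5                                            # sliding-window length

# Extended observability matrix O = [C; CA; ...; CA^(L-1)].
O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(L)])
U, s, Vt = np.linalg.svd(O)
W = U[:, np.linalg.matrix_rank(O):]              # basis for the left null space of O

# Fault-free output batch Y = O x gives a (numerically) zero residual.
x0 = np.array([1.0, -0.5])
Y = np.array([(C @ np.linalg.matrix_power(A, k) @ x0).item() for k in range(L)])
r_nominal = W.T @ Y

# An additive sensor fault in the last three samples shows up in the residual.
Y_fault = Y + np.r_[np.zeros(2), 0.5 * np.ones(3)]
r_fault = W.T @ Y_fault
```

With process and measurement noise added, the residual becomes a Gaussian vector with known covariance, which is what places it in the hypothesis-testing framework and lets the paper apply the GLR test with prior information about the initial state.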
An algorithm is described for the selection of model structure for identifying state-space models of ‘black box’ character. The algorithm receives as ‘input’ a given system in a given parametrization. It is then tested whether this parametrization is suitable (well conditioned) for identification purposes. If not, a better one is selected and the transformation of the system to the new representation is performed. This algorithm can be used as a block both in an iterative, off-line identification procedure, and for recursive, on-line identification. It can be called whenever there is some indication that the model structure is ill-conditioned. It is discussed how the model structure selection algorithm can be interfaced with an off-line identification procedure. A complete procedure is described and tested on real and simulated data.