Automated Design of an FDI-System for the Wind Turbine Benchmark

The present paper proposes an FDI-system for the wind turbine benchmark designed by application of a generic automated design method, in which the number of required human decisions and assumptions are ...


Introduction
Wind turbines account for a growing part of power production. The demands for reliability are high, since wind turbines are expensive and their downtime should be minimized. One potential way to meet the reliability demands is to adopt fault tolerant control (FTC), that is, to prevent faults from developing into failures by taking appropriate actions. A typical action is reconfiguration of the control system. An essential part of an FTC system is the fault detection and isolation (FDI) system, see, for example, [1]. To obtain good detection and isolation of faults, model-based FDI is often necessary.
Design of a complete model-based FDI system is a complex task and by necessity involves several decisions, for example, method choices, tuning of parameters, and assumptions regarding noise distributions and the nature of the faults to be diagnosed. In general, an optimal solution requires detailed knowledge of the behavior of the considered system, something that is rarely available for real applications. In this paper, inspired by work with real industrial applications, we propose an automated design method that minimizes the number of required human decisions and assumptions. Furthermore, we investigate the potential of designing an FDI system for the wind turbine benchmark, see [2], using this automated method.
The design method is composed of three main steps. In the first step, a large set of candidate residual generators is generated using the algorithm described in [3]. In the second step, the residual generators most suitable to be included in the final FDI system are selected and realized by means of a greedy selection algorithm, based on ideas elaborated in [4]. The realization, or construction, of residual generators is done using the algorithms presented in [5]. In the third and final step, we design diagnostic tests based on the residuals obtained as output from the selected set of residual generators. The diagnostic tests rely on a novel methodology based on a comparison of the probability distributions of no-fault residuals, estimated offline using no-fault training data, and the distributions of residuals estimated online using current data.
As it turns out, the proposed FDI system performs well when evaluated on the test sequence described in [2]. A tailor-made FDI system perfectly tuned for the wind turbine benchmark would probably perform better than the one we propose. However, in relation to the minimal effort required for application of the automated design method, and in spite of no extra tuning or specific adaptation to the benchmark,

The Wind Turbine Model
The wind turbine system is described and modeled in [2], to which we refer for details. The considered wind turbine has three rotor blades and contains four subsystems: blade and pitch system, drive train, generator and converter, and controller, see Figure 1 and Table 1.

State-Space Realization of Transfer Functions.
The pitch system and converter are modeled as frequency-domain transfer functions. The residual generation algorithm we intend to apply assumes a model described by differential and algebraic equations. To obtain a model in this form, the transfer functions are realized as time-domain state-space systems.
The relation between pitch angle reference $\beta_r$ and pitch angle output $\beta_i$, for each of the three blades and thus for $i = 1, 2, 3$, can be realized in state-space form using observable canonical form, see, for example, [6], as follows:
$$\dot{x}_{\beta_i 1}(t) = -2\zeta\omega_n x_{\beta_i 1}(t) + x_{\beta_i 2}(t), \tag{1a}$$
$$\dot{x}_{\beta_i 2}(t) = -\omega_n^2 x_{\beta_i 1}(t) + \omega_n^2 \beta_r(t), \tag{1b}$$
$$\beta_i(t) = x_{\beta_i 1}(t), \tag{1c}$$
where $\zeta$ and $\omega_n$ are parameters, and $x_{\beta_i 1}$ and $x_{\beta_i 2}$ are state variables. Using the same approach, the relation between converter reference $\tau_{g,r}$ and output $\tau_g$ can be written as
$$\dot{x}_{\tau_g}(t) = -\alpha_{gc} x_{\tau_g}(t) + \alpha_{gc} \tau_{g,r}(t), \tag{2a}$$
$$\tau_g(t) = x_{\tau_g}(t), \tag{2b}$$
where $\alpha_{gc}$ is a parameter, and $x_{\tau_g}$ is the state variable.
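As a minimal sketch of how the realization (1a)-(1c) behaves, the following Python snippet integrates the pitch system with a forward Euler scheme for a constant reference. The function name, the step size, and the parameter values $\zeta = 0.6$ and $\omega_n = 11.11$ rad/s are illustrative assumptions and are not given in the text.

```python
def simulate_pitch(beta_ref, zeta, omega_n, dt, steps):
    """Integrate (1a)-(1c) with forward Euler for a constant reference.

    All numerical values passed in by the caller are illustrative assumptions.
    """
    x1, x2 = 0.0, 0.0  # states x_{beta_i 1} and x_{beta_i 2}
    for _ in range(steps):
        dx1 = -2.0 * zeta * omega_n * x1 + x2                  # (1a)
        dx2 = -omega_n ** 2 * x1 + omega_n ** 2 * beta_ref     # (1b)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1  # output beta_i = x_{beta_i 1}, see (1c)


# Constant reference of 5 degrees, simulated for 5 seconds.
beta = simulate_pitch(beta_ref=5.0, zeta=0.6, omega_n=11.11, dt=1e-3, steps=5000)
print(beta)
```

Since the realization has unit static gain, the output settles at the reference value, which is a quick sanity check of the signs in (1a) and (1b).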
To incorporate fault information in the nominal model, we have chosen to model all faults as additive signals in the corresponding equations. Thus, we do not take into account all information regarding the nature of the faults given in [2]. Consider, for example, fault $\Delta\beta_1$, which represents an actuator fault in pitch system 1, see (1a)-(1c), resulting in changed dynamics of $\beta_1$ due to dropped main line pressure or high air content in the oil. One possible way to model this fault would be as a deviation in the parameters $\omega_n$ and $\zeta$ in (1a) and (1b). With the chosen approach, the fault is instead modeled as an additive signal in (1c) for $i = 1$, that is,
$$\beta_1(t) = x_{\beta_1 1}(t) + \Delta\beta_1(t). \tag{3}$$
Note that the adopted fault modeling approach is general and no assumptions are made regarding, for example, the time behavior of faults. Thus, the approach is able to handle, for example, multiplicative faults even though the fault signal is assumed to be additive. Consider, for example, a multiplicative fault in $\beta_1$ given by $\beta_1 = \delta \cdot x_{\beta_1 1}$, where $\delta \neq 1$, which can be equivalently described by $\beta_1 = x_{\beta_1 1} + \Delta\beta_1$, where $\Delta\beta_1 = (\delta - 1)\, x_{\beta_1 1}$.
The main argument for using this more general approach is that we consider it hard, or even impossible, to know exactly how a faulty component behaves in reality. Furthermore, data from all fault cases for evaluation and validation of a more detailed model are seldom available. Modeling faults in this way also results in a minimum of fault modes. This is beneficial since it gives a smaller model, which simplifies several steps in model-based diagnosis, for example, residual generation and isolation. In addition, regarding how diagnosis information is utilized, for example, for fault tolerant control, it is unnecessary to distinguish between different fault modes if they are associated with the same action or consequence. Indeed, this applies to all sensor faults in the wind turbine, since the system should be reconfigured regardless of the type of sensor fault, that is, fixed value or gain factor, see [2, Table 2]. Last, but not least, an additional important motivator is simplicity, since extending the nominal model with additive fault signals in this way is straightforward.

Model Extensions.
According to [2], the same pitch angle reference signal $\beta_r$ is fed to all three pitch systems (1a)-(1c), that is, $\beta_{i,r} = \beta_r$ for $i = 1, 2, 3$. However, according to the provided Simulink model, see [7], the individual reference signals are instead calculated in a control loop outside the pitch system as given by (4), where $\beta_i$ is given by (1a)-(1c), and $\beta_{i,m1}$ and $\beta_{i,m2}$ are sensor measurements. To incorporate this information in the design of the FDI system, the original wind turbine model is extended with these relations between $\beta_{i,r}$ and $\beta_r$.

Overview of Design Method
The proposed FDI system for the wind turbine is comprised of three subsystems: residual generation, fault detection, and fault isolation, see Figure 2. Measurements, that is, sensor readings, from the wind turbine are fed to a bank of residual generators whose output is a set of residuals. The residuals are used as input to the fault detection block, which contains diagnostic tests based on the residuals. The output from this block, one signal for each residual, indicates if a fault has been detected in the part of the system monitored by the corresponding residual. The result from the fault detection is fed to the fault isolation block in which the detected fault(s) are isolated.
The proposed method supports design of the residual generation and fault detection blocks. Design of the fault isolation block is briefly discussed in Section 6.2. The method contains three essential steps: (1) generate candidate residual generators, (2) select and realize residual generators, (3) construct diagnostic tests, see Figure 3. In the first step, a large set of candidate residual generators is generated. In the second step, the residual generators most suitable to be included in the final FDI system are selected and realized. In the third and final step, we design diagnostic tests based on the residuals obtained as output from the selected set of residual generators.
In the subsequent sections, we describe in detail the different steps of the design method used to create the proposed FDI system for the wind turbine benchmark system. As input to the design method, or prerequisites, we assume a model of the system and no-fault training data. The data is assumed to be expressed as measurements, either real or simulated, of the inputs and outputs of the model in realistic and representative no-fault operating conditions.

Residual Generation
The set of residual generators used in the FDI system is based upon the ideas originally described in [8], where unknown variables in a model are computed by solving equation sets one at a time in a sequence and a residual is obtained by evaluating a redundant equation. Similar approaches are described and exploited in, for example, [1, 5, 9-13]. This class of residual generation methods, referred to as sequential residual generation, has been shown to be successful for real applications and also has the potential to be automated to a high extent.

Sequential Residual Generation.
Some concepts and results of sequential residual generation given in [5], to which we also refer for technical details, will now be briefly recapitulated. We consider a model $(E, X, D, Y)$ to be a set of differential and algebraic equations $E = \{e_1, e_2, \ldots, e_{n_E}\}$ containing unknown variables $X = \{x_1, x_2, \ldots, x_{n_X}\}$, differential variables $D = \{\dot{x}_1, \dot{x}_2, \ldots, \dot{x}_{n_X}\}$, and known variables $Y = \{y_1, y_2, \ldots, y_{n_Y}\}$. The equations in $E$ are, without loss of generality, assumed to be of the form
$$e_i : f_i(\dot{x}, x, y) = 0, \quad i = 1, 2, \ldots, n_E,$$
where $\dot{x}$, $x$, and $y$ are vectors of the variables in $D$, $X$, and $Y$, respectively. Note that the model of the wind turbine presented in Section 2.4 can trivially be cast into this form.

Computation Sequence.
As said above, the main idea in sequential residual generation is to compute unknown variables in the model by solving equation sets one at a time in a sequence and then evaluate a redundant equation to obtain a residual. An essential component in the design of a residual generator is therefore a computation sequence, which describes the order in which the variables should be computed. In [5], a computation sequence is defined as an ordered set of variable and equation pairs:
$$C = \big((V_1, E_1), (V_2, E_2), \ldots, (V_k, E_k)\big), \tag{7}$$
where $V_i \subseteq X \cup D$ and $E_i \subseteq E$. The computation sequence $C$ implies that first the variables in $V_1$ are computed from equations $E_1$, then the variables in $V_2$ from equations $E_2$, possibly using the already computed variables in $V_1$, and so forth.
As an example, consider the computation sequence
$$C = \big((\{\tau_g\}, \{e_{29}\}), (\{\omega_r\}, \{e_{24}\}), (\{\dot{\theta}_\Delta\}, \{e_{14}\}), (\{\dot{\omega}_g\}, \{e_{12}\})\big) \tag{8}$$
for computation of a subset of the unknown variables in the wind turbine model presented in Section 2.4. According to the computation sequence (8), the series of computations begins with computation of variable $\tau_g$ using equation $e_{29}$, then variable $\omega_r$ is computed using equation $e_{24}$, and so on, ending with computation of variable $\dot{\omega}_g$, or in fact $\omega_g$, from equation $e_{12}$.
By construction, see [5], it is guaranteed that no variable is needed before it has been computed. Hence, the series of computations described by the computation sequence exhibits an upper triangular structure. For the computation sequence (8), this series of computations is given by (9a)-(9d). Whether or not it is possible to compute the specified variables from the corresponding equations naturally depends on the properties of the equations. Equally important, however, are the prerequisites in terms of causality assumption, that is, integral and/or derivative causality, and the properties of the computational tools that are available for use. For a detailed discussion, see, for example, [5].
The computation sequence (8) uses solely integral causality when the variables $\theta_\Delta$ and $\omega_g$ are computed using equations $e_{14}$ and $e_{12}$, respectively.

Sequential Residual Generator.
Having computed the unknown variables in $V_1 \cup V_2 \cup \cdots \cup V_k$ according to the computation sequence $C$ in (7), a residual can be obtained by evaluating a redundant equation $e$, that is, an equation in $E$ that is not used in $C$ and whose unknown variables have all been computed. Here, the operator $\mathrm{var}_X(\cdot)$ returns the unknown variables that are contained in an equation set. A residual generator based on a computation sequence $C$ and redundant equation $e$ is referred to as a sequential residual generator.
Journal of Control Science and Engineering
The computation sequence (8) together with equation $e_{26}$ constitutes a sequential residual generator for the wind turbine model. When all variables in the computation sequence (8) have been computed according to (9a)-(9d), the residual is computed as $r = \omega_{g,m1} - \omega_g$.
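The principle can be illustrated with a minimal Python sketch of a sequential residual generator. The three-equation model below is hypothetical and much simpler than the wind turbine model; the equation names, the parameter `a`, and the Euler integration are assumptions made for illustration only.

```python
def run_residual_generator(u_meas, y_meas, a, dt, x0):
    """Toy sequential residual generator (hypothetical model, not the benchmark).

    Computation sequence: (({u}, {e1}), ({dx}, {e2})); residual equation e3.
      e1: u = u_meas
      e2: dx/dt = -a*x + a*u   (solved with integral causality)
      e3: r = y_meas - x       (redundant equation)
    """
    x = x0
    residuals = []
    for u_m, y_m in zip(u_meas, y_meas):
        u = u_m                      # step 1: compute u from e1
        residuals.append(y_m - x)    # evaluate the redundant equation e3
        dx = -a * x + a * u          # step 2: compute dx/dt from e2
        x += dt * dx                 # integrate to obtain x at the next sample
    return residuals


def simulate_plant(u_seq, a, dt, x0):
    """Generate sensor data from the same toy model (no noise, no fault)."""
    x, ys = x0, []
    for u in u_seq:
        ys.append(x)
        x += dt * (-a * x + a * u)
    return ys


u = [1.0] * 200
y = simulate_plant(u, a=2.0, dt=0.01, x0=0.0)
r_nofault = run_residual_generator(u, y, a=2.0, dt=0.01, x0=0.0)

# Additive sensor fault of size 0.5 injected from sample 100 onwards.
y_faulty = [v + (0.5 if k >= 100 else 0.0) for k, v in enumerate(y)]
r_fault = run_residual_generator(u, y_faulty, a=2.0, dt=0.01, x0=0.0)
```

In the no-fault case the residual stays at zero, while the additive sensor fault shows up directly in the residual, mirroring the role of $r = \omega_{g,m1} - \omega_g$ above.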

Finding Sequential Residual Generators.
Regarding implementation aspects, for example, complexity and computational load, it is unnecessary to compute variables that are not contained in the residual equation, or not used to compute any of the variables contained in the residual equation. Furthermore, it is also desirable that computation of variables in each step is performed from as small equation sets as possible. It can be shown, see [5], that the equations in a computation sequence fulfilling the above properties, together with a redundant residual equation, in fact correspond to a minimal structurally overdetermined (MSO) set, see [3]. In other words, a necessary condition for the existence of a sequential residual generator for a model is that the model, or submodel, is an MSO set.

Candidate Residual Generators.
As indicated above, a first step when searching for a sequential residual generator for a model may be to find an MSO set in the model. Thus, an MSO set can be regarded as a candidate residual generator. There are efficient algorithms for finding all MSO sets in large equation sets, see, for example, [3].
Consider now the model of the wind turbine described in Section 2.4, with equations $E = \{e_1, e_2, \ldots, e_{33}\}$ and its sets of unknown variables and known, that is, measured, variables. In summary, the model contains 33 equations, 21 unknown variables, and 15 known variables. By utilizing the structure, that is, which unknown variables are contained in which equation, see, for example, [1], and a MATLAB implementation of the algorithm presented in [3], 1058 MSO sets were found in total.
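To make the notion of an MSO set concrete, the following Python sketch enumerates the MSO sets of a small hypothetical structure by brute force. It uses the characterization that a set is MSO if it has exactly one more equation than unknowns and every proper non-empty subset has at least as many unknowns as equations. This exhaustive search is only meant for toy examples and is not the efficient algorithm of [3]; the structure and all names are assumptions.

```python
from itertools import combinations


def unknowns(eqs, structure):
    """Union of the unknown variables appearing in the given equations."""
    return set().union(*(structure[e] for e in eqs))


def is_mso(eqs, structure):
    """Structural MSO check by exhaustive subset enumeration."""
    eqs = tuple(eqs)
    if len(eqs) != len(unknowns(eqs, structure)) + 1:
        return False
    for k in range(1, len(eqs)):
        for sub in combinations(eqs, k):
            # a proper subset with more equations than unknowns would
            # itself be overdetermined, so the set would not be minimal
            if len(unknowns(sub, structure)) < k:
                return False
    return True


def find_mso_sets(structure):
    """Return all MSO sets of a (small) model structure."""
    names = sorted(structure)
    return [set(c)
            for k in range(1, len(names) + 1)
            for c in combinations(names, k)
            if is_mso(c, structure)]


# Hypothetical structure: equation name -> unknown variables it contains.
structure = {"e1": {"x1"}, "e2": {"x1", "x2"}, "e3": {"x2"}, "e4": {"x1"}}
msos = find_mso_sets(structure)
print(msos)
```

For this toy structure the search finds three MSO sets, each of which is a candidate residual generator in the sense described above.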

Selecting Residual Generators
It is not feasible to implement and use all 1058 candidate residual generators, that is, MSO sets, in the final FDI system. A more attractive approach is instead to pick, from the set of all candidate residual generators, a smaller set of residual generators with desired properties.

Desired Properties of Residual Generators.
The desired properties of the sought set of residual generators are as follows:
(1) the set of residual generators should enable us to isolate all single faults from each other;
(2) a set of residual generators of smaller cardinality is preferred over a larger one, given that the two sets have equal isolability properties;
(3) a residual generator based on an MSO set of smaller cardinality is preferred over a residual generator based on an MSO set of larger cardinality, given that the two sets have equal detectability and isolability properties.
Properties 2 and 3 are mainly motivated by implementation aspects such as complexity, computational load, and numerical issues.
We will base the selection of residual generators on quantitative, structural properties of the MSO sets instead of more qualitative or analytical properties of the actual residual generators. The latter may result in better isolation performance but is considered intractable, since it requires that residual generators are implemented, executed, and evaluated, and also requires access to representative measurement data for all fault cases.

Fault Detectability and Isolability.
To be able to formally state the selection problem, the notions of detectability and isolability are needed. Assuming that each fault occurs in only one equation, let $e_{f_i}$ denote the equation in an equation set $E$ containing fault $f_i$, for example, $e_{\Delta\beta_{1,m1}} = e_{18}$, see Section 2. Note that if a fault $f_j$ occurs in more than one equation, the fault $f_j$ can be replaced with a new variable $x_{f_j}$ in these equations, and the equation $x_{f_j} = f_j$ added to the equation set. This added equation will then be the only equation where $f_j$ occurs. To proceed, let $(\cdot)^+$ denote an operator extracting the overdetermined part of a set of equations. According to [14], a fault $f_i$ is structurally detectable in the equation set $E$ if $e_{f_i} \in (E)^+$ and structurally isolable from fault $f_j$ in the equation set $E$ if $e_{f_i} \in (E)^+$ and $e_{f_j} \notin (E)^+$.
As an example, consider the equation set $M = \{e_{26}, e_{29}, e_{24}, e_{14}, e_{12}\}$ containing the residual equation and the equations from the computation sequence (8), studied in Section 4.1.1. First, we note that the equation set $M$ is an MSO set due to the property of sequential residual generators mentioned in Section 4.1.3. Further, since $M$ is an MSO set, it holds that $(M)^+ = M$, see, for example, [3]. Thus, it can for instance be deduced that fault $\Delta\omega_g$ is structurally isolable from fault $\Delta\beta_{1,m1}$ in $M$, since $e_{\Delta\omega_g} = e_{12}$, $e_{\Delta\beta_{1,m1}} = e_{18}$, and it holds that $e_{12} \in M$ and $e_{18} \notin M$, see Section 2.4. By again utilizing the structure of the wind turbine model, the structural isolability properties of the model were calculated. All considered faults, see Section 2.2, can be (structurally) isolated from each other in the wind turbine model.

Selection Problem Formulation.
We will now formulate the selection problem in terms of properties of a set of MSO sets. To this end, let $\mathcal{M}$ denote the set of all MSO sets in the model, and $F$ the set of considered faults. Let $f_i, f_j \in F$ and define the isolation class for $(f_i, f_j)$ as
$$I_{f_i,f_j} = \{M \in \mathcal{M} : e_{f_i} \in M,\ e_{f_j} \notin M\},$$
that is, $I_{f_i,f_j}$ contains the MSO sets in $\mathcal{M}$ in which fault $f_i$ is structurally isolable from fault $f_j$. Further, let $\mathcal{I}$ denote the set of all isolation classes,
$$\mathcal{I} = \{I_{f_i,f_j} : f_i, f_j \in F,\ f_i \neq f_j\}.$$
To be able to satisfy the isolability property 1 stated above, we want to find a set $S \subseteq \mathcal{M}$ with a nonempty intersection with all isolation classes, that is,
$$S \cap I \neq \emptyset \quad \text{for all } I \in \mathcal{I}. \tag{14}$$
The property (14) on $S$ implies that we should find a so-called hitting set for $\mathcal{I}$. To satisfy property 2, we want to find an $S$ so that $|S|$ is minimized. Thus, the sought hitting set for $\mathcal{I}$ should be of minimal cardinality, that is, we should find a so-called minimal cardinality hitting set (MHS) for $\mathcal{I}$.
There are several possibilities for a metric that helps us find an $S$ that satisfies property 3. We opt for simplicity and have, therefore, chosen to minimize $\sum_{M \in S} |M|$. As an additional requirement, on top of properties 1, 2, and 3 in Section 5.1, we require that at least one residual generator can be constructed from every $M \in S$.
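The isolation classes and the hitting set property can be sketched in a few lines of Python. The MSO sets and the mapping from faults to equations in the usage example are hypothetical toy data; only the last check uses the MSO set $M$ and the fault equations $e_{12}$ and $e_{18}$ mentioned in the text.

```python
def isolation_class(msos, e_fi, e_fj):
    """Indices of the MSO sets in which f_i is structurally isolable from f_j."""
    return {k for k, M in enumerate(msos) if e_fi in M and e_fj not in M}


def isolation_classes(msos, fault_eq):
    """All isolation classes for ordered pairs of distinct faults."""
    return {(fi, fj): isolation_class(msos, fault_eq[fi], fault_eq[fj])
            for fi in fault_eq for fj in fault_eq if fi != fj}


def is_hitting_set(S, classes):
    """True if the index set S intersects every isolation class."""
    return all(S & I for I in classes.values())


# Hypothetical toy data: three MSO sets and three faults.
msos = [{"e1", "e4"}, {"e1", "e2", "e3"}, {"e2", "e3", "e4"}]
fault_eq = {"f1": "e1", "f3": "e3", "f4": "e4"}
classes = isolation_classes(msos, fault_eq)

print(is_hitting_set({0, 1, 2}, classes))  # all MSO sets selected
print(is_hitting_set({0, 1}, classes))     # one isolation class is missed

# Example from the text: e_12 is in M and e_18 is not.
M = {"e26", "e29", "e24", "e14", "e12"}
print(isolation_class([M], "e12", "e18"))
```

Representing each isolation class as a set of indices into the list of MSO sets keeps the hitting set check a plain set intersection.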

Solving the Selection Problem.
The problem of finding a minimal cardinality hitting set is known to be NP-hard, see, for example, [15]. To overcome the complexity issues, we have chosen to compute an approximate solution to the problem in an iterative manner with a greedy selection approach as elaborated in [4].
To accomplish this, we need to specify a utility function, that is, a function that evaluates the usefulness of a given MSO set, and also state the properties of a complete solution to the selection problem. Following the greedy selection approach, we add to the solution the MSO set with the largest utility until the solution is complete. Furthermore, we only add MSO sets from which at least one residual generator can be constructed.

Characterization of a Solution.
We will now characterize a complete solution to the selection problem for use in the selection algorithm. First, we define the isolation class coverage of a set of MSO sets $S \subseteq \mathcal{M}$ as
$$\sigma(S) = \{I_{f_i,f_j} \in \mathcal{I} : \exists M \in S,\ M \in I_{f_i,f_j}\}, \tag{15}$$
which states which of the isolation classes in $\mathcal{I}$ are covered by the MSO sets in $S$. Property 1 in Section 5.1, that is, the isolation or hitting set property, can with the isolation class coverage notion be formulated as $\sigma(S) = \mathcal{I}$. This characterizes a complete solution of the selection problem.

Utility Function.
To evaluate a specific MSO set, we want to take into account the properties 1, 2, and 3 above. For a given MSO set $M$, we use the utility function (16), which consists of two terms weighted against each other by a factor $\gamma$. One term reflects the isolation class coverage of $M$. Maximizing the other term corresponds to picking the MSO set with the smallest cardinality in $\mathcal{M}$, which helps us satisfy property 3. The weighting factor $\gamma$ is used to trade between the two properties reflected by these two terms.
Note that an MSO set maximizing one term in (16) may minimize the other since an MSO set of larger cardinality likely covers more isolation classes than an MSO set of smaller cardinality.

The Selection Algorithm.
The function selectResidualGenerators, used for selecting residual generators by means of greedy selection, is given in Algorithm 1. Input to the function is a set of MSO sets $\mathcal{M}$, that is, a set of candidate residual generators, and a set of isolation classes $\mathcal{I}$. The output is a set of MSO sets $S \subseteq \mathcal{M}$ and a set of residual generators $G$ based on $S$. The function findComputationSequence, described in [5], is used to find a computation sequence in accordance with Section 4.1, given a just-determined set of equations. The function findComputationSequence can be found in Algorithm 2.
For a formal discussion regarding the qualification of using a greedy heuristic for solving the residual generation selection problem, as well as the complexity properties of such algorithms, please refer to [4] and references therein.
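The greedy selection idea can be sketched in Python as follows. This is not a reproduction of Algorithm 1: the utility function is an assumed stand-in for (16), in which $\gamma$ weighs the fraction of newly covered isolation classes and $1 - \gamma$ rewards small MSO sets, and the predicate `realizable` is a hypothetical placeholder for the check that a residual generator can be constructed, for example with findComputationSequence.

```python
def coverage(M, fault_eq):
    """Isolation classes (f_i, f_j) covered by M: e_fi in M and e_fj not in M."""
    return {(fi, fj)
            for fi, ei in fault_eq.items()
            for fj, ej in fault_eq.items()
            if fi != fj and ei in M and ej not in M}


def select_msos(msos, fault_eq, gamma, realizable=lambda M: True):
    """Greedy selection sketch with an ASSUMED utility (stand-in for (16))."""
    if not msos:
        return []
    target = set().union(*(coverage(M, fault_eq) for M in msos))
    max_size = max(len(M) for M in msos)
    candidates = [M for M in msos if realizable(M)]
    selected, covered = [], set()

    def utility(M):
        gain = len(coverage(M, fault_eq) - covered) / len(target)
        return gamma * gain + (1.0 - gamma) * (1.0 - len(M) / max_size)

    while covered != target:
        # only consider MSO sets that cover at least one new isolation class
        useful = [M for M in candidates if coverage(M, fault_eq) - covered]
        if not useful:
            break  # remaining classes cannot be covered by realizable sets
        best = max(useful, key=utility)
        selected.append(best)
        covered |= coverage(best, fault_eq)
    return selected


# Hypothetical toy data: three MSO sets and three faults.
msos = [{"e1", "e4"}, {"e1", "e2", "e3"}, {"e2", "e3", "e4"}]
fault_eq = {"f1": "e1", "f3": "e3", "f4": "e4"}
selected = select_msos(msos, fault_eq, gamma=0.5)
print(selected)
```

Restricting each iteration to MSO sets that cover something new guarantees termination; with $\gamma = 1$ this sketch looks only at coverage, and with smaller $\gamma$ it increasingly favors small MSO sets.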

Selecting Residual Equation.
Note that the total number of sequential residual generators that can potentially be constructed from an MSO set equals the number of equations in the set. All residual generators created from the same MSO set, however, have equal structural fault detectability and isolability properties according to Section 5.2. Nevertheless, their actual fault detectability and isolability may differ due to, for example, different sensitivity to noise. To make the final selection of which of the residual generators created from an MSO set should be included in the final diagnosis system, evaluation by means of execution using real measurements from different fault cases is needed. Since we in this work only assume that no-fault data is available, see Section 3, this is not possible.
In this work, the selection of which residual generator to create from a given MSO set is done so that the final deployment of the FDI system becomes as simple as possible. First of all, findComputationSequence was configured to prefer algebraic equations as residuals before differential equations, if possible. Second, in order to avoid implementation issues related to numerical differentiation, findComputationSequence was configured to prefer computation sequences using integral causality. Using this two-step heuristic, the selection of which residual generator to create from an MSO set, in practice, is more or less unambiguous. In those few cases where more than one candidate remains, we make an arbitrary selection.

Selected Residual Generators.
Both functions selectResidualGenerators and findComputationSequence were implemented in MATLAB. As computational tool, see [5], the algebraic equation solver MAPLE was utilized, which allows symbolic solving of algebraic loops. The input to the algorithm was the set of all 1058 MSO sets for the wind turbine benchmark model.
To investigate the sensitivity of selectResidualGenerators to the parameter $\gamma$, that is, the tradeoff between properties 2 and 3 stated in Section 5.1 and reflected by $|S|$ and $\sum_{M \in S} |M|$, the algorithm was run with the wind turbine model and $0 \le \gamma \le 1$. The result is shown in Table 2, where $S$ denotes the set returned by selectResidualGenerators. When $\gamma = 1$, the aim is to fulfill the isolation property with as few MSO sets as possible, no matter the size of the MSO sets. As seen in Table 2, this results in few, but large, MSO sets. The smaller the $\gamma$, the more attention is paid to the size of the MSO sets. It turns out that $0.1 \le \gamma \le 0.6$ gives a decent tradeoff between $|S|$ and $\sum_{M \in S} |M|$ for the wind turbine model.
With $\gamma = 0.5$, the algorithm selected 16 MSO sets, that is, $|S| = 16$ and $\sum_{M \in S} |M| = 61$. Of the 16 selected MSO sets, 7 contain algebraic equations only. The other 9 MSO sets contain both algebraic and differential equations.
The fault signature matrix (FSM) for the 16 MSO sets on which the selected residual generators are based is given in Table 3.

Fault Detection and Isolation
For fault detection and isolation, diagnostic tests based on the output from each of the 16 residual generators are constructed. Since no assumptions are made regarding the nature of the faults that should be detected, see Section 2.2, nothing is known about the faults' temporal properties, size, rate of occurrence, and so forth. Hence, we may not be able to fully exploit the potential of a general method for change detection such as the CUSUM test, see, for example, [16]. As said in Section 3, we do, however, assume that no-fault training data is available. To take advantage of this fact and also handle uncertainties in terms of modeling errors and measurement noise, we base our diagnostic tests on a comparison of the estimated probability distributions of no-fault and current residuals. The former probability distributions are estimated offline using the available no-fault training data and the latter online using current data. A clear advantage of this approach is that changes in mean and variance are handled in a unified way, since we consider the complete distribution of the residual.

Diagnostic Test Design.
Let $P^{NF}$ be a discrete estimate of the probability distribution of a residual from no-fault data, and $P$ a discrete estimate of the distribution of the same residual from present data, both having $n$ bins. Then, the Kullback-Leibler (K-L) divergence [17] between $P$ and $P^{NF}$ is given by
$$D(P \,\|\, P^{NF}) = \sum_{i=1}^{n} P(i) \log \frac{P(i)}{P^{NF}(i)},$$
where $P(i)$ denotes the $i$th bin of the discrete distribution $P$.
To apply the K-L divergence for construction of a diagnostic test, we proceed as follows. Given a representative batch of no-fault data $Z^{NF}$, that is, in our case measurements of the variables in the set $Z$, which contains the inputs and outputs of the model, we run the set of residual generators and obtain a set of residuals. For each residual $r_i$, we then estimate its probability distribution and obtain $P_i^{NF}$, that is,
$$P_i^{NF} \approx P(R_i \mid Z^{NF}),$$
where $R_i$ is a stochastic variable, discretized in $n$ bins, representing residual $r_i$. As said, this procedure can be done offline. To estimate a probability distribution, we create a normalized histogram with $n$ bins for the data from which the distribution should be estimated.
Online, we continuously estimate the distribution of the current residual $r_i$ using a sliding window containing $N$ samples of $r_i$. If we by $P_i^t$ denote the estimated distribution of $r_i$ calculated at time $t$, that is, $P_i^t \approx P(R_i \mid Z^t)$, where $Z^t$ denotes the batch of data in the sliding window at time $t$, the diagnostic test is designed as
$$T_i = \begin{cases} 1 & \text{if } D(P_i^t \,\|\, P_i^{NF}) \ge J_i, \\ 0 & \text{otherwise,} \end{cases}$$
where $J_i$ is the threshold for alarm. The K-L divergence $D(P_i^t \,\|\, P_i^{NF})$ is referred to as the test quantity of the diagnostic test $T_i$.
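A minimal Python sketch of such a diagnostic test is given below. The residual data are synthetic Gaussian samples, and the bin range, the threshold value 0.05, and the floor used for empty no-fault bins are assumptions of this sketch; the text does not specify how empty bins are handled.

```python
import math
import random


def histogram(samples, lo, hi, n):
    """Normalized histogram with n equal bins on [lo, hi]; outliers are clipped."""
    counts = [0] * n
    for s in samples:
        k = int((s - lo) / (hi - lo) * n)
        counts[min(max(k, 0), n - 1)] += 1
    return [c / len(samples) for c in counts]


def kl_divergence(p, q, floor=1e-6):
    """D(p || q) = sum_i p(i) log(p(i) / q(i)).

    Bins with p(i) = 0 contribute zero; q(i) is floored to avoid division
    by zero (an assumption of this sketch).
    """
    return sum(pi * math.log(pi / max(qi, floor))
               for pi, qi in zip(p, q) if pi > 0.0)


def diagnostic_test(window, p_nf, lo, hi, threshold):
    """Return (alarm, test quantity) for one sliding window of residual samples."""
    p_t = histogram(window, lo, hi, len(p_nf))
    d = kl_divergence(p_t, p_nf)
    return d >= threshold, d


rng = random.Random(0)
n, N = 20, 3000  # number of bins and window size used in the final FDI system

# Offline: estimate the no-fault distribution from synthetic training data.
p_nf = histogram([rng.gauss(0.0, 1.0) for _ in range(20000)], -5.0, 5.0, n)

# Online: one no-fault window and one window with a shifted mean.
alarm_nf, d_nf = diagnostic_test(
    [rng.gauss(0.0, 1.0) for _ in range(N)], p_nf, -5.0, 5.0, 0.05)
alarm_f, d_f = diagnostic_test(
    [rng.gauss(1.0, 1.0) for _ in range(N)], p_nf, -5.0, 5.0, 0.05)
print(alarm_nf, alarm_f)
```

Because the whole histogram is compared, the same test quantity would also react to a change in variance without any modification.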

Fault Isolation Strategy.
Due to uncertainties not captured by the given model nor present in the no-fault training data, the power of diagnostic tests is not ideal for all faults. That is, the probability of detection given a certain fault is not always 1. To take this into account, the isolation scheme will interpret an "x" in a certain row in Table 3 as if the test may respond if the corresponding fault occurs and consequently no conclusions are drawn if a test does not respond, see [18].
To obtain the total diagnosis statement from a set of alarming diagnostic tests, we simply match their fault signatures with the FSM given in Table 3. For example, if only test $T_{10}$ alarms, we look at the row corresponding to $G_{10}$ and conclude that either fault $\Delta\beta_1$ or $\Delta\beta_{1,m2}$ is present. If then also $T_{16}$ alarms, we combine the row corresponding to $G_{16}$ with the row corresponding to $G_{10}$ and conclude that fault $\Delta\beta_1$ must be present.
To handle also multiple faults, we use the fault signatures in the original FSM in Table 3 to create an extended FSM with fault signatures also for multiple faults. This is done by column-wise OR operations in the original FSM. For instance, the column in the FSM for the double fault Δω g,m1 ∧ Δω g,m2 will get "x" in rows corresponding to G 1 , G 7 , G 11 , G 12 , and G 13 and zeros elsewhere. In the fault isolation scheme, we first attempt to isolate all single faults using the original FSM in Table 3. If this does not succeed, we try to isolate double faults, and so forth.
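The isolation strategy can be sketched in Python as follows. The FSM excerpt is hypothetical: it is only chosen to be consistent with the $T_{10}$ and $T_{16}$ example above, and the entry `dbeta1_m1` in the row of $T_{16}$ is an assumption, since Table 3 is not reproduced here.

```python
from itertools import combinations


def isolate(alarms, fsm, max_faults=2):
    """Return the smallest-cardinality fault sets consistent with the alarms.

    fsm maps a test name to the set of faults it may respond to (the "x"
    entries of its row). A test that does not alarm yields no conclusion.
    """
    if not alarms:
        return []
    faults = sorted(set().union(*fsm.values()))
    for size in range(1, max_faults + 1):
        diagnoses = []
        for combo in combinations(faults, size):
            # column-wise OR: a multiple fault may trigger every test that
            # responds to at least one of its constituent faults
            signature = {t for t, row in fsm.items() if row & set(combo)}
            if alarms <= signature:
                diagnoses.append(set(combo))
        if diagnoses:
            return diagnoses  # single faults are tried before double faults
    return []


# Hypothetical FSM excerpt (not Table 3).
fsm = {
    "T10": {"dbeta1", "dbeta1_m2"},
    "T16": {"dbeta1", "dbeta1_m1"},
}
print(isolate({"T10"}, fsm))
print(isolate({"T10", "T16"}, fsm))
```

With only $T_{10}$ alarming, two single-fault candidates remain; adding the alarm from $T_{16}$ narrows the diagnosis to one fault, as in the example in the text.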

Implementation Details
The final FDI system was implemented in Simulink according to the structure in Figure 2. The 16 residual generators were implemented as embedded MATLAB functions (EMF) in which the code was automatically generated from the structures obtained from the functions findComputationSequence and findResidualGenerators. The initial conditions for the states in the dynamic residual generators were derived from the corresponding sensor measurements, if available, and otherwise set to zero. For instance, $\theta_\Delta(t_0) = 0$, $x_{\beta_i 1}(t_0) = (\beta_{i,m1}(t_0) + \beta_{i,m2}(t_0))/2$, and $\omega_g(t_0) = (\omega_{g,m1}(t_0) + \omega_{g,m2}(t_0))/2$. This may cause transients in the residuals, but this is not considered a problem.

Parameter Discussion.
Although the aim is to keep the number of parameters in the automated design method at a minimum, there are nevertheless some parameters that must be set. This section lists the needed parameters and discusses their influence on the performance of the FDI system.

Number of Histogram Bins and Size of Sliding Window.
The number of bins $n$ in the histograms used as distribution estimates is a tradeoff between detection time, noise sensitivity, and complexity, in terms of computational power and memory. A large $n$ results in fast detection, but also in increased sensitivity to noise. In addition, a large $n$ requires more memory and involves more computations than a smaller $n$.
The size N of the sliding window used to batch data for creation of the histograms is a tradeoff between detection performance, noise sensitivity, and complexity. A large N will give the K-L test quantity lowpass characteristics, resulting in a smoothed K-L test quantity. This makes it possible to detect small changes in the estimated distributions. On the other hand, a large N requires more memory. The choice of N is also related to the number of bins n in the histograms and vice versa, since a small N, together with a large n, will result in a sparse histogram. Hence, the choices of N and n must match.
For the wind turbine benchmark model, investigations, however, indicate that the method is quite insensitive to the values of n and N if 15 ≤ n ≤ 50 and 2000 ≤ N ≤ 6000. A decent tradeoff, taking this into account and also the complexity issues discussed above, is n = 20 and N = 3000, which are the values used in the final FDI system.

Alarm Thresholds.
The choice of alarm thresholds $J_i$, $i = 1, 2, \ldots, 16$, is a tradeoff between detection time and the number of false detections. The higher the thresholds, the longer the detection time and the lower the rate of false alarms. The choice of alarm thresholds is related to the choices of $n$ and $N$ since both affect how sensitive a K-L test quantity is to noise, which in turn affects the rate of false detections. We aim at choosing the alarm thresholds so that the number of false detections is minimized, implying that the choice of $J_i$ must match the choices of $n$ and $N$. For the wind turbine benchmark model, the alarm thresholds were computed as a safety factor $\alpha = 1.1$ times the maximum value of the corresponding K-L test quantities from 100 simulations with no-fault data.
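The threshold rule stated above amounts to a one-line computation, sketched here in Python. The function name and the short test quantity sequences are made up for illustration; in the setting of the paper, each inner list would hold the K-L test quantity of one residual over one of the 100 no-fault simulations.

```python
def alarm_threshold(nofault_runs, alpha=1.1):
    """J_i = alpha times the maximum K-L test quantity over all no-fault runs."""
    return alpha * max(max(run) for run in nofault_runs)


# Made-up test quantity sequences from two no-fault simulations.
runs = [[0.01, 0.03, 0.02], [0.02, 0.05, 0.01]]
J = alarm_threshold(runs)
print(J)
```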

Isolation Validation Time.
The only parameter involved in the fault isolation is the isolation validation time t val I . This parameter is used to compensate for the fact that the power of diagnostic tests is not ideal, see Section 6.2. As a result, the detection times for the same fault may differ between diagnostic tests. To handle this, we require that the output from the isolation has been equal for t val I samples before the isolation result is reported. Choosing a large t val I decreases the probability of false isolation but increases the isolation time. For the wind turbine benchmark model, the isolation validation time t val I was set to 4 samples.
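A minimal sketch of such a validation step is given below. It assumes that the isolation output at each sample is a set of fault names and that the result is reported from the sample at which the output has been unchanged for t val I consecutive samples; the exact counting convention and all names are assumptions.

```python
def validated_isolation(statements, t_val=4):
    """Report an isolation output only after it has been equal for t_val consecutive samples.

    statements: per-sample isolation outputs (any comparable value).
    Returns a list with the reported output per sample, or None while unvalidated.
    """
    reported = []
    previous, run_length = None, 0
    for statement in statements:
        if statement == previous:
            run_length += 1
        else:
            previous, run_length = statement, 1
        reported.append(statement if run_length >= t_val else None)
    return reported

# Hypothetical sequence: an ambiguous output for two samples, then a stable one.
ambiguous = frozenset({"dw_r_m1", "dw_r_m2"})
single = frozenset({"dw_r_m1"})
outputs = [ambiguous] * 2 + [single] * 5
result = validated_isolation(outputs, t_val=4)
print(result)
```

Here the short-lived ambiguous output is never reported, and the stable output is reported from its fourth consecutive sample onward.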

Evaluation and Results
To evaluate the performance of the proposed FDI system, we use the test cases described in [2]. The test cases are based on measured wind data and a sequence of injected faults. The set of injected faults, their time of occurrence and description, is specified in Table 4. The sequence contains 5 sensor faults and 3 actuator faults. Note that two faults are injected at 1000-1100 s, that is, at this time, we have the double fault Δω r,m2 ∧ Δω g,m2 .
The no-fault distributions used in the evaluation were estimated from residual data stemming from 100 Monte Carlo simulations with no-fault data, that is, inputs corresponding to the measured variables in Z. Each set of no-fault data was generated with the provided wind turbine model, using a different noise realization according to the model.

Results and Analysis.
By means of Monte Carlo simulations, the FDI system was simulated 100 times with data from the provided wind turbine model setup according to the above-described test sequence.
Based on the results from the 100 runs, the mean time of detection T D , maximum time of detection T max D , minimum time of detection T min D , mean time of isolation T I , minimum time of isolation T min I , the total number of missed detections MD, and the total number of false detections FD, for each of the faults in the test sequence, were computed. The results along with the specified detection requirements [2], given in the row Req., are shown in Table 5, where all time values are given in seconds. Note that the specified requirements concern detection, and not isolation.
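For one fault, the detection statistics can be computed as in the sketch below. The data layout (one alarm time per run, None for a missed detection) and the function name are assumptions; false detections are not counted here, since their definition in the evaluation is not restated in this section.

```python
def detection_stats(alarm_times, fault_time):
    """Summarize detection delays over Monte Carlo runs for one fault.

    alarm_times: first alarm time in seconds per run, or None if the fault was missed.
    fault_time: time of fault occurrence in seconds.
    """
    delays = [t - fault_time for t in alarm_times if t is not None]
    stats = {"MD": len(alarm_times) - len(delays)}
    if delays:
        stats["T_D"] = sum(delays) / len(delays)   # mean time of detection
        stats["T_D_max"] = max(delays)             # maximum time of detection
        stats["T_D_min"] = min(delays)             # minimum time of detection
    return stats

# Toy example with four runs (the paper uses 100 runs).
stats = detection_stats([1500.25, 1500.5, None, 1500.75], fault_time=1500.0)
print(stats)
```

The isolation times T I and T min I can be summarized in the same way from the per-run times at which the isolation result is reported.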
Table 4 (row recovered from the extracted text): fault Δτ g , time 3800-3900 s, description τ g = τ g + 2000 Nm.
According to the row corresponding to T max D in Table 5, all faults in the test sequence could be detected. For faults Δω g,m2 ∧ Δω r,m2 , Δβ 1,m1 , and Δβ 3,m1 , the detection requirements are met in terms of both T D and T max D . All faults except the double fault Δω g,m2 ∧ Δω r,m2 could also be isolated. However, for some faults, for example, Δβ 2,m2 , the mean time of isolation T I is substantially longer than the corresponding mean time of detection. The main reason for this is that some tests respond more slowly to faults than others. As said, fault Δω g,m2 ∧ Δω r,m2 could not be isolated.
In fact, this fault is not uniquely isolable with the isolation strategy described in Section 6.2, since the test response of fault Δω g,m2 ∧ Δω r,m2 is a subset of the test response of fault Δω g,m2 ∧ Δω r,m1 , see Table 3. Both faults Δω g,m2 and Δω r,m2 are, however, contained in the diagnosis statement computed after the faults have been detected. It seems that sensor faults, for example, Δβ 3,m1 , tend to be easier to detect than actuator faults such as Δτ g and Δβ 2 . One possible explanation is that actuator faults in general cause changes in dynamics, whose effects are attenuated by modeling errors, noise, and so forth.
As can be seen in the last two rows of Table 5, there are no missed or false detections in any of the 100 test runs.

Case Study of Fault Δω r,m1 .
To study in more detail how the FDI system handles faults, we consider the sensor fault Δω r,m1 . The fault corresponds to a fixed value of 1.4 rad/s being measured by sensor ω r,m1 and occurs at time t = 1500 s. According to the FSM in Table 3, the residuals sensitive to fault Δω r,m1 are r 2 and r 13 , obtained as output from the residual generators G 2 and G 13 , respectively. These residuals along with the corresponding K-L test quantities are shown in Figure 4. As can be seen, both the residuals and the test quantities respond distinctively to the fault.
To also illustrate the isolation procedure, Figure 5 shows the results of the diagnostic tests T 2 and T 13 (a), the isolation results associated with faults Δω r,m1 (b) and Δω r,m2 (c), and the signal that indicates when the isolation procedure is done (b, c). As can be seen in Figure 5, the first test that reacts to the fault is T 2 , at t = 1500.23 s. Since T 2 is sensitive to both Δω r,m1 and Δω r,m2 and no other test has alarmed, the diagnosis statement is that either Δω r,m1 or Δω r,m2 may be present, and no fault can be isolated. At t = 1502.55 s, test T 13 alarms. Test T 13 is sensitive to faults Δω g , Δω r,m1 , and Δω r,m2 , and the updated total diagnosis statement, now based on alarms from both T 2 and T 13 , becomes Δω r,m1 , see Table 3. This occurs at time t = 1502.59 s.
Figure 4: Affected residuals r 2 (a) and r 13 (b), and the corresponding K-L test quantities D(P^t_2 ‖ P^NF_2) (c) and D(P^t_13 ‖ P^NF_13) (d) at the time of occurrence of fault Δω r,m1 .

Conclusions
We have proposed an FDI system for the wind turbine benchmark designed by application of a generic automated design method, in which the number of required human decisions and assumptions is minimized. No specific adaptation of the method for the wind turbine benchmark was needed. The method contains in essence three steps: generation of candidate residual generators, residual generator selection, and diagnostic test construction. The second step is done by means of greedy selection, and the third step is based on a novel method utilizing the K-L divergence. The performance of the proposed FDI system has been evaluated using the predefined test sequence for the wind turbine benchmark. The FDI system performs well: all faults in the test sequence were detected within feasible time, and all faults except a double fault could be isolated shortly thereafter. In addition, there are no false or missed detections. A tailor-made, finely tuned FDI system for the benchmark would probably perform better. However, considering the design effort required, and that no specific adaptation or tuning of the method to the benchmark was done, the performance is satisfactory.