Failure Manager

This article describes content/features that may not yet be available in the latest stable version of FlightGear (2020.3).
You may need to install some extra components, use the latest development (Git) version or even rebuild FlightGear from source, possibly from a custom topic branch using special build settings.

This feature is scheduled for FlightGear 3.2 (90% completed).

If you'd like to learn more about getting your own ideas into FlightGear, check out Implementing new features for FlightGear.

Failure Manager
Started in: 02/2014
Description: Failure Management Framework
Maintainer(s): galvedro, Hooray
Contributor(s): galvedro
Status: First milestone merged to fgdata on 06/2014
Folders: [1], [2]
Topic branches: fgdata [3]

Objective

Design a framework to unify and simplify failure simulation, both from the system modeler and end user perspective.

Status (07/2014)

  • A first stable version of the framework is committed to Git and will be publicly available in 3.2. See the project sidebar for pointers to the code.
  • Necolatis is working on a more capable Canvas-based GUI to replace the old one.
  • Galvedro is revising the architecture to support aircraft-provided wear/damage/failure models.

Current Situation (3.1 and earlier)

All systems and most instruments implemented in the C++ core support basic failure simulation by means of a serviceable property. Generally, when this property is set to false, the system will stop updating itself. Some of these systems may support additional, more elaborate types of failures.
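
As an illustration of this convention, a failure can be forced by hand from Nasal by clearing a system's serviceable property. This is only a sketch: the airspeed indicator path below is an assumption based on the usual instrument layout, so check the property browser for the paths your aircraft actually exposes.

    # Assumed path; verify it in the property browser for your aircraft.
    var asi = "/instrumentation/airspeed-indicator/serviceable";

    setprop(asi, 0);   # mark the instrument as failed: it stops updating
    setprop(asi, 1);   # restore normal operation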

Other than this convention of using a serviceable property, there is no framework on the C++ side with regard to failure simulation. There is, however, a Nasal submodule that can generate random failures by taking user input from the GUI and controlling the relevant properties.

The approach is good, but the main problem is that the supported types of failures are hardcoded both in the Nasal module and the GUI.

Limitations

  • The GUI presents a fixed set of failures that can be simulated, regardless of what systems are actually implemented in the aircraft.
  • Aircraft cannot add their own implemented failures to the set in a clean way.
  • Failures are considered boolean by the framework, i.e. either serviceable or not serviceable. There is no way to express intermediate states of failure.
  • Only random failures based on time or usage cycles are supported.
  • In general, the framework is not extensible.

Proposed improvements

The proposal is to maintain the current scheme of having a Nasal submodule dedicated to failure simulation, but overhaul it to overcome the limitations stated above. To accomplish that, we will raise its status to a full-fledged Failure Manager.

Here are some desirable traits for the new module:

  • To start with, the failure manager should definitely not implement any particular system failure by default, but provide the logic for managing random events or on-demand failure mode activation.
  • It should also provide a subscription interface so aircraft systems can register their own failure modes. After all, only the aircraft "object" is really aware of what systems are being modeled.
  • It should not make any assumptions about how to trigger failure modes (i.e. do not assume a serviceable property). Instead, the Failure Manager should use an opaque interface for setting a "failure level" and leave the details of activation to user scripts.
  • The Failure Manager should also support a flexible set of trigger conditions, so failure modes can be programmed to be fired in different ways, for example at a certain altitude.
  • GUI dialogs should be generated procedurally, based on the set of supported failure modes that has been declared by the aircraft.

Implementation Details

The current prototype includes three components:

  1. A Nasal submodule that implements the core Failure Manager.
  2. A Nasal library of triggers and actuators for programming the Failure Manager.
  3. A compatibility script that programs the Failure Manager to emulate previous behavior. Currently loaded by default on startup.

The design revolves around the following concepts, all of them implemented as Nasal objects.

FailureMode
A failure mode represents one way things can go wrong, for example, a blown tire. A given system may implement more than one failure mode. Each failure mode stores a current failure level, represented by a floating point number in the range [0, 1], so non-boolean failure states can be supported.
FailureActuator
Actuators are attached to FailureModes and encapsulate a specific way to activate the failure simulation. They can be simple wrappers that change a property value, but they could also implement more complex operations. By encapsulating the way failure modes are activated, the Failure Manager does not depend on conventions like the serviceable property, and can be easily adapted to control systems designed in different ways.
Trigger
A Trigger represents a condition that makes a given FailureMode become active. The current prototype supports the following types: altitude, waypoint proximity, timeout, MTBF (mean time between failures) and MCBF (mean cycles between failures). More can be easily implemented by extending the FailureMgr.Trigger Nasal interface.
FailureMgr
The Failure Manager itself. Keeps a list of supported failure modes that can be added or removed dynamically using a Nasal API. It also offers a Nasal interface for attaching triggers to failure modes (one trigger per failure mode). While enabled, the FailureMgr monitors trigger conditions, and fires the relevant failure modes through their actuators when their trigger becomes active. The FailureMgr can be enabled and disabled on command, both from Nasal and the property tree.
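
To illustrate how a FailureActuator can map a failure level in [0, 1] onto a partial failure rather than an on/off switch, here is a minimal sketch. The helper name set_scaled, the property path and the failure mode id are hypothetical. Only FailureMgr.FailureActuator and FailureMgr.add_failure_mode come from the framework as described on this page, and the sketch assumes the manager hands the requested level to the actuator unchanged.

    ##
    # Hypothetical helper: returns an actuator that scales a property
    # between its nominal value (level 0) and zero (level 1).
    var set_scaled = func(path, nominal) {

        # Start from the healthy state so get_failure_level always
        # reads a number. nominal must be non-zero.
        setprop(path, nominal);

        return {
            parents: [FailureMgr.FailureActuator],
            set_failure_level: func(level) setprop(path, nominal * (1 - level)),
            get_failure_level: func { 1 - getprop(path) / nominal }
        }
    }

    # Hypothetical property path and failure mode id.
    var pump_actuator = set_scaled("systems/hydraulic/pump-efficiency", 1.0);
    FailureMgr.add_failure_mode("systems/hydraulic-pump", "Hydraulic pump degradation", pump_actuator);

With this actuator, a failure level of 0.25 would leave the property at 75% of its nominal value, and a level of 1 would drive it to zero.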

Examples

Here is a speed trigger and a value actuator:
— Necolatis (Sat Jun 14). Gear fail at too high speed.
    ##
    # Trigger object that will fire when aircraft air-speed is over
    # min, specified in knots. Probability of failing will
    # be 0% at min speed and 100% at max speed and beyond.
    # When the specified property is 0 there is zero chance of failing.
    var SpeedTrigger = {

        parents: [FailureMgr.Trigger],
        requires_polling: 1,

        new: func(min, max, prop) {
            if(min == nil or max == nil)
                die("SpeedTrigger.new: min and max must be specified");

            if(min >= max)
                die("SpeedTrigger.new: min must be less than max");

            if(min < 0 or max <= 0)
                die("SpeedTrigger.new: min must be positive or zero and max larger than zero");

            if(prop == nil or prop == "")
                die("SpeedTrigger.new: prop must be specified");

            var m = FailureMgr.Trigger.new();
            m.parents = [SpeedTrigger];
            m.params["min-speed-kt"] = min;
            m.params["max-speed-kt"] = max;
            m.params["property"] = prop;
            m._speed_prop = "/velocities/airspeed-kt";
            return m;
        },

        to_str: func {
            sprintf("Increasing probability of fails between %d and %d kt air-speed",
                int(me.params["min-speed-kt"]), int(me.params["max-speed-kt"]))
        },

        update: func {
            if(getprop(me.params["property"]) != 0) {
                var speed = getprop(me._speed_prop);
                var min = me.params["min-speed-kt"];
                var max = me.params["max-speed-kt"];
                var speed_d =  0;
                if(speed > min) {
                    speed_d = speed-min;
                    var delta_factor = 1/(max - min);
                    var factor = speed <= max ? delta_factor*speed_d : 1;
                    if(rand() < factor) {
                        return 1;
                    }
                }
            }
            return 0;
        }
    };


    ##
    # Returns an actuator object that will set a property at
    # a value when triggered.
    var set_value = func(path, value) {

        var default = getprop(path);

        return {
            parents: [FailureMgr.FailureActuator],
            set_failure_level: func(level) setprop(path, level > 0 ? value : default),
            get_failure_level: func { getprop(path) == default ? 0 : 1 }
        }
    }
And a jsbsim example of how to use it on individual gears:
— Necolatis (Sat Jun 14). Gear fail at too high speed.
    # Front gear locking mechanism might fail when deployed at too high speeds

    var prop = "gear/gear[0]/position-norm";
    var trigger_gear0 = SpeedTrigger.new(350, 500, prop);
    var actuator_gear0 = set_value("fdm/jsbsim/gear/unit[0]/z-position", 0.001);
    FailureMgr.add_failure_mode("controls/gear0", "Front gear locking mechanism", actuator_gear0);
    FailureMgr.set_trigger("controls/gear0", trigger_gear0);
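
The firing probability used by SpeedTrigger.update can be worked through by hand. The sketch below restates it as two standalone helpers. The names fire_probability and chance_within are hypothetical and not part of FailureMgr, and the second helper assumes that each update draws an independent random number.

    # Per-update firing probability, as computed in SpeedTrigger.update:
    # 0 at or below min, 1 at or above max, linear in between.
    var fire_probability = func(speed, min, max) {
        if (speed <= min) return 0;
        if (speed >= max) return 1;
        return (speed - min) / (max - min);
    }

    # Chance that the trigger fires at least once in n updates.
    var chance_within = func(p, n) {
        var miss = 1;
        for (var i = 0; i < n; i += 1) {
            miss = miss * (1 - p);
        }
        return 1 - miss;
    }

    var p = fire_probability(425, 350, 500);   # 0.5, halfway between min and max
    print(chance_within(p, 3), "\n");          # 0.875

Because the trigger is polled, even a modest per-update probability adds up over repeated updates. How quickly depends on how often the manager polls, which this page does not specify.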

Roadmap

  1. Replace Nasal/failures.nas with a new module implementing the design proposed above. Wire it to the existing GUI dialogs and ensure backwards compatibility (100% completed).
  2. Help the Canvas team to develop a Nasal GUI API.
  3. Replace the hardcoded dialogs with dynamic ones using whatever comes out from the step above.
  4. Do not load the compatibility layer globally (i.e. by default), but rather load it explicitly from every aircraft (this is going to be seriously tedious work).
  5. Aircraft authors can now start customizing the failure features for their aircraft in a clean way.
  6. Extend the feature set as needs arise (instructor console, additional triggers, ground equipment failure simulation, etc).

Under consideration

  1. Generalize the trigger system and make it available as a global service. Might be useful for missions/adventures, AI agents, etc.
  2. Introduce the concept of Wear Models.

Related

Quotes from the forum

This is being worked on by galvedro, who's hoping to have most of this finished in time for 3.2 - it's basically a framework for registering triggers and allowing different types of failures to be modeled.
— Hooray (Wed Apr 30). Re: Engine wear.

A generalized wear/failure system is a good thing because this goes beyond just engine wear. With a generalized system we should be able to model things like brake, tire, propeller wear just to name a few.
— hvengel (Wed Apr 30). Re: Engine wear.

the other advantage is that this is going to be agnostic to the way it is controlled, i.e. there's a concept of a dedicated "failure manager", so that this can be hooked up to different front-ends, including a telnet/web-based front-end (e.g. instructor console), or even just an integrated Canvas/GUI dialog.
— Hooray (Wed Apr 30). Re: Engine wear.

I am doing some work related to this. Wear, as a concept, will not be supported in the first drop of this development, but I would like to include it eventually as part of the system. What I would like to know from those of you who create aircraft models or have an opinion on the subject is: how would you expect such a feature to work?
— galvedro (Sat May 03). Re: Engine wear.

This is all looking very promising, but you guys should really be aware of galvedro's work, and flug's bombable addon - there certainly is quite some overlapping code in all 3 efforts here, and it would make sense to generalize and unify things so that code can be better reused.
— Hooray (Mon Jun 09). Re: Better nort crash.

Once we start looking at combat hits, we'll almost certainly be comparing flug and dfaber's work on projectile hits so that our method of reporting submodel hits allows compatibility where possible. Tom has already built a method of seeing tracer from AI and MP models which also checks for collisions using submodels, so our next step is to address hit compatibility and how hits are passed over MP.
— Algernon (Mon Jun 09). Re: Better nort crash.

as far as failures go, I am quite keen to find galvedro's code to find out how the failures system is changing - my intention all along has been to keep pace with FG's built in failures and adapt the damage system accordingly to make the best use of it. I'm looking through the repository at the moment but haven't yet found it. The damage system is intended to be a stage between hits and failures - failures may happen anyway, but failures are more likely to result where there is damage; to what extent will be handled between the damage script and the built in failure system. That said, I still think there will be room, a need even, for more detailed modelling of individual aircraft's particular characteristics - as an example, I've been looking at the failure probabilities for an EE Lightning, they will be significantly more prone to engine fires than the Victor!
— Algernon (Mon Jun 09). Re: Better nort crash.

Since you are actually doing failure/damage/wear modeling, I am very interested in hearing your feedback about the new failure manager architecture and functionality.

I would suggest to read the wiki page Hooray posted first, as I tried to document the motivation for the change and the design principles there. The public interface for programming the failure manager from Nasal is at Nasal/FailureMgr/public.nas. It should be reasonably documented, but please let me know if you find something confusing or unclear.

On a side note, I don't recommend using the property tree interface directly for new developments, as it is currently half way between what it was and what I want it to be, so it is a bit dirty right now and it will change a bit in the future.
— galvedro (Thu Jun 12). Re: Better nort crash.

Algernon: Great to see that you're actually interested in collaborating here and using existing code - it is very frustrating to see other efforts whose contributors don't realize how heavily their work is related, and how much it would make sense to team up with others to collaborate in a more framework-centric fashion, rather some aircraft-specific feature. We've recently seen several efforts with little to zero communication and collaboration, where contributors could have save months of work had they spoken up earlier and had they shown willingness to collaborate.

The added advantage here is that galvedro's code is a good foundation to work with, i.e. his code is exceptionally clean and he's obviously very familiar with coding, so a joint effort can be a mutually beneficial experience for all parties involved, and you'll save a ton of work and time along the way, while also ensuring that your work is generic, i.e. can be easily reused by other aircraft/developers.
— Hooray (Thu Jun 12). Re: Better nort crash.

Regarding damage modeling WRT combat/bombable, I'd like to check flug's code at some point to see if/how certain parts of it could be generalized there - even just moving useful routines to a dedicated module in $FG_ROOT/Nasal or $FG_ROOT/Aircraft would be a good thing in my opinion. Flug has written some very clever Nasal code as part of the bombable addon, and we should really try to understand how to generalize and integrate the most useful parts so that people working on similar features can reuse his work.

EDIT: bombable.nas: https://github.com/bhugh/Bombable/blob/ ... mbable.nas
— Hooray (Thu Jun 12). Re: Better nort crash.

We're definitely keen on using existing code where possible, I will admit that I need to look outside my own development sphere more as it's too tempting just to code something for hours, for fun, which is probably already extant somewhere! I believe galvedro has mentioned somewhere in a post he's interested in overhauling the Electrical.nas script - that is somewhere I'd be very interested to collaborate - battery drain and charge, AC and DC circuits, reasonably realistic load characteristics... that's something I'm excited about! I'm also always keen to get a firmer grip on Nasal, mine is still extremely basic and fairly inelegant.
— Algernon (Thu Jun 12). Re: Better nort crash.

I have tested most of it.

New failures:
McbfTrigger, TimeoutTrigger, MtbfTrigger, AltitudeTrigger, set_unserviceable, set_readonly all works.

Old failures:
Engine, gear, flaps, electrical, aileron, rudder, elevator, asi, altimeter, attitude still works.

I have yet to write a custom trigger or actuator, but I would say this system seems good designed, and very powerful.
— Necolatis (Thu Jun 12). Re: How does serviceable and failures work?.

We will talk about this a bit later on when the FGUK guys have played with the module as well. But one thing that aircraft developers are demanding in one way or another, is to go one step further and do a more complex system damage/wearing, where failures influence each other, i.e. a structural failure here produces a system failure there and so on.

This kind of modeling is likely to be fairly aircraft specific, but we will see. I have given some thinking to a next evolutionary step in this direction, and it is quite tricky to organize in a clean way actually.
— galvedro (Fri Jun 13). Re: How does serviceable and failures work?.

A fired triggerd remains fired until removed, or rearmed by using trigger.reset()

Custom triggers will not play well with the gui at the moment. For this first release I just squeezed the new engine underneath the existing gui, and the compatibility layer responds to it by emulating the former behavior (if it doesn't, it is a bug). That means that only Mtbf and Mcbf triggers are currently supported through the gui.
— galvedro (Fri Jun 13). Re: How does serviceable and failures work?.