Nasal Unit Testing Framework

From FlightGear wiki
Started in: 02/2014 (stalled)
Description: Unit Testing support for Nasal
Maintainer(s): F-JYL, dbelcham, Hooray, Philosopher
Contributor(s): dbelcham, F-JYL (since 02/2014)
Status: RFC


Status 06/2013

We do not currently have any established unit testing framework for Nasal. For the time being, this whole discussion is just about coming up with the requirements and a possible design. The intent is to explore the idea of running isolated unit tests against Nasal scripts.

See also FlightGear commit d7a680.

Concept

The idea is to be able to run something like this:

standalone-nasal.exe ATR72-FMC-tests.nas

and see output like this:

When_calculating_deflection_that_is_greater_than_the_provided_maximum
    the_provided_maximum_should_be_returned (failed: expected 180 was 99)

When_calculating_deflection_from_33_degrees_to_90_degrees
    the_result_should_be_67_degrees (failed: expected 67 was 66)


The standalone Nasal interpreter

Based on what I read about the current stand-alone interpreter, this should be fairly easy to do. It should now be possible: the nasal-standalone branch builds successfully for Windows (make sure to have Boost available for building cppbind).

The Nasal standalone interpreter is available as part of SimGear at gitorious/fg/hoorays-simgear/topics/nasal-standalone. Just check out the branch named "nasal-bin".

To build it:

cd $SG_SRC
mkdir BUILD
cd BUILD
cmake ../ -DENABLE_TESTS=ON -DSIMGEAR_HEADLESS=ON -DENABLE_SOUND=OFF -DENABLE_LIBSVN=OFF
make

In short:

  • create and use a separate build folder, outside the source tree
  • configure the build via -DENABLE_TESTS=ON -DSIMGEAR_HEADLESS=ON -DENABLE_SOUND=OFF -DENABLE_LIBSVN=OFF
  • build
  • report back with the results

Note that there's no need to actually install anything (make install), because we are just using the SimGear library to build a standalone nasal-bin binary, nothing else.

Let us know if there are still any Windows-specific build errors so that we can fix the config file. The build should give you a "nasal-bin.exe" in $SG_BUILD/simgear/nasal/ that runs plain Nasal scripts only, with no FG APIs whatsoever. You need to pass a valid Nasal script when running it.

You should be able to "make test" to run a bunch of standard Nasal tests from the original Nasal repository, which are to be found in $SG_SRC/nasal/tests: gitorious/fg/hoorays-simgear/topics/nasal-standalone/simgear/nasal/tests

Meanwhile, the build actually works for Windows using the MinGW compiler and produces a nasal-bin.exe, which I cannot test currently because I don't have a Windows VM available: http://www.speedyshare.com/MbBTA/download/nasal-bin.exe [dead link]

To run it, open a shell (START/EXECUTE, then cmd or command), go to the folder containing the binary, add a simple Nasal script and run "nasal-bin script.nas". For example, use one of the scripts in: gitorious/fg/hoorays-simgear/topics/nasal-standalone/simgear/nasal/tests

Roadmap

  • get the stand-alone interpreter compiling and running on both Windows and Linux (currently there are a bunch of libraries used that aren't available on Windows) (Done)
  • build a set of Nasal scripts that provide stubs for native FG calls (getprop/setprop/etc) (Not done)
  • build out a testing script with the ability to verify values and report failures to the console (Not done)
  • send a patch upstream, so that a standalone Nasal interpreter is included in each upcoming release (Not done)

Once we have the first three items, we would be able to write and execute tests independently of FG and still have them be meaningful. The key thing to remember is that this would only be for isolated unit tests. For integration tests (verifying that different systems, whether two different scripts, methods or applications, work together correctly) we would need to think about a different approach.
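As a rough illustration of the third roadmap item, here is a minimal sketch of a test runner that verifies values and reports failures to the console. The names test, assert_equal and run_all are hypothetical and not part of any existing FlightGear or Nasal library. The sketch only assumes the core call() function, whose trailing vector argument collects errors raised via die().

```nasal
# Registry of test cases; each entry is a hash with a name and a function.
var tests = [];

# Hypothetical helper: register a named test case.
var test = func(name, fn) {
    append(tests, { name: name, fn: fn });
}

# Hypothetical helper: abort the current test if the values differ.
var assert_equal = func(expected, actual) {
    if (expected != actual)
        die("expected " ~ expected ~ " was " ~ actual);
}

# Run every registered test, catching die() so one failure does not stop the run.
var run_all = func {
    var failed = 0;
    foreach (var t; tests) {
        var err = [];
        call(t.fn, [], nil, nil, err);   # errors are appended to err instead of propagating
        if (size(err)) {
            failed += 1;
            print(t.name, " (failed: ", err[0], ")\n");
        } else {
            print(t.name, " (passed)\n");
        }
    }
    print(size(tests) - failed, " passed, ", failed, " failed\n");
    return failed;
}

# Two trivial example tests, the second one failing on purpose:
test("addition_should_work", func { assert_equal(4, 2 + 2); });
test("the_result_should_be_67_degrees", func { assert_equal(67, 66); });
run_all();
```

Returning the failure count from run_all() would let a wrapper script decide whether the whole run succeeded, which is what an automated build would need.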

It should be doable to teach nasal-bin.exe to check $FG_ROOT and, if it is available, use it to load a semi-plausible FG environment (API-wise). Using some meta-programming tricks, most of the default APIs could probably be wrapped without too much manual work. Philosopher could be truly instrumental here, because he has a deep understanding of some of the more esoteric tricks that can be done in Nasal space: advanced uses of compile(), bind(), call(), closure() and caller(), which make meta-programming a fantastic experience. Familiarity with this handful of APIs can save a lot of time: http://plausible.org/nasal/
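To give an idea of the kind of wrapping meant here, the following sketch wraps a function so that every call is recorded before being forwarded, using only a closure and call(). The helper record_calls and the stand-in getprop are hypothetical and for illustration only. In FlightGear the real getprop would be wrapped instead.

```nasal
# Log of all calls made through wrapped functions.
var calls = [];

# Hypothetical helper: return a wrapper around fn that records its arguments.
var record_calls = func(name, fn) {
    return func(args...) {
        append(calls, { name: name, args: args });
        return call(fn, args);   # forward the original arguments unchanged
    };
}

# Stand-in for a native API (assumption: always returns a fixed value).
var getprop = func(path) { return 33.5; }

# Replace the symbol with its recording wrapper.
getprop = record_calls("getprop", getprop);

var yaw = getprop("/orientation/yaw-deg");
print("getprop returned ", yaw, "\n");
print(size(calls), " call(s) recorded, first path: ", calls[0].args[0], "\n");
```

The same pattern could be applied in a loop over a list of API names, which is why wrapping most of the default APIs should not require much manual work.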

There's quite a lot possible in Nasal that nobody has ever used in FG. Philosopher has started writing a bunch of tutorials, for example see: Nasal Meta-Programming

And you can take a look at some of the scripts in the standalone branch, which support fancy constructs like dependency resolution using import("foo"); but also completely sandboxed/wrapped environments: gitorious/fg/hoorays-simgear/topics/nasal-standalone/simgear/nasal/tests

These are regression tests developed by the Nasal developer himself. They are not true unit tests for scripts, only regression tests for the interpreter itself.

Contributing

If possible, new code should be contributed to the maintainers, ideally even as a branch of $FG_ROOT, because that will make it easier to directly integrate such a unit testing system with all the code we have in $FG_ROOT/Nasal.

Problem

One of the things that is most frustrating and time consuming when working with Nasal scripts is the brute-force, manual nature of testing them. A simple misspelling in a custom script can take 5-10 minutes or more to fix from the point of finding it (shut down FG, change the script, start up FG and get back to a point in the sim where the code will execute). I realize that Nasal is tightly coupled to FG at this point and that most scripts won't run without access to the property tree. That doesn't mean that it can't be done though (mocks, fakes, stubs, etc).

Most scripts in FlightGear use a plethora of APIs and FG-specific modules, so FG has become a runtime dependency (APIs, data structures like the property tree, and "live" state).

Test/Fail Passes

Just because a language is dynamic doesn't mean that the code-test-fail/pass feedback loop has to be a long one. Just look at dynamic language communities like Ruby and you'll see that automated testing (both behavioral and state), in combination with continuous integration, is used to try to move those failures from application run-time to test suite run-time. Dynamic language communities do this with a lot of success (both in tightening the feedback loop and improving quality).

Wrapping dependencies

FG's loading of scripts is stopping us from being able to change and reload right in the app. What I was thinking is to remove the app (FG in this case) from the equation completely. For a lot of systems scripts the only interaction that they have with FG is through the property tree. Between these interactions the Nasal systems scripts being developed for most aircraft are simply state based; they read some values, do some calculations and set some values. If we know the inputs (method parameters and getprop calls), we know the outputs (method return values and setprop calls).

What I do in other languages is to replace the calls to those "external dependencies" (in this case getprop and setprop) with a known implementation. So a call to getprop("/orientation/yaw-deg") in a specific test scenario would be configured to always return "33.5". My understanding of the inner workings of Nasal is limited, but I would think that one should be able to override get/setprop due to the dynamic nature of the language. That said, I can't find any definitions for those in the nasal-standalone codebase.

If someone could point me to where the get/setprop stuff is then I'd be a step closer to exploring my theory of having standalone Nasal running against developer defined property tree values as a mechanism for automated testing.

Wrapping APIs is simple to do in Nasal, too - without even requiring C/C++ changes, a standalone testbed could be scripted in Nasal like this:

var tree = {};

var isalpha = func(n) (n >= `a` and n <= `z`) or (n >= `A` and n <= `Z`);
var isdigit = func(n) n >= `0` and n <= `9`;

# normalize a property path so that equivalent spellings map to the same key,
# e.g. "/foo/bar" and "/foo[0]/bar[0]/" both become "/foo[0]/bar[0]"
var sanitize = func(p) {
    if (!p) die("sanitize(): empty property path");
    var clean = [];
    foreach (var part; split("/", p)) {
        if (part == "") continue;   # skip leading, trailing and doubled slashes
        for (var j=0; j<size(part); j+=1) {
            var c = part[j];
            if (c == `[`) break;
            if (!isalpha(c) and !isdigit(c) and c != `-` and c != `_` and c != `.`)
                die("bad character in name "~part~" at index "~j~".");
        }
        if (j == size(part))
            part ~= "[0]";          # no index given: default to [0]
        elsif (part[-1] != `]`)
            die("bad index specifier in string "~part~".");
        append(clean, part);
    }
    # rejoin the cleaned parts, keeping the separators
    var path = "";
    foreach (var part; clean)
        path ~= "/" ~ part;
    return path;
}
 
# wrappers for the FG setprop/getprop APIs, backed by the tree hash:
var setprop = func(p, value) tree[ sanitize(p) ] = value;
var getprop = func(p) tree[ sanitize(p) ];   # nil if the path was never set
 
# some tests: all of these spellings must refer to the same property

var path = ["/foo/bar", "/foo[0]/bar[0]", "/foo[0]/bar[0]/", "/foo/bar[0]", "/foo[0]/bar/"];
var value = "MyUniqueValue";
setprop(path[0], value);
foreach(var p; path)
  if (getprop(p) != value) die("sanitize() implementation is broken");
print("sanitize() looks good!\n");

# init your tree:
setprop("/orientation/yaw-deg", 33.5);
print("yaw-deg is: ", getprop("/orientation/yaw-deg"), "\n");

Basically, you can override ANYTHING in Nasal, even library/extension functions, as shown above. You don't even need to look at the Nasal C code.

We would need to use custom script-specific wrappers instead of the main FG/Nasal APIs and modules, so that your Nasal code *never* uses the APIs directly. That way you can easily have different implementations, for example:

var debug_profile = {};
var runtime_profile = {};
var current_profile = nil;

# test double for systime(): returns a fixed value so test runs are reproducible
var my_systime = func { return 0; }

runtime_profile.systime = systime;   # the real FG API
debug_profile.systime = my_systime;

# set the API profile:
current_profile = runtime_profile;

# and then only ever make calls through current_profile.systime():
print( current_profile.systime(), "\n" );

# or simply override the global symbol during initialization:
var systime = current_profile.systime;
print( systime(), "\n" );

Test Suites

I'd rather not have to launch FG to run my tests. Ideally I'd like to be able to build up a suite of tests that I can run in an automated fashion to ensure that all are still operating as expected at any time. In my close to ideal world I would want to be able to execute the automated tests (and exercise the scripts as a result) dozens of times per hour. Launching FG and manually triggering this from the console would limit me to a few times per hour. Add in more complicated scripts which require extensive state in the property tree to test specific scenarios and I might be lucky to manually execute these tests a couple of times an hour. Manually launching the tests would also probably mean that I'd have to remember to configure and run each and every scenario, something I would never remember to do. Each time I forgot I'd possibly be introducing issues into my code.

The rigour around this isn't for everyone, but it is how I derive the most confidence that I'm delivering the highest quality code possible. This all came to light because of a defect in the ATR that Omega95 and I have built (mostly him) that could have easily been found if we had had some automated scenario tests written.

I'm working with the ATR's FMC. The difficulty with it is that if I want to change and retest any of the different segments in the flight plan (especially the SID and STAR) I need to be able to reset myself to a specific position and property tree state. If I'm testing the transition from waypoint 1->2 in the SID then I need to start before waypoint 1, which means on the ground, starting up, keying in the flight plan, etc. Worse is the transition from the flight plan to the first STAR waypoint. If I could write automated tests for all of these scenarios plus dozens of others I can think of then my development-testing feedback loop would tighten immensely.

I had the idea, which you implemented above, of just overriding get/setprop in the scripts. I guess taking that Sudafed might pay off in more ways than one. Ultimately what I'd like to have is an implementation of something like the jUnit/xUnit/nUnit testing frameworks. Tonight's goal will be to hack out a rough implementation that allows for isolation of the property tree.
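A rough sketch of what such property tree isolation could look like: each scenario starts from a fresh fake tree that is pre-loaded with known values, so a test never depends on state left behind by a previous one. The helper with_tree, the function update_heading_error and the /fmc/... property paths are all hypothetical and only serve as an illustration. The fake getprop/setprop here skip path normalization for brevity.

```nasal
var tree = {};   # fake property tree, keyed by path

# Simplified stand-ins for the FG APIs (no path normalization).
var setprop = func(path, value) { tree[path] = value; }
var getprop = func(path) { return tree[path]; }

# Hypothetical fixture helper: reset the tree, load the initial state, run the body.
var with_tree = func(initial, body) {
    tree = {};
    foreach (var path; keys(initial))
        setprop(path, initial[path]);
    body();
}

# Hypothetical code under test: reads two properties and writes a third.
var update_heading_error = func {
    var target = getprop("/fmc/target-heading-deg");
    var current = getprop("/orientation/heading-deg");
    setprop("/fmc/heading-error-deg", target - current);
}

# One scenario: known inputs in, expected output checked afterwards.
with_tree({ "/orientation/heading-deg": 33, "/fmc/target-heading-deg": 90 }, func {
    update_heading_error();
    if (getprop("/fmc/heading-error-deg") != 57)
        die("unexpected heading error");
    print("scenario passed\n");
});
```

A scenario such as the transition between two SID waypoints would then be a matter of describing the required tree state in the initial hash, instead of flying the aircraft to that point.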

Integration & Adoption

Unit testing support in Nasal would certainly be beneficial - but it would need to be added to FG at a library-level, i.e. in $FG_ROOT/Nasal, so that people have to use it, and have an advantage when using it - sort of like the RoR example you mentioned previously.

I could see that being useful for many things, even outside aircraft development - but thinking in the most generic terms, we need to find a compromise that will work not just for specialists who have a decade of unit testing experience, but also for our average aircraft developers.

Scripting-wise, I think we really only have a handful of people here who regularly write Nasal code and who would also see the merits and potentially adopt the system.

People would only be likely to actually use it if they have a corresponding developer's background, so it would need to be designed right into the framework and touch lots of places in $FG_ROOT and $FG_AIRCRAFT. I only see a handful of aircraft developers here who would go that route and have the developer mentality to see the merits.

Probably only a handful of people would be able to use it at first, but if it's well documented, and if it actually supports features not provided otherwise, it could gain traction. It would obviously need to be more compelling than the current workflow.


A unit testing framework is definitely going to be useful for $FG_ROOT as a whole, not just aircraft/instrument developers. Obviously, one of the first steps will be documenting the whole thing with tutorials, so that people can start adopting it.