User:Philosopher/Howto:Write Optimized Nasal Code

From FlightGear wiki
Jump to navigation Jump to search
WIP.png Work in progress
This article or section will be worked on in the upcoming hours or days.
See history for the latest developments.

Paying attention to the code your writing can help speed it up and make it higher quality. This tutorial will teach programming good-practices and how to write more efficient code.

There are two main types of efficiency to think about: algorithmic and GC (garbage collector) efficiency. Like all programming languages, one can write more and less efficient algorithms to fix the same problem, and in particular Nasal does not have an optimizer, so the coder (you) have to fix these yourself. Since Nasal is a scripting language, it has automatic garbage collecting through a mark/sweep collector which has been shown to be an expensive part of running Nasal in a runtime environment. There are things that can be done about this, in particular writing code that better heeds this limitation. I will start with this issue.

Garbage Collector-conscious optimizations

Except for numbers, every data type in Nasal has to be stored in an allocated C structure (struct naCode, naFunc, naCCode, naHash, naVec, and naStr) which is pointed to when a reference is needed. Each "variable" that one interacts with when writing a program is essentially a reference to one of these if the variable is not a number. To find the ones that are not valid, the mark/sweep garbage collector has to trace all references that are deemed valid and trace those children too; the starting point for the snowballing is essentially anything that is accessible from a function on the call stack or from the global namespace (though of course listeners and such also have to protect their storage). Excepting scalars, every object references other objects within their internal structure; when the object is declared as "reachable" during the mark phase, the GC also has to declare any references in the object as "reachable" as well (since they are obviously crucial to the working of their parent object and need to exist with their parent). For hashes this means all of the key/value pairs; for vectors all of the values; and for function wrappers (naFunc) the code (naCode) and its namespace (naHash) and other closures (naFunc) or the Nasal wrapper for a C function pointer (naCCode) (besides any other member references in any of those latter datatypes, like the filename for a naCode).

There are a couple ways that a new object is created, which potentially requires calling the GC before the object is allocated. These are:

  1. String/vector concatenation with the ~ operator.
  2. Any [...] or {...} expression to create a vector or hash.
  3. Any func {...} expression, since a new wrapper (naFunc) needs to be generated for the underlying code (naCode).
  4. A call to a function written in Nasal will need to create a namespace for the function call (unless it is specified as part of call()). This creates a new object that will need to be GC'ed later
  5. Most extension functions, which often will create & return new objects (those using any naNew*() API), like substr, subvec, and caller, but usually not append (if sufficiently sized, i.e. not needing to resize() - since it modifies an existing object) or systime (because it returns a number).

These constructs that create new objects are definitely fine to have in your code, but the question to ask with each one is "if this garbage-creating expression is evaluated often, can I move this to an outer scope?". In other words: can you make the object once and still have the same functionality instead of making multiple copies? An example:

# BAD: creates multiple object in each iteration of the loop
for (var i=0; i<10; i+=1)
    foreach (var j; [1,2,3,4,5]) # creates new object at the start of this foreach loop
        print(j); # uses it as read-only and thus doesn't require a new object each time
# GOOD: creates one object that is used for every iteration of the loop
# without being part of the loop body, i.e. the list variable in the outer scope
# doesn't need to be GC'ed here
var list=[1,2,3,4,5];
for (var i=0; i<10; i+=1)
    foreach (var j; list) # references the same object each time
        print(j); # uses it as read-only

Now if we were to mutate the vector, by assigning to one of its elements or by changing its size, we would want a different object each time (otherwise the mutations would keep building on top of each other). Note that in cases where we can do this, it not only serves as a GC optimization but also as a algorithmic optimization if we do this with hashes, since we avoid several hash sets to initialize the new hash by substituting a single hash get to find the variable.

  • Use append(vec, a) instead of vec ~= [a].

In general, move object creation out of loops/functions, to a higher level scope, so that they don't add unnecessary garbage to the program - this applies in particular to those that are frequently used in callbacks invoked through timers and/or listeners. Also, instead of using anonymous objects in repeatedly called code, (vectors, hashes, functions) - lift their scope to a higher level, give them a symbolic name and reference them, so instead of this:

var l = settimer( func {
    var x = [1,2,3];
}, 0);


var x = [1,2,3];

var foo = func {
var l = settimer( foo, 0);

It is a good idea to set up your own environment (hash/namespace) for invocation-specific state, so that you can reuse/cache objects, rather than allocating/freeing them every time your callback is invoked.

Algorithmic optimizations

Here's a list of general things to keep in mind (in no particular order):

  • Use foreach/forindex where possible.
  • Don't create temporary variables.
  • Avoid multiple calls to a function if they will return the same thing.
  • If you set a variable and make use of it only once, then directly embed the right-hand side of the assignment.
  • In general, move the assignment of variables to the first place you use them, and if that is the only place you use that variable, then remove assignment of the variable entirely.

If you think some of these rules lead to complete illegibility, then you can either ignore it (at the expense of optimization) or write two versions of the same code, one optimized and one legible.