Nasal Meta-Programming

A module full of different hacks has been created as an interest-driven tutorial to explain the workings of advanced Nasal programming, providing insights to help programmers become Nasal hackers.

Introduction

The module is named "gen.nas". It started out with "generators" but now covers a variety of experimental utilities, so the short name is kept mostly for convenience despite its ambiguity. It is loaded through a mixture of two different methods: the import() function from driver.nas and FlightGear's security-free io.load_nasal function.

Andy Ross (the creator of Nasal) made a repository on GitHub (see [1]) that contains helpful Nasal libraries. One of these, driver.nas, provides an import() function that duplicates the global namespace so that modules loaded through it have no write access to the real one, though they can still use extension functions like find(). Because gen.nas requires access to the global namespace, import() could not be used on its own. However, the idea of an EXPORTS vector to control (or at least manage) what can be used outside of the module was appealing, as was the possibility of letting functions make use of it. In the end, the module is loaded using a combination of both methods.

The whole file can be viewed here (updated 05/2013), and each section is copied below for detailed explanations. At the top, some initial comments cover the basic setup:

# gen.nas -- namespace "gen"

# Generators and mostly utilities using namespace hacking &c
# Quickly grew overboard ;)

# Note: the fundamental assertion that _the_globals is *the* globals
# could potentially cause problems depending on the loading method
# (driver.nas's import would not work, but FlightGear's io.load_nasal
# would work; which is funny, given that I am using EXPORTS :D).

The EXPORTS Vector and Public Functions

At the top of the file, a minimal EXPORTS vector is defined:

var EXPORTS = ["_the_globals", "_global_func",
               "public", "namespace", "global",
               "bind_to_caller", "bind_to_namespace",
               "bind_to_namespaces"];

To avoid manually entering every function that needs to be made public, a dynamic solution was implemented via the public() function:

# For each symbol created by the function <fn> or
# for each symbol in <fn> (if it is either a hash
# or vector), add the name of the symbol to the
# caller's EXPORTS vector. Returns a vector of the
# added symbols and adds the symbols to the caller's
# local namespace if possible (i.e. when <fn> is not
# a vector).
#
# The anonymous function argument is so that you can
# use exactly the same syntax, versus having to
# convert it to or write in hash-style syntax (after
# all, Nasal just splits off another codegen to handle
# func{}s...)
var public = func(fn) {
   var c = caller(1)[0];
   var names = []; var hash = {};
   if (typeof(fn) == 'func') {
      call(fn, nil, nil, hash);
      var names = keys(hash);
   } elsif (typeof(fn) == 'hash') {
      var names = keys(hash = fn);
   } elsif (typeof(fn) == 'vector') {
      var names = fn;
   } else die("invalid/unrecognized argument to public()");
   foreach(var sym; keys(hash)) {
      c[sym] = hash[sym];
   }
   if (!contains(c, "EXPORTS"))
      c["EXPORTS"] = [];
   return foreach (var sym; names) {
      append(c["EXPORTS"], sym);
   };
   # In case the behavior changes (they are equivalent):
   #foreach (var sym; names) {
   #   append(c["EXPORTS"], sym);
   #};
   #return names;
};

This function splits whatever variable is received into names and hash, where hash holds both the variable name and value, whereas names only holds the names in a vector. It is condensed code, but understandable to a reader familiar with Nasal. Note the unique return of a foreach loop: this loop (and forindex, which is equivalent) leaves the vector on the stack, meaning it "returns" that value. If this behavior ever changes, the alternative commented code should be used instead.
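The mechanism that public() builds on, running a function with a caller-supplied hash as its local namespace, can be tried in isolation. The following is a minimal sketch that uses only core library functions; the names harvested, alpha and beta are made up for illustration and are not part of gen.nas.

```nasal
# Run an anonymous function with a caller-supplied hash as its local
# namespace (the fourth argument to call()). Every symbol declared with
# `var` inside the function ends up as a key of that hash.
var harvested = {};
call(func {
   var alpha = 1;
   var beta = func { return alpha + 1; };
}, nil, nil, harvested);

# The harvested symbols can now be copied elsewhere, which is what
# public() does with the caller's namespace and EXPORTS vector.
foreach (var name; ["alpha", "beta"]) {
   if (contains(harvested, name))
      print(name, " was declared\n");
}
```

Because the symbols are collected in an ordinary hash, the same anonymous-function syntax can feed any destination: the caller's namespace, the globals, or a sub-namespace.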

As an exercise for the reader: Given a manual return of the names vector (i.e., no return foreach(){} hack), what is a really easy optimization to make instead of a foreach/append() loop?

Global and Namespace Utilities

Next are two functions that work well for simple use cases:

# Basically the same. FIXME: should we use bind() instead?
var global = func(fn) {
   var c = _the_globals;
   var names = []; var hash = {};
   if (typeof(fn) == 'func') {
      call(fn, nil, nil, hash);
      var names = keys(hash);
   } elsif (typeof(fn) == 'hash') {
      var names = keys(hash = fn);
   } else die("invalid/unrecognized argument to global()");
   foreach(var sym; keys(hash)) {
      c[sym] = hash[sym];
   }
   return names;
};

# Runs the function in the namespace, like public().
# Essentially says that the function "describes"
# that namespace (after it runs, of course).
# Usage:
#   gen.namespace("foo", func {
#       ... # code goes here, just write normally 
#   });
# Which roughly translates into C++ as:
#   namespace foo
#   {
#       ... // code goes here
#   }
var namespace = func(namespc, fn) {
   if (typeof(namespc) == 'scalar')
      var namespc = _the_globals[namespc];
   bind(fn, _the_globals);
   call(fn, nil, nil, namespc);
};

These rely on the assumption that the global namespace can be accessed and modified. To understand how this works, a different part of the file demonstrates how the global namespace is captured and assigned to a variable named _the_globals.

Capturing the Global Namespace

var _level = 0;
while (closure(caller(0)[1], _level) != nil) _level += 1;
var _the_globals = closure(caller(0)[1], _level-=1);
var _global_func = bind(func{}, _the_globals);
bind = (func{
	var _bind = bind;
	func(fn, namespace, enclosure=nil) {
		if (fn != _global_func)
			return _bind(fn, namespace, enclosure);
		#protect it from getting rebound by returning an equivalent but duplicate function:
		return _bind(_bind(func{}, _the_globals), namespace, enclosure);
	}
})();

This short section checks all namespaces above the current one. The expression caller(0)[1] returns the currently running function (the one creating this namespace), and using closure() on that returns the namespace above it (level=0), the one above that (level=1), and so on, until it returns nil. At that point, it goes back down one level and caches the assumed "global" namespace. Then, an empty function bound to the global namespace is declared, which becomes highly useful later on for advanced namespace assignment.
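The same walk can be wrapped in a small helper to experiment with outside of gen.nas. This is only a sketch: outermost_namespace and probe are hypothetical names, and whether the result really is FlightGear's globals depends on how the file was loaded, as the comments at the top of gen.nas point out.

```nasal
# Hypothetical helper: return the outermost namespace reachable from fn
# by following its closure chain until closure() returns nil.
var outermost_namespace = func(fn) {
   var level = 0;
   while (closure(fn, level) != nil) level += 1;
   if (level == 0) return nil;   # defensive: no namespace was found
   return closure(fn, level - 1);
};

var probe = func {};             # defined here, so it closes over this scope
var top = outermost_namespace(probe);
print("outermost namespace is a ", typeof(top), "\n");
```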

Advanced Namespace Binding

One property of namespaces in Nasal is that they are fundamentally tied to functions, not to the hashes that make up the variables of the namespace. While it is easy to give a function both an outer namespace and a namespace to run in (using bind and call respectively), it is difficult to provide an outer-outer namespace due to the chain of functions that must be created and the uncertainty of where to find the correct function.

The solution is to manually create a chain of functions, each representing one namespace and linking to the next. Besides the function to bind, bind() takes two arguments: a namespace and an enclosure.

There are three namespace-related parts of a function:

  1. The first namespace is stored with the function and acts as the outer namespace or the first level of recursion for looking up variables. This is the second argument to bind().
  2. The next namespace is not stored with the function and is only set when the function is called. This is the namespace that the function runs in and is the fourth argument to call (or a new hash otherwise). This is where variables set using the var keyword are placed.
  3. The last attribute, which is stored with the function, is a function from which to retrieve both another namespace and another function in the chain (or nil if none exists).
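The first two parts can be observed with a few lines of core Nasal. This is a minimal sketch; outer, locals and greeting are illustrative names, and the example deliberately works with the function returned by bind().

```nasal
var outer = { greeting: "hello" };   # becomes the outer namespace
var locals = {};                     # becomes the namespace the call runs in

# No enclosure is passed to bind(), so the lookup chain ends at `outer`.
var fn = bind(func {
   var msg = greeting ~ ", world";   # `greeting` is resolved via `outer`
   return msg;
}, outer);

var result = call(fn, nil, nil, locals);
print(result, "\n");                 # hello, world

# `msg` was created with `var`, so it lives in `locals`, not in `outer`.
if (contains(locals, "msg") and !contains(outer, "msg"))
   print("msg is local to the call\n");
```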

A new function can be manually created and placed into this chain using the following utilities:

# Lexically bind the function to the caller
var bind_to_caller = func(fn, level=1) {
   if (level < 1) return;
   bind(fn, caller(level)[0], caller(level)[1]);
};
# Bind the function to the namespace and then globals
var bind_to_namespace = func(fn, namespace) {
   if (typeof(namespace) == 'scalar')
      var namespace = _the_globals[namespace];
   bind(fn, namespace, _global_func);
};
# Bind the function to each namespace in turn (the
# first is the top-level one, after globals). Each
# item can be a scalar (name of the sub-namespace)
# or a hash (the namespace itself). If create is
# true, then any names that are not present in a
# namespace are created as a new hash; else this
# returns nil.
var bind_to_namespaces = func(fn, namespaces, create=1) {
	if (typeof(namespaces) == 'scalar')
		var namespaces = split(".", namespaces);
	var namespace = _the_globals;
	var save = pop(namespaces);
	var _fn = _global_func;
	foreach (var i; namespaces) {
		if (typeof(i) == 'scalar') {
			if (!contains(namespace, i))
				if (create)
					namespace[i] = {};
				else return;
			var i = namespace[i];
		}
		var _fn = bind(func{}, var namespace = i, _fn);
	}
	if (typeof(save) == 'scalar') {
		if (!contains(namespace, save))
			if (create)
				namespace[save] = {};
			else return;
		var save = namespace[save];
	}
	bind(fn, save, _fn);
};

The first function replicates what occurs when the Nasal VM evaluates a func{} expression, though in this case it receives a naFunc instead of an naCode. The second function binds the target function to a namespace and then to the assumed globals.

The third function is the most intricate. The namespaces argument can be a list of names, hashes, or a single string to be split at each dot character. It saves the final namespace from the end of the list and processes the remainder. A temporary variable called `_fn` starts as `_global_func` to ensure the function ultimately recurses into the global namespace. It is then reassigned to a new function expression bound to the previous function, building the desired namespace chain.

Note that using bind(_fn, namespace, _fn) would be incorrect, as it would not create a new function but rather bind `_fn` to itself, creating an infinite recursion loop whenever a non-local variable is looked up. Processing must start at the top level and work downward because each function chains "upward" in its namespaces, meaning the upward reference must exist at the time of binding. One namespace is saved for the final step because the last action requires binding the ultimate target function.
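A stripped-down version of this chain building, with three literal hashes standing in for real namespaces, might look as follows. The names top, mid, inner, link_top and link_mid are invented for the sketch, and each step uses the function returned by bind().

```nasal
var top = { a: 1 };
var mid = { b: 2 };
var inner = { c: 3 };

# Build from the top down: each link needs the one above it to exist.
var link_top = bind(func {}, top);             # top of the chain
var link_mid = bind(func {}, mid, link_top);   # mid, then top
var fn = bind(func { return a + b + c; }, inner, link_mid);

# Lookup order inside fn: its own locals, inner, mid, top.
print(fn(), "\n");   # 6
```

Each empty func {} exists only to carry a namespace and a pointer to the next link, which is the role `_fn` plays inside bind_to_namespaces.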

Namespace and Vector Evaluation Utilities

The following private utilities are defined for internal tracking:

var _defined = func(sym) {
   # Check the frame->locals hash/namespace first
   # (since closure(fn, 0) returns the namespace/closure
   # above it, i.e. PTR(frame->func).func->namespace vs
   # PTR(PTR(frame->func).func->next).func->namespace).
   if(contains(caller(1)[0], sym)) return 1;
   var fn = caller(1)[1]; var l = 0;
   while((var frame = closure(fn, l)) != nil) {
      if(contains(frame, sym)) return 1;
      l += 1;
   }
   return 0;
};
var _ldefined = func(sym) {
   return contains(caller(1)[0], sym);
};
var _fix_rest = func(sym) {
    var val = caller(1)[0][sym];
    if (typeof(val) == 'vector' and
        size(val) == 1 and
        typeof(val[0]) == 'vector')
        caller(1)[0][sym] = val[0];
};

These include "defined", "locally defined", and "fixup the rest vector" functions. They are kept private as they are lightweight and often embedded directly. The standard defined function found in FlightGear's `globals.nas` only checks caller entries instead of closure entries, missing the actual inheritance of namespaces. The version above corrects this by using caller(1)[1] to obtain the running function and access the namespace chain via closure(), while also verifying caller(1)[0] to see if the symbol was defined locally via the `var` keyword.

The final function checks if a rest vector (such as `arg...`) consists of a single vector element. If it does, it replaces the wrapper vector with that first element, allowing functions to pass a pre-built vector argument to other functions without needing a manual call() structure.
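The rest-vector fixup is easiest to see when written inline in a function of its own. The sketch below does not use _fix_rest; total is a hypothetical function that performs the same check directly on its own rest argument.

```nasal
# Accept either total(1, 2, 3) or total([1, 2, 3]).
var total = func(values...) {
   # A single vector argument arrives wrapped as [[1, 2, 3]]; unwrap it.
   if (size(values) == 1 and typeof(values[0]) == 'vector')
      values = values[0];
   var sum = 0;
   foreach (var v; values) sum += v;
   return sum;
};

print(total(1, 2, 3), "\n");     # 6
print(total([1, 2, 3]), "\n");   # 6
```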

High-Level Object Hacking

These tools are declared using the public() helper, which automatically adds them to both the namespace and the EXPORTS vector:

   # Create a new hash from the symbols of the caller
   # if they are not listed in the ignore vector.
   var new_hash = func(ignore...) {
      var c = caller(1)[0];
      var m = {};
      foreach (SYM; var sym; keys(c)) {
         foreach (var s; ignore) {
            if (sym == s) continue SYM;
         }
         m[sym] = c[sym];
      }
      return m;
   };

   # Create a new object instance (similar to above,
   # but uses the 'me' symbol for the parents vector
   # and ignores the arg and me symbols)
   var new_obj = func(ignore...) {
      var c = caller(1)[0];
      var m = { parents: [c.me] };
      foreach (SYM; var sym; keys(c)) {
         if (sym == "me" or sym == "arg") continue SYM;
         foreach (var s; ignore) {
            if (sym == s) continue SYM;
         }
         m[sym] = c[sym];
      }
      return m;
   };

   #ifdef globals.props.Node:
   if (contains(_the_globals, "props") and contains(_the_globals.props, "Node")) {
   # Same as new_hash but returns a props.Node object using setValues()
   var new_prop = func(ignore...) {
      var c = caller(1)[0];
      var m = {};
      foreach (SYM; var sym; keys(c)) {
         foreach (var s; ignore) {
            if (sym == s) continue SYM;
         }
         m[sym] = c[sym];
      }
      return props.Node.new(m);
   };
   } #endif

   # The opposite of new_hash, this takes a hash and expands the key/values
   # contained in it into the caller (overwriting any possible duplicates)
   var expand_hash = func(hash, ignore...) {
      var c = caller(1)[0];
      foreach (SYM; var sym; keys(hash)) {
         foreach (var s; ignore) {
            if (sym == s) continue SYM;
         }
         c[sym] = hash[sym];
      }
      return c;
   };

These functions allow working with object members as local variables. This is useful when a constructor function accepts a large number of arguments that share names with object members, avoiding repetitive lines like `m.foo = foo;`. These helper functions compress such constructors down to one-liners. Temporary variables that should not be copied as members can be passed as arguments:

var Warper = {
    # Create a class to warp an input, giving
    # it an initial position of <pos>.
    new : func(pos, power, offset) {
        var tmp = pos+offset;
        var curr = math.pow(tmp, power); #current position
        var m = gen.new_obj("tmp", "pos");
    }
};

Note that `new_obj` derives its parents vector from the `me` variable of the caller. This works well for regular patterns (e.g., calling `Warper.new()` bases the instance on `Warper`), and allows instances to be based on other instances (e.g., `Warper.new().new()`), but it will not work when using bracket syntax (e.g., `Warper["new"]()` causes an error).
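The core trick, reading the caller's local namespace through caller(1)[0], can be tried without loading gen.nas. In this sketch snapshot_locals is a hypothetical stand-in for new_hash (without the labelled continue), and make_point is an invented constructor.

```nasal
# Hypothetical stand-in for gen.new_hash(): copy the caller's local
# symbols into a new hash, skipping the names listed in ignore.
var snapshot_locals = func(ignore...) {
   var c = caller(1)[0];            # local namespace of the calling function
   var m = {};
   foreach (var sym; keys(c)) {
      var skip = 0;
      foreach (var s; ignore) {
         if (sym == s) skip = 1;
      }
      if (!skip) m[sym] = c[sym];
   }
   return m;
};

var make_point = func(x, y) {
   var scratch = x * y;             # temporary, not wanted as a member
   return snapshot_locals("scratch");
};

var p = make_point(3, 4);
print("x=", p.x, " y=", p.y, "\n");
```

The arguments x and y become members of the returned hash because they are ordinary locals of make_point, while scratch is filtered out by name.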

Key-Value Mapping and Namespace Extensions

This function automatically pairs a list of keys with a list of values:

   # Associate respective keys with values stored in the second vector
   # and return the resulting hash. It is recursive, so something like
   # this works as syntactic sugar (the first index specifies the name):
   #    var clamp_template = ["property", ["range", "min", "max"]];
   #    var aileron = ["/controls/flight/aileron", [-1, 1]];
   #    vec2hash(clamp_template, aileron) == {
   #       "property": "/controls/flight/aileron",
   #       range: { "min": -1, "max": 1 } }
   var vec2hash = func(_keys, list) {
      var result = {};
      forindex (var i; _keys) {
         if (typeof(_keys[i]) == 'vector') {
            result[_keys[i][0]] = vec2hash(_keys[i][1:], list[i]);
         } elsif (typeof(_keys[i]) == 'scalar') {
            result[_keys[i]] = list[i];
         }
      };
      return result;
   };

This was modeled after how C extension functions in Nasal are initialized using simple lists where items receive a retrieval name matching their index. It allows nested hashes if the key list contains an embedded vector; the first element of that vector names the sub-hash, and the subsequent elements define the nested keys.

The following function inserts an extension into a namespace:

   # Make an extension in the namespace, inside any objects
   # or sub-namespaces specified in objs, with the name
   # of fname, and where fn is written like it was in the file
   # (i.e. no prefixing of the namespace before every variable).
   # It only defines it if the namespace exists and a variable
   # with the name does not exist or is nil.
   var provide_extension = func(namespc, fname, fn, objs...) {
      # Accept either the name of a namespace or the namespace hash itself
      if (typeof(namespc) == 'scalar')
         var namespc = _the_globals[namespc];
      var _n = namespc;
      foreach (var name; objs) {
         if (_n == nil) return;
         _n = _n[name];
      }
      if (_n == nil) return; #the namespace or object does not exist
      if (_n[fname] != nil) return; #only define it if it does not exist
      if (typeof(fn) == 'scalar') fn = compile(fn);
      _n[fname] = bind(fn, namespc, _global_func);
   };

This creates a function inside a target namespace and its child objects only if the namespace exists and the target identifier is open. The function binds solely to the namespace and globals. To see why, consider `props.nas`, which defines a Node class:

# $FG_ROOT/Nasal/props.nas
var Node = {
    getNode        : func wrap(_getNode       (me._g, arg)),
    #...
};
#...
Node.getValues = func {
    #...
};
#...

The Nasal VM binds both `Node.getNode` and `Node.getValues` directly inside the broader `props` namespace rather than deep inside the class block. For this reason, `provide_extension` does not need `bind_to_namespaces` and simply traces the target objects to perform a flat namespace binding (a design originally intended for adding custom extensions to `props.nas` from an external context). Thus, `bind_to_namespaces` remains specialized for nested namespaces rather than classes or objects inside namespaces.
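This binding behavior can be checked with closure(). The sketch below uses invented names (module_level, Class, method, extra) and assumes nothing beyond core Nasal.

```nasal
var module_level = "visible";
var Class = {
   method: func { return module_level; }
};
Class.extra = func { return module_level; };

# Level 0 of the closure chain is the scope in which the func expression
# was evaluated, not the hash that happens to store the function.
var ns = closure(Class.method, 0);
if (contains(ns, "Class") and !contains(ns, "method"))
   print("method is bound to the enclosing namespace\n");
if (closure(Class.extra, 0) == ns)
   print("extra shares that namespace\n");
```

Both functions can read module_level directly for the same reason: the hash they are stored in plays no part in symbol lookup.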

Additional Features

Other features present in the module that are not covered in this tutorial text include:

  • Mutable functions
  • "Macro" functions
  • Namespace consolidation (currently named `accumulate`): allows paying a one-time processing cost to accelerate subsequent hash lookups (generally not recommended for standard use).
  • Two custom method-overloading configurations designed for specialized use cases.
  • Standard utilities for duplication (`duplicate`), recursive equality (`equals`), and extended containment checks (`econtains`).
  • End-of-file structural classes: `Hash`, `Func`, and `Class`.
