Nasal Namespaces in-depth: Difference between revisions

The author is here!
m (Link to special directory articles)
(The author is here!)
Line 3: Line 3:
{{Draft}}
{{Draft}}


= Nasal Namespaces: in-depth =
<--
= Nasal Namespaces: In-depth =


I am not an expert in scripting languages, but I do claim that Nasal has a unique notion of namespaces. Rather than dwell on that, I'll tell you how they work in Nasal (and FlightGear) and leave you to decide for yourself.
I am not an expert in scripting languages, but I do claim that Nasal has a unique notion of namespaces. Rather than dwell on that, I'll tell you how they work in Nasal (and FlightGear) and leave you to decide for yourself.-->


== Namespaces in FlightGear ==
== Namespaces in FlightGear ==
If you look at the graph on the right, you will see a representation of how namespaces look in FlightGear. At the top there is the global namespace, called "globals", and various other namespaces branch down from there. On the left there are the namespaces created from the modules in [[$FG_ROOT]] and [[$FG_HOME]], e.g. controls.nas makes the namespace "controls". In the center there are the ‘special’ namespaces, for the joystick(s) and the keyboard (there is only support for one keyboard, hence only one namespace). Everything found in the joystick or keyboard files is executed in the respective namespace – the <nasal><script> section at the top and all of the <binding>s. Next are the GUI dialogs, which have "__dlg:" as a prefix in front of the dialog's name (which I believe comes from the filename).
If you look at the graph up above, you will see a representation of how namespaces look in FlightGear. At the top there is the global namespace, called "globals", and various other namespaces branch down from there. On the left there are the namespaces created from the modules in [[$FG_ROOT]] and [[$FG_HOME]], e.g. controls.nas makes the namespace "controls". In the center there are the ‘special’ namespaces, for the joystick(s) and the keyboard (there is only support for one keyboard, hence only one namespace). Everything found in the joystick or keyboard files is executed in the respective namespace – the <nasal><script> section at the top and all of the <binding>s. Next are the GUI dialogs, which have "__dlg:" as a prefix in front of the dialog's name (which I believe comes from the filename).


As you look at the tree structure, notice how one can move around on it. One can go up to a more global namespace or one can go down to a sub-namespace of the current namespace. Each "leaf" or "branch" only has one parent (e.g. the only thing above "controls" is "globals") while they can contain multiple children (the reverse of the previous comment applies, globals contains not only controls but io, gui, etc., as well).
As you look at the tree structure, notice how one can move around on it. One can go up to a more global namespace or one can go down to a sub-namespace of the current namespace. Each "leaf" or "branch" only has one parent (e.g. the only thing above "controls" is "globals") while they can contain multiple children (the reverse of the previous comment applies, globals contains not only controls but io, gui, etc., as well).


== Hands-on example ==
== Hands-On Example ==
Let's consider a common operation to perform. As I'm sure you all know, to put the gear down from a joystick button (which is in the namespace __js0 for example), you call a script like this:
Let's consider a common operation to perform. As I'm sure you all know, to put the gear down from a joystick button (which is in the namespace __js0 for example), you call a script like this:
<syntaxhighlight lang="php">
controls.gearDown(1);
controls.gearDown(1);
The "controls" names that namespace and the "[dot] gearDown" means that we want to access its member "gearDown" (and then call it with the parentheses). But how does this work? There clearly isn't a line between __js0 and controls that could take us there! But the key here is the other lines – we can first go up to globals from __js0 and then back down to controls, then we search for "gearDown" there. With this in mind, let me tell you what happens when Nasal interprets the above script:
</syntaxhighlight>
The "controls" names that namespace and the "<nowiki>[dot]</nowiki> gearDown" means that we want to access its member "gearDown" (and then call it with the parentheses). But how does this work? In the graph there clearly isn't a line between __js0 and controls that could take us there! But the key here is the other lines – we can first go up to globals from __js0 and then back down to controls, then we search for "gearDown" there. With this in mind, let me tell you what happens when Nasal interprets the above script:


== Namespace Lookup ==
== Namespace Lookup ==
Nasal starts looking for the leftmost side of the name – "controls." First it checks our current namespace, __js0, and since it doesn't find it there (we hope!) it then has to recurse up into the "parent" namespace, which is globals. It finds a controls namespace there and goes into it and looks for "gearDown." Upon finding it, Nasal then executes that script with "1" as an argument, and the gear lowers. Thus it is really two separate lookups – Nasal looks for controls and then gearDown inside that. For the first lookup, it is looking for a variable and can recurse into higher namespaces, for the second it can only go down from the namespace it found. This recursion into other namespaces is kind of like the lookup into hashes' "parents" vector, any "get" operations go look in the hash proper, then into the first "parent" index (though namespaces only have one parent), and any parents of the parents, etc., until something is found. Unlike hashes, however, "set" operations do not always stay in the actual hash, Nasal first tries to find the variable the same way as it does with "get" operations, but then if it isn't found it creates one in the current namespace. With the 'var' keyword, however, it doesn't look anywhere but just creates a new one in the current namespace. Namespaces also resemble hashes in another way – they in fact are hashes! Nasal thinks of namespaces as just another hash, and this challenges a common notion of programmers, that of "lvalues." Typically (I think?), programmers can only declare named variables that are legal lvalues – that is, they have to fit a certain pattern. The pattern is usually something like this: a name can start with an underscore or alpha character and all remaining characters must be an underscore or an alphanumeric character. The lvalue ends at the first character that doesn't fit that pattern, e.g. a punctuation mark. But in Nasal hashes, one can use arbitrary scalars (strings and/or numbers) as indexes:
Nasal starts by looking for the leftmost part of the name – "controls". First it checks inside our current namespace, __js0, and since it doesn't find it there (we hope!) it has to recurse up into the "parent" namespace, which is globals. It finds a controls namespace there, goes into it and looks for "gearDown". Upon finding it, Nasal executes that function with "1" as an argument, and the gear lowers. Thus it is really two separate lookups – Nasal looks for controls and then for gearDown inside that. For the first lookup, it is looking for a variable and can recurse into higher namespaces; for the second it is looking for a member and can only go down from the namespace it found. This recursion into other namespaces is kind of like the lookup into hashes' "parents" vector: any "get" operation looks in the hash proper, then in the first "parent" index and any parents of that parent, checking the second parent if the first fails, and so on until something is found (though namespaces only have one parent). Unlike hashes, however, "set" operations do not always stay in the actual hash: Nasal first tries to find the variable the same way as it does with "get" operations and will set it where it is found; only if it isn't found does it create one in the current namespace. With the 'var' keyword, however, it doesn't look anywhere but just creates a new one in the current namespace.
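The difference between a plain "set" and a 'var' declaration can be sketched as follows. This is only an illustration of the lookup rules described above; the names counter, bump and shadow are made up for the example and do not come from FlightGear.
<syntaxhighlight lang="php">
var counter = 0;              # lives in the enclosing (outer) namespace

var bump = func {
    counter = counter + 1;    # no 'var': found in the outer namespace and updated there
};

var shadow = func {
    var counter = 100;        # 'var': creates a new local, the outer one is untouched
    return counter;
};

bump();
shadow();
print(counter, "\n");         # prints 1
</syntaxhighlight>
Only bump() changes the outer variable; shadow() works on its own local copy, which is why leaving out 'var' by accident can silently overwrite something further up the tree.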
a_hash["(foo)"] = 78;
 
This means that namespaces can contain ‘variables’ that aren't valid lvalues and thus can't be used in typical code. Also notice that the dialog's namespaces are not lvalues as they have a colon and thus can't be used like the controls namespace can. Instead, one must use various methods to access non-lvalues. For more, see the "Doing more with your namespace – Nasal style" section.
Namespaces also resemble hashes in another way – they in fact are hashes! Nasal thinks of namespaces as just another hash, and this can challenge a common notion of programmers, that of "lvalues". Typically, programmers can only declare named variables that are legal lvalues – that is, they have to fit a certain pattern. The pattern is usually something like this: a name can start with an underscore or alpha character and all remaining characters must be an underscore or an alphanumeric character. The lvalue ends at the first character that doesn't fit that pattern, e.g. a punctuation mark or bit of white space. But in Nasal hashes, one can use arbitrary scalars (strings or numbers) as hash indexes:
<syntaxhighlight lang="php">
a_hash["(illegal lvalue)"] = 78;
</syntaxhighlight>
This means that namespaces can contain ‘variables’ that aren't valid lvalues and thus can't be used in typical code. Also notice that the dialog's namespaces are not lvalues, as they have punctuation in them, and thus can't be used like the controls namespace can. Instead, one must use various methods to access non-lvalues, like indexing the globals namespace (though this is ''not'' recommended as they are supposed to be "private"):
<syntaxhighlight lang="php">
globals["__dlg:foo"].setName("foo");
</syntaxhighlight>
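Because a namespace is an ordinary hash, the same indexing works on any hash you create yourself. A small sketch, where ns and its keys are hypothetical names used only for illustration:
<syntaxhighlight lang="php">
var ns = {};                      # a plain hash standing in for a namespace
ns["(illegal lvalue)"] = 78;      # any scalar works as a key
ns.legal_name = 79;               # identifier-style keys also work with dot syntax

# Keys that are not valid identifiers can only be reached by indexing.
foreach (var key; keys(ns)) {
    print(key, " = ", ns[key], "\n");
}
</syntaxhighlight>
The order in which keys() returns the entries is not something to rely on.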


== Namespaces in Functions ==
== Namespaces in Functions ==
Every single namespace in Nasal is just a hash. The most common namespaces are the ones inside the Nasal modules ([[$FG_ROOT]]/Nasal) and the globals namespace; these are probably the easiest to understand. In fact everything needs to run in a namespace, and that includes joystick bindings. But less understood is the fact that every function creates its own anonymous namespace to run in, e.g. gearDown in controls.nas has its own namespace that it runs in. These namespaces are not assigned a name like the controls namespace is. Instead, they simply exist as a part of the function and can only be fetched using closure() level 0. Anyways, this has the obvious implication that any variables created inside the function stay in that function (and get destroyed after the function returns). A function can also modify variables in the outer scope, either by leaving out the 'var' qualifier or by using caller(). Local variables cannot be accessed from the outer scope, as far as I know, though I am not sure what closure returns for such things. This notion of functions creating new namespaces has some interesting implications, see "Advanced namespace hacking: security wrappers".
Every single namespace in Nasal is just a hash. The most common namespaces are the ones inside the Nasal modules ([[$FG_ROOT]]/Nasal) and the globals namespace; these are probably the easiest to understand. In fact everything needs to run in a namespace, and that includes joystick bindings. But less understood is the fact that every function creates its own anonymous namespace to run in for each call, e.g. each call to controls.gearDown gives it a new namespace. These namespaces are not assigned a name like the controls namespace is and in fact are only stored in Nasal's call stack. This has the obvious implication that any variables created inside the function stay in that function (and possibly get destroyed after the function returns). A function can also modify variables in its outer scope, either by leaving out the 'var' qualifier or by using closure().
 
This notion of functions creating new namespaces has some interesting implications, see "Advanced namespace hacking: security wrappers". It also has implications in creating closures for func{} expressions; more on this later as well.
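A minimal sketch of what "a new namespace for each call" means in practice. The names make_counter, count, a and b are hypothetical; the point is only that each call to the outer function gets its own locals, and that the inner func{} expression keeps that particular namespace alive as its closure.
<syntaxhighlight lang="php">
var make_counter = func {
    var count = 0;                # lives in this call's private namespace
    return func {
        count = count + 1;        # no 'var': updates the enclosing call's variable
        return count;
    };
};

var a = make_counter();           # first call, first namespace
var b = make_counter();           # second call, a separate namespace
a(); a();
print(a(), " ", b(), "\n");       # prints "3 1"
</syntaxhighlight>
The two counters never interfere, which is why the locals of a call are only "possibly" destroyed on return: they survive for as long as a function created inside that call still refers to them.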


== Dealing with Different Namespaces at Once ==
== Dealing with Different Objects that have the Same Name ==
Anyways, back to the graph. Each portion of the graph has its own variables, some of which might be functions with their own namespaces, and some are hashes that are used as objects – not namespaces. We learned that the Nasal parser looks first at the current namespace for a symbol, then the next higher namespace, until it reaches the top. But what if we define a variable in this scope and we want to access another variable that has the same name but lives in a different namespace? This is what happens in debug.nas; it defines a string function which obscures the string namespace (in computer science this is called [[en.wikipedia.org/Variable_Shadowing_|shadowing]]). To use the string namespace, there are two options: use caller() to manually "hop" up the namespace tree or prefix "globals[dot]" to the symbol like this:
Back to the graph: each portion of the graph has its own variables, some of which might be functions with their own namespaces, and some are hashes that are used as objects or namespaces. We learned that the Nasal code executor looks first at the current namespace for a symbol, then the next higher namespace, until it reaches the top. But what if we define a variable in this scope and we want to access another variable that has the same name but lives in a different namespace? This is what happens in debug.nas; it defines a string function which obscures the string namespace (in computer science this is called [en.wikipedia.org/Variable_Shadowing shadowing]). To use the string namespace, there are two options: use caller() to manually "hop" up the namespace tree or prefix "globals<nowiki>[dot]</nowiki>" to the symbol like this:
<syntaxhighlight lang="php">
globals.string.isascii(n);
globals.string.isascii(n);
</syntaxhighlight>
The caller method is not recommended because it requires a different level for just being in debug.nas and being inside a function in debug.nas, and it also obfuscates the purpose of the code unless a comment is put in. Arguably neither solution looks good, but the only other option is to avoid using those symbols – which could sacrifice the usability and clarity of the API to access those functions in that namespace. That said, if a namespace declares a globals variable (e.g. like props.nas), then it will have to use caller() to get into the global namespace, as any reference to "globals" will stop in the current namespace not the actual global namespace.
The caller method is not recommended because it requires a different level for just being in debug.nas and being inside a function in debug.nas, and it also obfuscates the purpose of the code unless a comment is put in. Arguably neither solution looks good, but the only other option is to avoid using those symbols – which could sacrifice the usability and clarity of the API to access those functions in that namespace. That said, if a namespace declares a globals variable (e.g. like props.nas), then it will have to use caller() to get into the global namespace, as any reference to "globals" will stop in the current namespace not the actual global namespace.


Line 41: Line 55:
     hashset(_globals, "globals", _globals);
     hashset(_globals, "globals", _globals);
</syntaxhighlight>
</syntaxhighlight>
Since it is recursive, one can prefix globals as many times as they want to without any change, it also leads to a stack overflow when debug.dump() tries to dump the global namespace (there are a couple hacks to fix this, though).
Since it is a symbol in the global namespace (which is actually kept in C++-space), code that needs it ends up recursing to there and then getting it again (but without any more recursions possible). Also since it is recursive, one can prefix globals as many times as they want to without any change; it also leads to a stack overflow when debug.dump() tries to dump the global namespace (there are a couple hacks to fix this, though).


== Namespaces and Security ==
== Namespaces and Security ==
Line 59: Line 73:
     })();
     })();
</syntaxhighlight>
</syntaxhighlight>
It redefines the removelistener() function inside of an anonymous namespace. It first copies the function that was previously defined, whether it was implemented in C or was another wrapper. Notice how this copy cannot be accessed outside of the namespace. It then defines a function (which is implicitly returned) that checks some conditions and either passes the argument on and returns the result of the call to the original removelistener function, or it dies and denies access to the original function. This cleverly removes any direct access to the original function but still allows it to be used through the wrapper – after adding a layer of security.
It redefines the removelistener() function inside of an anonymous namespace. It first copies the function that was previously defined, whether it was implemented in C or was another wrapper. Notice how this copy cannot be accessed outside of the namespace. It then defines a function (which is implicitly returned) that checks some conditions and either passes the argument on and returns the result of the call to the original removelistener function, or it dies and denies access to the original function. This cleverly removes any direct access to the original function but still allows it to be used through the wrapper – after adding a layer of security. Note however that calling <tt>closure(removelistener, 0)</tt> will return our "anonymous" namespace – which can give access to the old removelistener via the _removelistener handle. This can be a security flaw, but redefining closure() can make the environment fully secure – see io.nas for how this is done.
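The same wrapper pattern can be sketched on a harmless stand-in function. Everything here (dangerous, _dangerous, the "allowed" check) is hypothetical and is not the actual code from io.nas; it only illustrates the pattern and the closure() leak, and it assumes closure() itself has not been wrapped to block this lookup.
<syntaxhighlight lang="php">
# Hypothetical stand-in for a sensitive function.
var dangerous = func(x) { return "ran with " ~ x; };

dangerous = (func {
    var _dangerous = dangerous;           # private handle to the original
    return func(x) {
        if (x != "allowed") {
            die("access denied");         # refuse anything we do not like
        }
        return _dangerous(x);             # otherwise forward to the original
    };
})();

# The wrapper's first closure is the anonymous namespace,
# so the private handle can still be dug out of it.
var leaked = closure(dangerous, 0)["_dangerous"];
print(leaked("anything"), "\n");
</syntaxhighlight>
After the reassignment, dangerous("anything") dies, while the leaked handle bypasses the check entirely.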


== Doing more with your namespaces, Nasal style. ==
== Doing More with Your Namespaces, Nasal Style. ==


The Nasal library has a number of interesting functions to deal with namespaces and calling functions. These are:
The Nasal library has a number of interesting functions to deal with namespaces and calling functions. These are:
Line 68: Line 82:
* closure()
* closure()
* bind()
* bind()
These all deal with namespaces in some way, shape, or form. The functions caller() and closure() are the most alike; to illustrate their relationship, one can say the first argument to closure() is the second index (the function) of what is returned by caller() and the return of closure() is the first index of caller() with a level one greater than the previous evaluation. In other words (FIXME):
These all deal with namespaces in some way, shape, or form: closure() gives the namespace (closure) of the function at the specified level – more on this later; bind() can modify those closures (somewhat indirectly); caller() gives info about the caller: locals namespace, function, name of source file, currently executing line; and call() gives enormous control over how a function is executed, from the 'me' variable and the arguments to the namespace it will be executed in. The ‘official’ prototype given for the latter is this:
closure(caller(0)[1], level) == caller(level)[0];
<syntaxhighlight lang="php">
The function call() gives enormous control over how a function is executed, from the 'me' variable and the arguments to the namespace it will be executed in. The ‘official’ prototype given is this:
call(fn, args=[], me=nil, namespace=nil, error=nil)
call(fn, args=[], me=nil, namespace=nil, error=nil)
</syntaxhighlight>
However this is not quite accurate, as in fact the error vector has a movable position. For each of the optional arguments, if they are not the right type (vector for "args" and "error", hash for "me" and "namespace") then it is treated as nil/non-existent. The error vector is in fact the last argument if there are more than two arguments, and thus all of these are valid:
However this is not quite accurate, as in fact the error vector has a movable position. For each of the optional arguments, if they are not the right type (vector for "args" and "error", hash for "me" and "namespace") then it is treated as nil/non-existent. The error vector is in fact the last argument if there are more than two arguments, and thus all of these are valid:
* call(fn, arg);
* call(fn, arg);
Line 80: Line 94:
* call(fn, arg, nil, nil, var err=[]);
* call(fn, arg, nil, nil, var err=[]);
* call(fn, arg, me, nil, var err=[]);
* call(fn, arg, me, nil, var err=[]);
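A short sketch of two of these call() forms, using the full five-argument version for the error vector and the four-argument version for the namespace. The functions risky and show and the hash ns are hypothetical names chosen for the example.
<syntaxhighlight lang="php">
var risky = func(x) {
    if (x < 0) {
        die("negative input");
    }
    return x * 2;
};

# Errors raised by die() end up in the vector instead of aborting the script.
var err = [];
var result = call(risky, [-1], nil, nil, err);
if (size(err)) {
    print("call failed: ", err[0], "\n");   # prints "call failed: negative input"
} else {
    print("result: ", result, "\n");
}

# Passing a hash as the fourth argument makes it the function's local namespace,
# so 'greeting' is resolved from ns when show runs.
var ns = { greeting: "hello" };
var show = func { return greeting; };
print(call(show, [], nil, ns), "\n");
</syntaxhighlight>
The first element of the error vector is the value passed to die(); the remaining elements describe the call stack.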
For the record, I am not sure how namespaces recurse when set by call()/naCall (meaning what namespace they search in next when they don't find the symbol in this namespace). The last function – bind() – allows manual setting of the namespace and parent function that the function runs in (the parent function, called ‘next’, is where the function is originally defined (I think), i.e. it would be the controls namespace for gearDown). Let's see some of these in action:


Let's see some of these in action:
=== Example: Class Constructor ===
Let's say you are making your own class in Nasal and you have a gazillion members your new() method needs to populate. Each member is passed in as an argument, which gets tiring for you (and also limits you to 16 members, as functions can't have more arguments than that). This is what a first attempt might look like:
Let's say you are making your own class in Nasal and you have a gazillion members your new() method needs to populate. Each member is passed in as an argument, which gets tiring for you (and also limits you to 16 members, as functions can't have more arguments than that). This is what a first attempt might look like:
<syntaxhighlight lang="php">
<syntaxhighlight lang="php">
Line 96: Line 112:
};
};
</syntaxhighlight>
</syntaxhighlight>
And it goes on and on. What if you could just do it automatically? Your first attempt might go like this:<syntaxhighlight lang="php">
And it goes on and on. What if you could just do it automatically? Your first attempt might go like this:
<syntaxhighlight lang="php">
var class = {
var class = {
     new: func(a, b, c, d, e) {
     new: func(a, b, c, d, e) {
Line 136: Line 153:
};
};
</syntaxhighlight>
</syntaxhighlight>
The advantage of the second is that you can use it for any new() method without duplicating code; it also has fewer symbols it has to avoid.
The advantage of the second is that you can use it for any new() method without duplicating code; it also has fewer symbols it has to avoid.
 
== Wrap-Up: Namespaces as Closures ==
 
Finally we will consider how namespaces actually recurse. It turns out that recursion is specified in the function, not the namespace. This means that controls doesn't necessarily go to globals always, but it is up to the creator of the function object to decide.
 
In the C code, there is a naFunc object that represents a function. In addition to specifying the executable part (which is a naCode or naCCode) it specifies the namespace and another naFunc to get the next namespace from. These naFuncs end up being chained together to form the namespace hierarchy – each link is referred to as a "closure". The builtin closure() function follows the chain through the links (where 0 specifies the first link, etc.) and returns that closure – a small slice of the variables the function has access to. The builtin bind() does the opposite: it specifies the first link in the chain for that function (its embedded link to the namespace) and possibly another function that specifies the next namespace and the next function, etc. (or nil if it only has one closure). See [[Nasal Meta-programing]] for more on how they can be used together, but the key thing to remember is that namespaces are built on the backs of functions by using hashes as their key datatype.
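The chain described above can be built by hand with bind(). This is a sketch under the assumption that bind(fn, namespace, enclosure) returns a function whose first closure is the given hash and whose next link is taken from the given function; outer, inner, parent and reader are hypothetical names.
<syntaxhighlight lang="php">
var outer = { shared: "from outer" };
var inner = { own: "from inner" };

# A function whose only closure is 'outer' (nil means there is no further link).
var parent = bind(func { return nil; }, outer, nil);

# 'reader' resolves free symbols in 'inner' first, then follows the chain to 'outer'.
var reader = bind(func { return own ~ " / " ~ shared; }, inner, parent);

print(reader(), "\n");

# closure() walks the same chain: level 0 is 'inner', level 1 is 'outer'.
print(closure(reader, 0)["own"], "\n");
print(closure(reader, 1)["shared"], "\n");
</syntaxhighlight>
Neither hash knows anything about the other; the ordering lives entirely in the functions, which is the sense in which recursion is specified in the function and not in the namespace.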