Improving Nasal

From FlightGear wiki
Revision as of 15:51, 5 February 2012 by Hooray (Talk | contribs)

Last update: 10/2011

As more and more code in FlightGear is moved to the base package and thus implemented in Nasal space, some Nasal-related issues have become increasingly obvious.

On the other hand, Nasal has a proven track record of success in FlightGear, and has shown remarkably few significant issues so far. Most of the more prominent issues are related to a wider adoption in FlightGear, and thus more complex features being implemented in Nasal overall.

So, rather than having Nasal flame wars and talking about "alternatives" like Perl, Python, JavaScript or Lua, the idea is to document known Nasal issues so that they can hopefully be addressed eventually.

If you are aware of any major Nasal issues that are not yet covered here, please feel free to add them. However, it is also a good idea to use the FlightGear bug tracker in such cases:

Get rid of the global interpreter context

Source: Andy Ross, Nasal author

Year: 2007-2011


Problem: New Nasal objects are added to a temporary bin when they are created, because further allocation might cause a garbage collection to happen before the code that created the object can store a reference to it where the garbage collector can find it. For performance and simplicity, this list is stored per-context. When the context next executes code, it clears this list.

Here's the problem: we do a lot of our naNewXX() calls in FlightGear using the old "global context" object that is created at startup. But this context is no longer used to execute scripts* at runtime, so as Csaba discovered, its temporaries are never flushed. That essentially causes a resource leak: those allocations (mostly listener nodes) will never be freed. And all the extra "reachable" Nasal data floating around causes the garbage collector to take longer and longer to do a full collection as time goes on, causing "stutter". Scripts that use listeners extensively (the cmdarg() they use was one of the affected allocations) will see the problem more seriously.

(* That's a feature, not a bug. Once listeners were added, scripts could be recursive: script A sets property X, which causes listener L to fire and script B to run. We need two or more contexts on the stack to handle that; a single global one won't work.)
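The recursion described in this note can be illustrated with a small toy model. This is only a sketch in Python, not FlightGear or Nasal code: the `Interpreter` class, its `call` and `setprop` methods, and the property path `/x` are hypothetical names invented for the illustration. It shows that while script A is still running, the listener runs script B, so two contexts are live at the same time.

```python
class Interpreter:
    """Toy model: one fresh context per call, kept on a stack while it runs."""

    def __init__(self):
        self.context_stack = []   # names of the scripts currently executing
        self.max_depth = 0        # deepest nesting seen so far
        self.listeners = {}       # property path -> list of (name, callback)
        self.props = {}

    def call(self, name, script):
        # Each call gets its own stack entry (hypothetical stand-in for a
        # dynamically created context).
        self.context_stack.append(name)
        self.max_depth = max(self.max_depth, len(self.context_stack))
        try:
            return script(self)
        finally:
            self.context_stack.pop()

    def setprop(self, path, value):
        self.props[path] = value
        # Listeners fire while the script that set the property is still running.
        for name, callback in self.listeners.get(path, []):
            self.call(name, callback)


interp = Interpreter()
trace = []
# Script B: records which scripts are on the stack when it runs.
interp.listeners["/x"] = [("B", lambda it: trace.append(list(it.context_stack)))]
# Script A: sets property X, which triggers the listener.
interp.call("A", lambda it: it.setprop("/x", 1))
print(trace, interp.max_depth)
```

In this model, B observes both A and itself on the stack, which is why a single shared context cannot serve both calls.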

I didn't like the fix though. Exposing the temporary bin as part of the Nasal public API is ugly; it's an internal design feature, not something users should tune. Instead, I just hacked at the FlightGear code to reinitialize this context every frame, thus cleaning it up. A "proper" fix would be to remove the global context entirely, but that would touch a bunch of code.

Also see: (in FGNasalSys::update)

   // The global context is a legacy thing.  We use dynamically
   // created contexts for naCall() now, so that we can call them
   // recursively.  But there are still spots that want to use it for
   // naNew*() calls, which end up leaking memory because the context
   // only clears out its temporary vector when it's *used*.  So just
   // junk it and fetch a new/reinitialized one every frame.  This is
   // clumsy: the right solution would use the dynamic context in all
   // cases and eliminate _context entirely.  But that's more work,
   // and this works fine (yes, they say "New" and "Free", but
   // they're very fast, just trust me). -Andy
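The leak and the per-frame workaround can be modelled in a few lines. The following is a toy Python sketch under stated assumptions, not the actual C implementation: the `Context` class, `new_object`, `execute` and `simulate` are hypothetical names, and the bin is reduced to a plain list. It only demonstrates the mechanism described above: a context that allocates but never executes code keeps every temporary alive, while replacing the context each frame keeps the bin small.

```python
class Context:
    """Toy stand-in for a context with a per-context temporary bin."""

    def __init__(self):
        self.temporaries = []   # objects kept alive until the context next runs

    def new_object(self, payload):
        # Park the fresh object in the bin so that a collection cannot reclaim
        # it before the caller has stored a reference somewhere reachable.
        obj = {"payload": payload}
        self.temporaries.append(obj)
        return obj

    def execute(self, script):
        # In this model, running code is the only thing that empties the bin.
        self.temporaries.clear()
        return script(self)


def simulate(frames, reinit_every_frame):
    """Allocate one listener node per frame on a context that never executes."""
    global_ctx = Context()
    peak = 0
    for frame in range(frames):
        if reinit_every_frame:
            # The workaround: junk the context and fetch a fresh one each frame.
            global_ctx = Context()
        global_ctx.new_object(("listener-node", frame))
        peak = max(peak, len(global_ctx.temporaries))
    return peak


leaky_peak = simulate(1000, reinit_every_frame=False)
fixed_peak = simulate(1000, reinit_every_frame=True)
print(leaky_peak, fixed_peak)
```

Without the per-frame reset the bin grows by one entry per frame; with it, the bin never holds more than the current frame's allocations. The "proper" fix mentioned in the comment would correspond to never allocating on `global_ctx` at all.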

Improve the garbage collector

Year: 2011


Problem: Nasal has a garbage collection problem. One solution to it is - we avoid Nasal code wherever possible and try to hard-code everything. But Nasal crops up in a lot of places - complex aircraft such as the Concorde come to my mind, interactive AI models, lots of really nifty and useful applications... - so instead of fixing things in a lot of places, one could also think about it the other way and fix just one thing, i.e. the garbage collection, such that it doesn't hit a single frame. I realize full well that dragging out complicated operations across many frames while everything else keeps changing is at least an order of magnitude more complicated (about 1/3 of Local Weather deals with precisely that problem...) - but I don't believe it can't be done at all. It sort of bugs me a bit that somehow the fault is always supposed to be in using Nasal...

I think it's great if we have a discussion where the issues are placed on the table to give everyone the chance to learn and understand more, and then reasonably decide what to do. Nasal has advantages and disadvantages, and so has C++; sometimes accessibility and safety are worth a factor of 3 in performance (to me at least), sometimes not. But I don't really want to discuss dogmatics where 'truth' is a priori clear. There is a case for having high-level routines in Nasal, there's a case to be made to switch low-level workhorses to C++ - and there's always the question of what is the most efficient way of doing something. But I'm clearly not considering Nasal-based systems immature or experimental per se.


As discussed in "Stuttering at 1 Hz rate" we now know that regular and unpleasant stuttering is caused by Nasal's garbage collector. So I thought about possibilities to improve it. What if we could decouple the following function as a separate thread, so that it runs *asynchronously* from the main thread? This way it would not interfere (or would interfere much less) with the main thread and our fps would be more consistent.

This is the function causing the jitter: static void garbageCollect() in "simgear/nasal/gc.c".

The thread will need to share some of the global variables from the main thread.


I'm not an expert in Nasal garbage collection, but I think the problem is that garbage collection is not something we can divide up into chunks (which is essentially what threading would do). In addition, threading adds a lot of potential order-dependent bugs.

In the case of Nasal, I believe the garbage collection pass must be done in a single atomic step, otherwise it would leave the heap in an inconsistent state and adversely affect the scripts.

I don't know much about our Nasal implementation, but I suspect that the garbage collector could be changed to trace only a portion of Nasal's heap at each invocation, at the risk of increased memory use.
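To make the idea of tracing only a portion of the heap per invocation concrete, here is a minimal Python sketch. It is not based on Nasal's gc.c: the `IncrementalMarker` class, the `budget` parameter and the dictionary-based heap are assumptions made for the illustration. It also assumes that scripts do not modify the heap between steps, which is exactly the consistency concern raised above.

```python
class IncrementalMarker:
    """Marks reachable objects a few at a time instead of in one pass."""

    def __init__(self, heap, roots):
        self.heap = heap            # object name -> list of referenced names
        self.marked = set()         # objects already traced
        self.pending = list(roots)  # objects discovered but not yet traced

    def step(self, budget):
        """Trace at most `budget` objects; return True once marking is done."""
        traced = 0
        while self.pending and traced < budget:
            obj = self.pending.pop()
            if obj in self.marked:
                continue
            self.marked.add(obj)
            self.pending.extend(self.heap[obj])
            traced += 1
        return not self.pending

    def garbage(self):
        # Only meaningful after step() has returned True.
        return set(self.heap) - self.marked


heap = {
    "globals": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["a"],              # cycle a -> c -> a, still reachable
    "orphan1": ["orphan2"],  # unreachable from the root
    "orphan2": [],
}
marker = IncrementalMarker(heap, roots=["globals"])
frames = 1
while not marker.step(budget=2):   # one bounded slice of work per frame
    frames += 1
print(frames, sorted(marker.garbage()))
```

The increased memory use mentioned above shows up here as well: nothing can be freed until the last slice has finished, so garbage stays allocated for several frames instead of one.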


There are algorithms for incremental and/or concurrent and/or parallel garbage collection out there. They are most likely not easy to implement and, as far as I have seen so far, would require (at least for concurrent and/or parallel GC) all writes of pointers to the Nasal heap (and possibly reads) to be redirected via wrapper functions (also known as (GC) read/write barriers).

This will not be an easy task, but in my opinion it would be a promising option. It might be possible to use a GC module from a GPL'd Java VM or similar.
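The role of a write barrier can be shown with a toy example. The sketch below is Python, not Nasal's implementation, and every name in it (`Heap`, `write_ref`, `remove_ref`, `mark_step`, `wrongly_unmarked`) is hypothetical. It uses a simple insertion-style barrier as one possible variant: when a script stores a reference into an object the collector has already traced, the target is queued for tracing again. Without the barrier, a live object can be missed when the heap changes between marking steps.

```python
class Heap:
    def __init__(self, refs, roots, use_barrier):
        self.refs = {name: list(children) for name, children in refs.items()}
        self.roots = list(roots)
        self.marked = set()          # already traced
        self.pending = list(roots)   # discovered, not yet traced
        self.use_barrier = use_barrier

    def write_ref(self, owner, target):
        """Every pointer store goes through this wrapper (the write barrier)."""
        self.refs[owner].append(target)
        if self.use_barrier and owner in self.marked and target not in self.marked:
            self.pending.append(target)   # make sure the marker still visits it

    def remove_ref(self, owner, target):
        self.refs[owner].remove(target)

    def mark_step(self, budget):
        traced = 0
        while self.pending and traced < budget:
            obj = self.pending.pop()
            if obj in self.marked:
                continue
            self.marked.add(obj)
            self.pending.extend(self.refs[obj])
            traced += 1
        return not self.pending

    def reachable(self):
        # Ground truth: a full traversal of the current heap.
        seen, stack = set(), list(self.roots)
        while stack:
            obj = stack.pop()
            if obj not in seen:
                seen.add(obj)
                stack.extend(self.refs[obj])
        return seen


def wrongly_unmarked(use_barrier):
    heap = Heap({"root": ["b", "a"], "a": [], "b": ["c"], "c": []},
                roots=["root"], use_barrier=use_barrier)
    heap.mark_step(budget=2)     # traces root and a; b is pending, c unseen
    heap.write_ref("a", "c")     # a script stores c into the already-traced a
    heap.remove_ref("b", "c")    # and drops the only other reference to c
    while not heap.mark_step(budget=2):
        pass
    return heap.reachable() - heap.marked   # live objects a sweep would free


print(wrongly_unmarked(use_barrier=False), wrongly_unmarked(use_barrier=True))
```

In the run without the barrier, `c` is still reachable through `a` but was never marked, so a sweep would free a live object. This is the kind of heap inconsistency that the single atomic pass avoids and that barriers are meant to prevent.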

Btw, just running the normal (mutually exclusive) Nasal GC in a thread other than the main loop is not hard - but since it is mutually exclusive with executing Nasal functions, it doesn't help much when it comes to reducing the worst-case latency.
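Why a mutually exclusive GC thread does not reduce worst-case latency can be seen in a small timing experiment. This is a Python sketch and not the patch referred to in this section: `interp_lock`, `gc_thread`, `run_script` and the 50 ms pause are assumptions chosen for the illustration. A script call that arrives while the collector holds the lock waits for roughly the whole collection, just as it would if the collection ran inline.

```python
import threading
import time

# Stands in for whatever makes GC and script execution mutually exclusive.
interp_lock = threading.Lock()
gc_started = threading.Event()

PAUSE = 0.05  # pretend a full collection takes 50 ms


def gc_thread(pause):
    with interp_lock:
        gc_started.set()
        time.sleep(pause)   # "mark and sweep" while holding the lock


def run_script():
    """Run a trivial script body and return how long the call took."""
    start = time.perf_counter()
    with interp_lock:
        pass                # the script itself does almost nothing
    return time.perf_counter() - start


collector = threading.Thread(target=gc_thread, args=(PAUSE,))
collector.start()
gc_started.wait()           # make sure the collection is already in progress
waited = run_script()
collector.join()
print(f"script call blocked for {waited * 1000:.1f} ms")
```

The script body is empty, yet the call takes about as long as the collection, so moving the collector to another thread only helps if scripts can keep running while it works.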

The small changes needed to add a separate GC thread are available here:

Also, I had a brief look at exactly which Nasal timers caused a jitter. And the winner is... ... well, any. Any Nasal timer, even if it's almost empty, will every now and then consume a much larger amount of time than normal. Seems to be a general issue with the Nasal execution engine: could be triggered by Nasal's garbage collector, which every now and then needs to do extra work - and runs within the context of a normal Nasal call. It could also be a result of Nasal's critical sections: other threads may acquire a temporary lock to alter Nasal data structures - which may block the execution of Nasal timers at certain points. Hmm... Best practices for debugging a multi-threaded program anyone? :)

Concerning the frequency of the jitter: I guess it isn't related to the FDM at all. It's probably just a result of Nasal complexity. The more Nasal code is running, the more often/likely garbage collection / blocking may occur. Frame rate may also influence it: many Nasal timers run at delay 0 (in every update loop).
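The guess that more running Nasal code means more frequent pauses can be expressed as a simple back-of-the-envelope model. The Python sketch below rests on assumptions that are not taken from the source: that a collection is triggered when a fixed pool of free objects runs out, that every collection reclaims the whole pool, and the specific numbers used. The function name `frames_with_gc` is hypothetical.

```python
def frames_with_gc(allocs_per_frame, pool_size, frames):
    """Return the frame numbers in which a collection would be triggered."""
    free = pool_size
    gc_frames = []
    for frame in range(frames):
        free -= allocs_per_frame       # scripts and timers allocate objects
        if free <= 0:
            gc_frames.append(frame)    # pool exhausted: this frame pays for a GC
            free = pool_size           # assume everything allocated was garbage
    return gc_frames


light = frames_with_gc(allocs_per_frame=10, pool_size=1000, frames=600)
heavy = frames_with_gc(allocs_per_frame=50, pool_size=1000, frames=600)
print(len(light), len(heavy))
```

Under these assumptions the interval between pauses depends only on how fast objects are allocated, which would fit the observation that any timer, even an almost empty one, occasionally takes the hit. Since delay-0 timers allocate once per frame, a higher frame rate shortens the interval in wall-clock time as well.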

Separate GC implementations