How the Nasal GC works: Difference between revisions

From FlightGear wiki
Jump to navigation Jump to search
(→‎Contexts: https://forum.flightgear.org/viewtopic.php?f=38&t=33831&p=328188#p328188)
(44 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{WIP}}
{{Template:Nasal Internals}}
{{Multicore}}
 
= High Level Architecture =
{{Main article|High-Level Architecture}}
{{FGCquote
|1= Regarding Nasal as a scripting language and AW, I'm hopeful that the work I'm doing on HLA will allow us to run Nasal in a separate thread from the FDM and display, so Nasal GC no-longer can impact frame-rates.  It would also allow for writing a weather simulation completely external to the FlightGear instance, which could be quite neat.
|2= {{cite web
  | url    = http://forum.flightgear.org/viewtopic.php?p=265721#p265721
  | title  = <nowiki>Re: the real cost of the Nasal Garbage Collector</nowiki>
  | author = <nowiki>stuart</nowiki>
  | date  = Nov 24th, 2015
  | added  = Nov 24th, 2015
  | script_version = 0.23
  }}
}}
 
{{FGCquote
|1= I do like the idea of a stand-alone script interpreter communicating via HLA or some similar IPC, but we still have to consider fragments of script code embedded in models to create or enhance fancy animations.  Not all script fragments make sense to run externally.
|2= {{cite web
  | url    = http://forum.flightgear.org/viewtopic.php?p=270412#p270412
  | title  = <nowiki>Re: FGPython an propose for Python as an nasal alternative</nowiki>
  | author = <nowiki>curt</nowiki>
  | date  = Dec 28th, 2015
  | added  = Dec 28th, 2015
  | script_version = 0.23
  }}
}}
 
{{FGCquote
|1= if scripting is a major performance hog, off-load it to a different thread. I don't think we have many situation where it is, but that's what HLA is going to do.
|2= {{cite web
  | url    = http://forum.flightgear.org/viewtopic.php?p=270463#p270463
  | title  = <nowiki>Re: FGPython an propose for Python as an nasal alternative</nowiki>
  | author = <nowiki>Thorsten</nowiki>
  | date  = Dec 29th, 2015
  | added  = Dec 29th, 2015
  | script_version = 0.23
  }}
}}
 
{{FGCquote
|1= haven't made too many tests with Nasal active as anyone can do that (adjust paths to have no terrain if necessary).
But most of this becomes less relevant with the HLA changes that Stuart's working on.
|2= {{cite web
  | url    = http://forum.flightgear.org/viewtopic.php?p=265726#p265726
  | title  = <nowiki>Re: the real cost of the Nasal Garbage Collector</nowiki>
  | author = <nowiki>Richard</nowiki>
  | date  = Nov 24th, 2015
  | added  = Nov 24th, 2015
  | script_version = 0.23
  }}
}}
 
= Status (02/2016) =
<!--
As of 06/2012, a new incremental Nasal GC is being worked on [http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg37579.html]:
 
<pre>
I have been working on a 4-color incremental mark/sweep collector with
the intention of merging it into the Nasal interpreter.
The work so far can be found at http://github.com/chrisforbes/incgc;
There's still quite a lot to do, but the path is clear.
</pre>
-->
 
The main Nasal repository (maintained by Andy Ross) can also be found at github: https://github.com/andyross/nasal
 
The latest version in FlightGear is maintained in the {{simgear source|path=simgear/nasal|text=SimGear repository}}.
 
The garbage collector in Nasal is pretty self-contained and entirely implemented in the file {{simgear source|path=simgear/nasal/gc.c|text=gc.c}}.
 
{{FGCquote
  |Nasal's garbage collector requires that no other Nasal threads are <br/>
running, and achieves this by keeping a count of active Nasal threads <br/>
(nThreads, simgear/nasal/code.h), and having the thread that needs GC <br/>
ask all other threads to stop and wait until nThreads-1 say they have <br/>
(bottleneck(), simgear/nasal/gc.c).  This will hang if there are any <br/>
threads that Nasal thinks are active but actually aren't, as they will <br/>
never reply to the stop request.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/32969200/
    |title=<nowiki>[Flightgear-devel] Nasal threading, naCall() vs naCallMethodCtx()</nowiki>
    |author=<nowiki>Rebecca N. Palmer</nowiki>
    |date=<nowiki>2014-10-25</nowiki>
  }}
}}
 
 
 
 
{{FGCquote
|1= Such things may look simple at first. Easy to convert a simple "hello world". But it's very complex when supporting all the features of an interpreted script language. And the funny thing is: you'd still need to worry about automatic garbage collection and count references (though that'd be a lesser issue compared to many others then). So, time wake up...
|2= {{cite web
  | url    = http://sourceforge.net/p/flightgear/mailman/message/27262122/
  | title  = <nowiki>Re: [Flightgear-devel] Stuttering at 1 Hz rate</nowiki>
  | author = <nowiki>ThorstenB</nowiki>
  | date  = Mar 26th, 2011
  | added  = Mar 26th, 2011
  | script_version = 0.23
  }}
}}
 
=== Patches ===
 
{{FGCquote
|1= there is still the issue that you can't execute Nasal while the GC is running (for that you'd need a concurrent GC which is most likely not so easily implemented). The implication is that if the time required for a collection exceeds the frame dealy you'll still notice it even when the GC runs in a different thread. I have updated my experimental patch from last year, which should also work on Windows now (and it has even been tested there):
* http://sleipner.gidenstam.org/users/anders/FlightGear/sg-gc-4.diff
* http://sleipner.gidenstam.org/users/anders/FlightGear/fg-gc-3.diff
 
Again: It doesn't solve the bigger problem of the current Nasal GC needing to run uninterrupted and in mutual exclusion - but I see very few GC runs in the main loop here (the patch prints "** Nasal GC in main thread **" in the console when that happens) except a few during startup.
|2= {{cite web
  | url    = http://sourceforge.net/p/flightgear/mailman/message/29306592/
  | title  = <nowiki>Re: [Flightgear-devel] Nasal performance</nowiki>
  | author = <nowiki>Anders Gidenstam</nowiki>
  | date  = May 23rd, 2012
  | added  = May 23rd, 2012
  | script_version = 0.23
  }}
}}
 
{{FGCquote
|1= There are algorithms for incremental and/or concurrent and/or parallel garbage collection out there. They most likely not easy to implement and as far as I have seen so far would require (at least for concurrent and /or parallel GC) all writes of pointers to the Nasal heap (and possibly reads) to be redirected via wrapper functions (also known as (GC) read/write barriers). This will not be an easy task but in my opinion it would be a promising option. It might be possible to use a GC module from a GPL:d Java vm or similar. Btw, just running the normal (mutually exclusive) Nasal GC in another thread than the main loop is not hard - but since it is mutually exclusive to executing Nasal functions it doesn't help much when it comes to reducing the worst case latency. The small changes needed to add a separate GC thread are available here:
* http://www.gidenstam.org/FlightGear/misc/test/sg-gc-2.diff
* http://www.gidenstam.org/FlightGear/misc/test/fg-gc-1.diff
|2= {{cite web
  | url    = http://sourceforge.net/p/flightgear/mailman/message/27338998/
  | title  = <nowiki>Re: [Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1
Hz rate)</nowiki>
  | author = <nowiki>Anders Gidenstam</nowiki>
  | date  = Apr 10th, 2011
  | added  = Apr 10th, 2011
  | script_version = 0.23
  }}
}}
 
{{FGCquote
|1= There were some serious issues with my patch - namely that the destruction of some C++ side FG objects kept alive by Nasal references must not be invoked on a different thread. I didn't see the crashes, but they appeared for people using Canvas heavy aircraft.
 
James did some work to fix these problems but I'm not sure how completely he managed. I still run the patch with his updates, however, and for the aircraft (I don't think I use any canvases, though) I use it works.
Running the Nasal GC in a different thread could certainly be revisited, and if the GC updated to an incremental one it could be done without impacting the main loop much.
|2= {{cite web
  | url    = http://forum.flightgear.org/viewtopic.php?p=265773#p265773
  | title  = <nowiki>Re: the real cost of the Nasal Garbage Collector</nowiki>
  | author = <nowiki>AndersG</nowiki>
  | date  = Nov 24th, 2015
  | added  = Nov 24th, 2015
  | script_version = 0.23
  }}
}}
 
{{FGCquote
|1= Running the Nasal GC outside the main thread is already (seemingly) possible but code for it is not in the next branch. I think James fixed the remaining issue a few years back, namely that Nasal/C++ ghost objects must be handed back to the main thread for destruction since they invoke C++-side destructors that touch shared FG state. However, since the Nasal GC (currently) is mutual exclusive to running any Nasal context (using the same heap) this is still not a satisfactory solution once the time needed for a GC approaches the frame duration as the main thread will be blocked waiting for the GC to finish. I've been told this frequently happens for more Nasal heavy use cases, though I have not experienced it on my own ancient (by far GPU-bound) system in my use cases. There are some script use cases that require script code to run in sync with the viewer, e.g. script-based fly by, walk and MP/AI views. For such tightly coupled script code Lua (which, IIRC, offers "real" threads and an incremental GC) or an overhauled incremental and/or concurrent Nasal GC (not a negligible amount of work, though) would seem like options for this type of script code.
|2= {{cite web
  | url    = http://sourceforge.net/p/flightgear/mailman/message/34797127/
  | title  = <nowiki>Re: [Flightgear-devel] Designing a thread-safe property tree API
(was Re: A FGPythonSys implementation: ...)</nowiki>
  | author = <nowiki>Anders Gidenstam</nowiki>
  | date  = Jan 26th, 2016
  | added  = Jan 26th, 2016
  | script_version = 0.23
  }}
}}
 
 
{{FGCquote
  |The built-in osgviewer stats can be extended with custom stats, that works by subclassing [http://trac.openscenegraph.org/documentation/OpenSceneGraphReferenceDocs/a00804.html osg::StatsHandler], this is already done in {{flightgear source|commit=130f581b18711b63138580d306a39c27b891178c|path=src/Viewer/FGEventHandler.hxx|line=14|text=$FG_SRC/Viewer/FGEventHandler.?xx}}<br/>
The class can be  extended to add your own stats via [http://trac.openscenegraph.org/documentation/OpenSceneGraphReferenceDocs/a00804.html#a0472f445cdd72838f107766b2f67a220 osgViewer::StatsHandler::addUserStatsLine()]<br/>
<br/>
You can even register totally custom stats via [http://trac.openscenegraph.org/documentation/OpenSceneGraphReferenceDocs/a00801.html osg::Stats]<br/>
<br/>
The main suspects in this  case are probably 1) scenery, 2) cockpit, 3) Nasal (GC - bottleneck() function in simgear/nasal )<br/>
For the Nasal stats, you'll probably want to register them in FGNasalSys::FGNasalSys() by accessing globals-&gt;viewer-&gt; <br/>
That should give you a better idea about what's going on there, and it's been suggested by core developers, too:<br/>
<br/>
[http://www.mail-archive.com/flightgear-devel%40lists.sourceforge.net/msg37823.html http://www.mail-archive.com/flightgear- ... 37823.html]<br/>
{{cquote|Another goal is to add more node bits (and a GUI dialog to control them) so <br/>
various categories of objects can be disabled during the update pass. This will <br/>
mean the direct hit of, say, AI models vs particles vs random trees can be <br/>
measured. Of course it won't account for resources (memory, textures) burned by <br/>
such things, but would still help different people identify slowness on their <br/>
setups.}}
  |{{cite web |url=http://forum.flightgear.org/viewtopic.php?p=201742#p201742
    |title=<nowiki>Re: New Aircraft: the Extra500</nowiki>
    |author=<nowiki>Hooray</nowiki>
    |date=<nowiki>Sun Feb 23</nowiki>
  }}
}}


= Problem =
= Problem =
{{FGCquote
  |we now know that regular and unpleasant stuttering is caused by Nasals garbage collector
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/27338887/
    |title=<nowiki>[Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1 Hz
rate)</nowiki>
    |author=<nowiki>Robert</nowiki>
    |date=<nowiki>2011-04-10</nowiki>
  }}
}}
{{FGCquote
  | the garbage collection pass must be done in a single atomic step, otherwise it would leave the heap in an inconsistent state and adversely affect the scripts.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/27338915/
    |title=<nowiki>Re: [Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1
Hz rate)</nowiki>
    |author=<nowiki>Curtis Olson</nowiki>
    |date=<nowiki>2011-04-10</nowiki>
  }}
}}
{{FGCquote
  |There are algorithms for incremental and/or concurrent and/or parallel garbage collection out there. They most likely not easy to implement and as far as I have seen so far would require (at least for concurrent and /or parallel GC) all writes of pointers to the Nasal heap (and possibly reads) to be redirected via wrapper functions (also known as (GC) read/write barriers).
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/27338998/
    |title=<nowiki>Re: [Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1
Hz rate)</nowiki>
    |author=<nowiki>Anders Gidenstam</nowiki>
    |date=<nowiki>2011-04-10</nowiki>
  }}
}}
{{FGCquote
  |This will not be an easy task but in my opinion it would be a promising option. It might be possible to use a GC module from a GPL:d Java vm or similar.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/27338998/
    |title=<nowiki>Re: [Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1
Hz rate)</nowiki>
    |author=<nowiki>Anders Gidenstam</nowiki>
    |date=<nowiki>2011-04-10</nowiki>
  }}
}}
{{FGCquote
  |just running the normal (mutually exclusive) Nasal GC in another thread than the main loop is not hard - but since it is mutually exclusive to executing Nasal functions it doesn't help much when it comes to reducing the worst case latency.
The small changes needed to add a separate GC thread are available here:* http://www.gidenstam.org/FlightGear/misc/test/sg-gc-2.diff
* http://www.gidenstam.org/FlightGear/misc/test/fg-gc-1.diff
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/27338998/
    |title=<nowiki>Re: [Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1
Hz rate)</nowiki>
    |author=<nowiki>Anders Gidenstam</nowiki>
    |date=<nowiki>2011-04-10</nowiki>
  }}
}}
{{FGCquote
  | the result is very similar to the standard FG since even if the GC is run on a different thread it still takes the same amount of time and while it is running the main thread will be blocked waiting for it as soon as it tries to run some Nasal code.
Note that there is a spurious change in simgear/nasal/code.c that makes variable creation without the var keyword illegal. I use that to make sure my Nasal scripts are clean in this regard but there are plenty of scripts in fgdata that do not pass that test (and, well, they are still valid Nasal). I'll update the patch file ASAP but please revert that file to the standard one in the meanwhile.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/27340770/
    |title=<nowiki>Re: [Flightgear-devel] Nasal Garbage Collector (was Stuttering at 1
Hz rate)</nowiki>
    |author=<nowiki>Anders Gidenstam</nowiki>
    |date=<nowiki>2011-04-11</nowiki>
  }}
}}
{{FGCquote
  |there is still the issue that you can't execute Nasal while the GC is running (for that you'd need a concurrent GC which is most likely not so easily implemented). The implication is that if the time required for a collection exceeds the frame dealy you'll still notice it even when the GC runs in a different thread.
I have updated my experimental patch from last year, which should also work on Windows now (and it has even been tested there):
* http://sleipner.gidenstam.org/users/anders/FlightGear/sg-gc-4.diff
* http://sleipner.gidenstam.org/users/anders/FlightGear/fg-gc-3.diff
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/29306592/
    |title=<nowiki>Re: [Flightgear-devel] Nasal performance</nowiki>
    |author=<nowiki>Anders Gidenstam</nowiki>
    |date=<nowiki>2012-05-23</nowiki>
  }}
}}
{{FGCquote
  |It doesn't solve the bigger problem of the current Nasal GC needing to run uninterrupted and in mutual exclusion - but I see very few GC runs in the main loop here (the patch prints "** Nasal GC in main thread **" in the console when that happens) except a few during startup.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/29306592/
    |title=<nowiki>Re: [Flightgear-devel] Nasal performance</nowiki>
    |author=<nowiki>Anders Gidenstam</nowiki>
    |date=<nowiki>2012-05-23</nowiki>
  }}
}}
The [[Nasal]] language is a dynamic programming language, where memory is not manually allocated and freed by the programmer, but instead the Nasal virtual machine automatically allocates memory from so called "memory pools", every so often these memory pools must be checked, to see how many references (i.e. objects like variables) are still reachable, so that the unreachable ones can be removed from the memory pool and/or so that memory blocks in each pool can be allocated or freed. This is called "garbage management" or "garbage collection".
The [[Nasal]] language is a dynamic programming language, where memory is not manually allocated and freed by the programmer, but instead the Nasal virtual machine automatically allocates memory from so called "memory pools", every so often these memory pools must be checked, to see how many references (i.e. objects like variables) are still reachable, so that the unreachable ones can be removed from the memory pool and/or so that memory blocks in each pool can be allocated or freed. This is called "garbage management" or "garbage collection".


The current Nasal garbage collector (GC) is known to affect the frame rate in FlightGear and occasionally causes stutter (increased frame spacing) under certain circumstances (i.e. under heavy load, due to complex scripts - such as the [[Bombable]] addon or the local weather system.  
The current Nasal garbage collector (GC) is known to affect the frame rate in FlightGear and occasionally causes stutter (increased frame spacing) under certain circumstances (i.e. under heavy load, due to complex scripts - such as the [[Bombable]] addon or the local weather system.  


This is largely attributed to the GC being a fairly simple sequential [http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Copying_vs._mark-and-sweep_vs._mark-and-don.27t-sweep mark/sweep collector], which needs to stop all operations, mark all reachable objects and free unreachable objects in an exclusive, single-threaded, fashion.
This is largely attributed to the GC being a fairly simple sequential [http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Copying_vs._mark-and-sweep_vs._mark-and-don.27t-sweep mark/sweep collector], which needs to stop all operations, recursively mark all reachable objects and free unreachable objects in an exclusive, single-threaded, fashion.


In other words, the time spent in the GC is proportial to the number of active objects (i.e.  lines of code), even for objects which may not be running or active - just loaded. For additional background information, please see [[Improving Nasal]].
In other words, the time spent in the GC is proportial to the number of active objects and references held (i.e.  lines of code), even for objects which may not be running or active - just loaded. For additional background information, please see [[Improving Nasal]].
 
{{cquote|allocating objects into generations and only mark/reaping from the most recent one on most iterations is a straightforward optimization and will definitely make things faster.| (Andy Ross) FlightGear Devel mailing list<ref>{{cite web |url=http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg37385.html |title=Nasal performance |author=Andy Ross |date=22 May 2012 }}</ref>}}
 
{{#ev:youtube|HBd7yVzJllw|600}} {{#ev:youtube|zksIj9O8_jc|600}}


= Objective =
= Objective =
Line 14: Line 290:


Also, come up with an approach to generalize the Nasal GC API such that additional GC schemes can be easily implemented next to the native Nasal GC, providing an option to select a custom/experimental GC implementation at startup. Ideally, this should be possible by using a switch property during fgfs startup - such as --prop:/sim/nasal-gc=custom
Also, come up with an approach to generalize the Nasal GC API such that additional GC schemes can be easily implemented next to the native Nasal GC, providing an option to select a custom/experimental GC implementation at startup. Ideally, this should be possible by using a switch property during fgfs startup - such as --prop:/sim/nasal-gc=custom
One of the most straightforward changes would be turning the existing mark/sweep GC into a generational GC using 2-3 generations, where past GC survivors are promoted to the next generation, the GC manager '''naGC_get()''' would then by default only request new memory in generation 0, whereas resize() events in gen0 would result in the next higher generation being subject to GC (and so on). The most senior data structures and objects would over time end up in the "eden" generation.
Some of the first steps would include:
* turning naGC_get() into a context-aware void function, so that callers do not need to be aware of the return type
* moving GC specific data structures from the globals struct to a GC struct, and only keep a void* GC pointers in globals
* move GC initialization to gc.c and allow different GCs to register their own init callbacks
* move GC specific declarations from data.h to gc.h (naPool etc)
* move GC specific stuff from misc.c to gc.c (none of the GC internals should be exposed to the rest of the code)
* customize the existing GC to support multiple generations


= Suggested readings =
= Suggested readings =
Line 20: Line 306:
In addition, the following online presentations provide a basic introduction to garbage collection:
In addition, the following online presentations provide a basic introduction to garbage collection:


* [http://lua-users.org/wiki/GarbageCollectionInRealTimeGames Garbage Collection in realtime games]
* [http://msdn.microsoft.com/en-us/library/ms973837.aspx Garbage Collector Basics and Performance Hints]
* http://www.memorymanagement.org/glossary/g.html#garbage.collection
* http://www.memorymanagement.org/glossary/g.html#garbage.collection
* [http://www.slideshare.net/toantvo/garbage-collector-9434996 Garbage Collection Basics]
* [http://www.slideshare.net/toantvo/garbage-collector-9434996 Garbage Collection Basics]
Line 28: Line 316:
* [http://www.iecc.com/gclist/GC-algorithms.html GC algorithms]
* [http://www.iecc.com/gclist/GC-algorithms.html GC algorithms]
* http://www.cs.kent.ac.uk/people/staff/rej/gc.html
* http://www.cs.kent.ac.uk/people/staff/rej/gc.html
* http://blogs.msdn.com/b/abhinaba/archive/tags/garbage+collection/


{{#ev:youtube|rp8PvFvSO_c|600|Lecture on GC (45 minutes)}}
{{#ev:youtube|rp8PvFvSO_c|600}}


= Mark/Sweep =
= Mark/Sweep =
Line 41: Line 331:
The mark-and-sweep algorithm consists of two phases: In the first phase, it finds and marks all accessible objects. The first phase is called the mark phase. In the second phase, the garbage collection algorithm scans through the heap and reclaims all the unmarked objects. The second phase is called the sweep phase.
The mark-and-sweep algorithm consists of two phases: In the first phase, it finds and marks all accessible objects. The first phase is called the mark phase. In the second phase, the garbage collection algorithm scans through the heap and reclaims all the unmarked objects. The second phase is called the sweep phase.


The mark/sweep algorithm can also be easily expressed using pseudo code, i.e. in Nasal tself:
The mark/sweep algorithm can also be easily expressed using pseudo code, i.e. in Nasal itself:


<syntaxhighlight lang="php">
<syntaxhighlight lang="nasal">
   
   
  var garbageCollect = func (roots) {
  var garbageCollect = func (roots) {
Line 56: Line 346:
The mark() function would just recursively check (i.e. calling itself) if the "marked" flag is set or not, and make sure to set the mark bit for all referenced objects:
The mark() function would just recursively check (i.e. calling itself) if the "marked" flag is set or not, and make sure to set the mark bit for all referenced objects:


<syntaxhighlight lang="php">
<syntaxhighlight lang="nasal">
   
   
  var mark = func (obj) {
  var mark = func (obj) {
Line 69: Line 359:
Finally, the sweep() function (reap() in the Nasal GC), will free all unreachable objects and reclaim the memory:
Finally, the sweep() function (reap() in the Nasal GC), will free all unreachable objects and reclaim the memory:


<syntaxhighlight lang="php">
<syntaxhighlight lang="nasal">
   
   
  var sweep = func {
  var sweep = func {
Line 83: Line 373:


= Pool storage =
= Pool storage =
A [http://en.wikipedia.org/wiki/Memory_pool memory pool] is basically a preallocated region of memory, which is dynamically resized. The Nasal GC works such that it manages a handful of global memory pools for all native Nasal types (strings, functions, vectors, hashes etc). At the moment, the hard coded defaults ensure that 25-50% of additional object "slots" (memory blocks) are kept available during each execution of reap() [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line287].
A [http://en.wikipedia.org/wiki/Memory_pool memory pool] is basically a preallocated region of memory, which is dynamically resized. The Nasal GC works such that it manages a handful of global memory pools for all native Nasal types (strings, functions, vectors, hashes etc). At the moment, the hard coded defaults ensure that 25-50% of additional object "slots" (memory blocks) are kept available during each execution of reap() [{{simgear source|path=simgear/nasal/gc.c|line=287|full=1}}].


Whenever new memory is requested to create a new type (such as a vector or a string), the available memory in the corresponding pool will be checked, reachable objects will be marked, and dead objects will be removed from all pools using a mark/sweep collector, new memory blocks will be allocated if necessary. All of this happens atomically, i.e. single-threaded.
Whenever new memory is requested to create a new type (such as a vector or a string), the available memory in the corresponding pool will be checked, reachable objects will be marked, and dead objects will be removed from all pools using a mark/sweep collector, new memory blocks will be allocated if necessary. All of this happens atomically, i.e. single-threaded, in  a "stop-the-world" fashion.


== Memory blocks ==
== Memory blocks ==
Line 95: Line 385:
A Nasal memory pool consists of memory blocks.
A Nasal memory pool consists of memory blocks.


A memory block is a [http://en.wikipedia.org/wiki/Linked_list#Linear_and_circular_lists singly linked list]: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line10
A memory block is a [http://en.wikipedia.org/wiki/Linked_list#Linear_and_circular_lists singly linked list]: {{simgear source|path=simgear/nasal/gc.c|line=10}}.


[[File:NasalNewBlockCallgraph.png|thumb|400px|Callgraph for newBlock()]]
[[File:NasalNewBlockCallgraph.png|thumb|400px|Callgraph for newBlock()]]
Line 109: Line 399:
</syntaxhighlight>
</syntaxhighlight>


New memory blocks are allocated using newBlock() and the memory is initialized with 0: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line150
New memory blocks are allocated using newBlock() and the memory is initialized with 0: {{simgear source|path=simgear/nasal/gc.c|line=167}}


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 145: Line 435:
For each of the 7 Nasal data types, there is a separate storage pool available, declared as part of the "Globals" structure.
For each of the 7 Nasal data types, there is a separate storage pool available, declared as part of the "Globals" structure.


As can be seen, each storage pool is addressed by its enum index (0..6): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/data.h#line65
As can be seen, each storage pool is addressed by its enum index (0..6): {{simgear source|path=simgear/nasal/data.h|line=65}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
enum { T_STR, T_VEC, T_HASH, T_CODE, T_FUNC, T_CCODE, T_GHOST, NUM_NASAL_TYPES }; // V. important that this come last!
enum { T_STR, T_VEC, T_HASH, T_CODE, T_FUNC, T_CCODE, T_GHOST, NUM_NASAL_TYPES }; // V. important that this come last!
Line 156: Line 446:
</syntaxhighlight>
</syntaxhighlight>


The 7 storage pools are declared in code.h as part of the "Globals" structure, line 39: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/code.h#line39
The 7 storage pools are declared in code.h as part of the "Globals" structure: {{simgear source|path=simgear/nasal/code.h|line=40}}


The relevant part is:
The relevant part is:
Line 233: Line 523:
[[File:NasalnaPoolStruct.png|thumb|400px|Nasal naPool structure]]
[[File:NasalnaPoolStruct.png|thumb|400px|Nasal naPool structure]]


The structure keeping all pool-related information is named "naPool", the layout of the naPool structure can be seen in data.h, line 172: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/data.h#line172
The structure keeping all pool-related information is named "naPool", the layout of the naPool structure can be seen in data.h: {{simgear source|path=simgear/nasal/data.h|line=181}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
struct naPool {
struct naPool {
Line 247: Line 537:
</syntaxhighlight>
</syntaxhighlight>


Each memory pool is initialized using naGC_init(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line175
Each memory pool is initialized using naGC_init(): {{simgear source|path=simgear/nasal/gc.c|line=192}}


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 263: Line 553:
As can be seen, "elemsz"  just stores the size of each type-specific structure, so that it can be used for new allocations.
As can be seen, "elemsz"  just stores the size of each type-specific structure, so that it can be used for new allocations.


naGC_init is called in initGlobals() in code.c for each Nasal type (T_STR, T_VEC etc): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/code.c#line170
naGC_init is called in initGlobals() in code.c for each Nasal type (T_STR, T_VEC etc): {{simgear source|path=simgear/nasal/code.c|line=172}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
for(i=0; i<NUM_NASAL_TYPES; i++)
for(i=0; i<NUM_NASAL_TYPES; i++)
Line 270: Line 560:




The size of a memory pool is determined using poolsize(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line186
The size of a memory pool is determined using poolsize(): {{simgear source|path=simgear/nasal/gc.c|line=203}}


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 286: Line 576:
= Allocating Nasal types =
= Allocating Nasal types =


All Nasal types (scalars, vectors, hashes, funcs, ghosts) are allocated via wrappers in [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c misc.c].
All Nasal types (scalars, vectors, hashes, funcs, ghosts) are allocated via wrappers in {{simgear source|path=simgear/nasal/misc.c|text=misc.c}}.


These wrappers are:
These wrappers are:
* naNewString(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line77
* naNewString(): {{simgear source|path=simgear/nasal/misc.c|line=77}}
* naNewVector(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line87
* naNewVector(): {{simgear source|path=simgear/nasal/misc.c|line=87}}
* naNewHash(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line94
* naNewHash(): {{simgear source|path=simgear/nasal/misc.c|line=94}}
* naNewCode(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line101
* naNewCode(): {{simgear source|path=simgear/nasal/misc.c|line=101}}
* naNewCCode(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line106
* naNewCCode(): {{simgear source|path=simgear/nasal/misc.c|line=106}}
* naNewFunc(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line113
* naNewFunc(): {{simgear source|path=simgear/nasal/misc.c|line=131}}
* naNewGhost(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line122
* naNewGhost(): {{simgear source|path=simgear/nasal/misc.c|line=140}}
* naNewGhost2(): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line135
* naNewGhost2(): {{simgear source|path=simgear/nasal/misc.c|line=154}}


All of these wrappers are really just helpers on top of naNew(), they just set up type-specific information and initialize each Nasal type properly.
All of these wrappers are really just helpers on top of naNew(), they just set up type-specific information and initialize each Nasal type properly.
Line 315: Line 605:
= naNew() =
= naNew() =


As can be seen, the key allocator function is [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line66 naNew()].  
As can be seen, the key allocator function is {{simgear source|path=simgear/nasal/misc.c|line=66|text=naNew()}}.
All memory allocations done via the naNew*() helpers will first of all set up memory using naNew().
All memory allocations done via the naNew*() helpers will first of all set up memory using naNew().


naNew() is really just a fancy malloc() replacement, i.e. it's just getting passed the Nasal execution context, and the nasal type (which is mapped to the size of the requested memory region using naTypeSize()).
naNew() is really just a fancy malloc() replacement, i.e. it's just getting passed the Nasal execution context, and the nasal type (which is mapped to the size of the requested memory region using naTypeSize()).


The naNew() function is also implemented in misc.c, see [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line122 line 66]:
{{FGCquote|1= New nasal objects are added to a temporary bin when they are created, because further allocation might cause a garbage collection to happen before the code that created the object can store a reference to it where the garbage collector can find it. For performance and simplicity, this list is stored per-context. When the context next executes code, it clears this list. Here's the problem: we do a lot of our naNewXX() calls in FlightGear using the old "global context" object that is created at startup. But this context is no longer used to execute scripts* at runtime, so as Csaba discovered, it's temporaries are never flushed. That essentially causes a resource leak: those allocations (mostly listener nodes) will never be freed. And all the extra "reachable" Nasal data floating around causes the garbage collector to take longer and longer to do a full collection as time goes on, causing "stutter". And scripts that use listeners extensively (the cmdarg() they use was one of the affected allocations) will see the problem more seriously.|2= {{cite web  | url    = http://sourceforge.net/p/flightgear/mailman/message/15645369/  | title  = <nowiki>[Flightgear-devel] Stutter/Nasal issue resolved (was: FlightGear/Plib periodic stutter notes)</nowiki>  | author = <nowiki>Andy Ross</nowiki>  | date  = Oct 24th, 2007  }}}}
 
{{FGCquote|1= Once listeners were added, scripts could be recursive: (script A sets property X which causes listener L to fire and cause script B to run) We need two or more contexts on the stack to handle that; a single global one won't work. I didn't like the fix though. Exposing the temporary bin as part of the Nasal public API is ugly; it's an internal design feature, not something users should tune. Instead, I just hacked at the FlightGear code to reinitialie this context every frame, thus cleaning it up. A "proper" fix would be to remove the global context entirely, but that would touch a bunch of code.|2= {{cite web  | url    = http://sourceforge.net/p/flightgear/mailman/message/15645369/ | title  = <nowiki>[Flightgear-devel] Stutter/Nasal issue resolved (was: FlightGear/Plib periodic stutter notes)</nowiki>  | author = <nowiki>Andy Ross</nowiki>  | date  = Oct 24th, 2007  }}}}
 
The naNew() function is also implemented in misc.c, see {{simgear source|path=simgear/nasal/misc.c|line=66}}:


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 339: Line 633:
'''Note:''' naGC_get() is the only function which really manages all memory pools and which triggers the GC. Allocations via naNew()* will end up calling naNew() and naNew() ends up calling naGC_get() '''always'''.
'''Note:''' naGC_get() is the only function which really manages all memory pools and which triggers the GC. Allocations via naNew()* will end up calling naNew() and naNew() ends up calling naGC_get() '''always'''.


For the time being, OBJ_CACHE_SZ can be ignored, it's defined in code.h, line 12: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/code.h#line12
For the time being, OBJ_CACHE_SZ can be ignored, it's defined in code.h: {{simgear source|path=simgear/nasal/code.h|line=12}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
// Number of objects (per pool per thread) asked for using naGC_get().
// Number of objects (per pool per thread) asked for using naGC_get().
Line 379: Line 673:




TODO: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/data.h#line204
TODO: {{simgear source|path=simgear/nasal/data.h|line=204}}


* void naGC_init(struct naPool* p, int type);
* void naGC_init(struct naPool* p, int type);
Line 395: Line 689:
All high level type allocators (naNewString, naNewVector, naNewHash etc) make use of the naNew() call introduced earlier.
All high level type allocators (naNewString, naNewVector, naNewHash etc) make use of the naNew() call introduced earlier.


naNew() doesn't directly allocate new memory from the heap using naAlloc(), but instead use the naGC_get() helper which manages memory from the global pools (i.e. preallocated memory for each type).
naNew() doesn't directly allocate new memory from the heap using naAlloc(), but instead uses the naGC_get() helper which manages memory from the global pools (i.e. preallocated memory for each type).


For each native Nasal data type (scalars, vectors, hashes, funcs etc) , there are separate memory pools in globals->pools[n]. The index (n) is one of T_VEC, T_HASH, T_FUNC etc.
For each native Nasal data type (scalars, vectors, hashes, funcs etc) , there are separate memory pools in globals->pools[n]. The index (n) is one of T_VEC, T_HASH, T_FUNC etc.


The naGC_get() pool "manager" is located in gc.nas line 194: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line194
The naGC_get() pool "manager" is located in gc.nas: {{simgear source|path=simgear/nasal/gc.c|line=211}}


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 423: Line 717:
</syntaxhighlight>
</syntaxhighlight>


As can be seen, the garbage collector itself is triggered by setting the global "needGC" flag to 1 in the globals structure and then calling the bottleneck() function: [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line199].
As can be seen, the garbage collector itself is triggered by setting the global "needGC" flag to 1 in the globals structure and then calling the bottleneck() function: {{simgear source|path=simgear/nasal/gc.c|line=105}}.


In line 209, the memory region from the global pool is cast back to a pointer (struct naObj**): https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line209
The memory region from the global pool is cast back to a pointer (struct naObj**): {{simgear source|path=simgear/nasal/gc.c|line=226}}
the "nout" paramter is just there to update c->nfree[type] with the amount of free memory.
the "nout" parameter is just there to update c->nfree[type] with the amount of free memory.


In other words, whenever new Nasal types are allocated, new memory from one of the 7 Nasal pools is requested. All of these allocation requests are channeled through naNew() and finally naGC_get(). naGC_get() is the de facto memory manager in Nasal, all memory management is triggered here - new memory block allocations for each pool, as well as garbage collection via the bottleneck() call.
In other words, whenever new Nasal types are allocated, new memory from one of the 7 Nasal pools is requested. All of these allocation requests are channeled through naNew() and finally naGC_get(). naGC_get() is the de facto memory manager in Nasal, all memory management is triggered here - new memory block allocations for each pool, as well as garbage collection via the bottleneck() call.


This also means that we basically just need to provide an alternate naGC_get() function, next to the existing one, to add support for alternate GC schemes.
This also means that we basically just need to provide an alternate naGC_get() function, next to the existing one, to add support for alternate GC schemes.
= Supporting additional GC implementations =
It would probably make sense to change the pool manager interface/signature such that additional implementations can be easily provided and registered, so that merely a GC_CALLBACK pointer must be set during initialization in order to use a different GC scheme:
<syntaxhighlight lang="diff">
diff --git a/simgear/nasal/data.h b/simgear/nasal/data.h
index deb4fbd..f3fcaa1 100644
--- a/simgear/nasal/data.h
+++ b/simgear/nasal/data.h
@@ -202,7 +202,7 @@ int naiHash_sym(struct naHash* h, struct naStr* sym, naRef* out);
void naiHash_newsym(struct naHash* h, naRef* sym, naRef* val);
void naGC_init(struct naPool* p, int type);
-struct naObj** naGC_get(struct naPool* p, int n, int* nout);
+void naGC_alloc(struct Context* c, int type, int n);
void naGC_swapfree(void** target, void* val);
void naGC_freedead();
void naiGCMark(naRef r);
diff --git a/simgear/nasal/gc.c b/simgear/nasal/gc.c
index e9d9b7b..3bae313 100644
--- a/simgear/nasal/gc.c
+++ b/simgear/nasal/gc.c
@@ -191,8 +191,10 @@ static int poolsize(struct naPool* p)
    return total;
}
-struct naObj** naGC_get(struct naPool* p, int n, int* nout)
+void naGC_alloc(struct Context* c, int type, int n)
{
+    struct naPool* p = &globals->pools[type]; // type-specific memory pool
+    int* nout = &c->nfree[type]; //
    struct naObj** result;
    naCheckBottleneck();
    LOCK();
@@ -208,7 +210,7 @@ struct naObj** naGC_get(struct naPool* p, int n, int* nout)
    globals->allocCount -= n;
    result = (struct naObj**)(p->free + p->nfree);
    UNLOCK();
-    return result;
+    c->free[type]=result;
}
static void markvec(naRef r)
diff --git a/simgear/nasal/misc.c b/simgear/nasal/misc.c
index dae0bfe..5caeb76 100644
--- a/simgear/nasal/misc.c
+++ b/simgear/nasal/misc.c
@@ -66,9 +66,7 @@ naRef naStringValue(naContext c, naRef r)
naRef naNew(struct Context* c, int type)
{
    naRef result;
-    if(c->nfree[type] == 0)
-        c->free[type] = naGC_get(&globals->pools[type],
-                                OBJ_CACHE_SZ, &c->nfree[type]);
+    if(! c->nfree[type] ) naGC_alloc(c, type, OBJ_CACHE_SZ);
    result = naObj(type, c->free[type][--c->nfree[type]]);
    naTempSave(c, result);
    return result;
</syntaxhighlight>
= Disabling the current GC =
The overhead caused by managing the memory pools and constantly scanning the  heap (mark/reap) can be examined by modifying the naGC_get() function such that it no longer uses any pool memory, but directly allocates via malloc()/naAlloc() - without ever freeing any memory (i.e. leaking). While the fgfs process will eventually get killed this way, the GC will be basically switched off.
<syntaxhighlight lang="C">
struct naObj** naGC_get(struct naPool* p, int n, int* nout)
{
    struct naObj** result;
    naCheckBottleneck();
    LOCK();
#define LEAKY
#ifndef LEAKY
    while(globals->allocCount < 0 || (p->nfree == 0 && p->freetop >= p->freesz)) {
        globals->needGC = 1;
        bottleneck();
    }
    if(p->nfree == 0)
        newBlock(p, poolsize(p)/8);
    n = p->nfree < n ? p->nfree : n;
    *nout = n;
    p->nfree -= n;
    globals->allocCount -= n;
    result = (struct naObj**)(p->free + p->nfree);
#else
    result = (struct naObj**) naAlloc( naTypeSize(n) );
#endif
    UNLOCK();
    return result;
}
</syntaxhighlight>


= bottleneck() =
= bottleneck() =
Line 439: Line 824:
Whenever new memory from one of the global pools is being requested via naGC_get(), naCheckBottleneck() will also be invoked:
Whenever new memory from one of the global pools is being requested via naGC_get(), naCheckBottleneck() will also be invoked:


https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line121
{{simgear source|path=simgear/nasal/gc.c|line=131}}


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 452: Line 837:
[[File:NasalBottleneckCallgraph.png|thumb|550px|Callgraph of bottleneck()]]
[[File:NasalBottleneckCallgraph.png|thumb|550px|Callgraph of bottleneck()]]


https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line99
{{simgear source|path=simgear/nasal/gc.c|line=100}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
// Must be called with the main lock.  Engages the "bottleneck", where
// Must be called with the main lock.  Engages the "bottleneck", where
Line 477: Line 862:
</syntaxhighlight>
</syntaxhighlight>


The actual garbage collection takes place in [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line114 line 114] where dead objects from the memory pool are freed, and the garbageCollect() function is called if globals.needGC is true.
The actual garbage collection takes place in {{simgear source|path=simgear/nasal/gc.c|line=114|text=line 114}} where dead objects from the memory pool are freed, and the garbageCollect() function is called if globals.needGC is true.


= garbageCollect() =
= garbageCollect() =
Line 485: Line 870:
This is the function that does the mark/sweep part using mark() to recursively mark reachable objects, and reap() to collect unreachable objects.
This is the function that does the mark/sweep part using mark() to recursively mark reachable objects, and reap() to collect unreachable objects.


https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line35
{{simgear source|path=simgear/nasal/gc.c|line=35}}


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 532: Line 917:
For each invocation of garbageCollect(), the following takes place:
For each invocation of garbageCollect(), the following takes place:
* For each context:
* For each context:
** The counter for each type-specific pool will be reset to 0 in line 43: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line43
** The counter for each type-specific pool will be reset to 0: {{simgear source|path=simgear/nasal/gc.c|line=43}}
** for each stack frame of the given context, all reachable locals and functions will be marked as reachable in line 45: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line45
** for each stack frame of the given context, all reachable locals and functions will be marked as reachable: {{simgear source|path=simgear/nasal/gc.c|line=45}}
** All instructions in the opcode stack will be marked as reachable from top to bottom in line 49: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line49
** All instructions in the opcode stack will be marked as reachable from top to bottom: {{simgear source|path=simgear/nasal/gc.c|line=49}}
** mark(die->Arg)
** mark(die->Arg)
** temporary variables are marked reachable
** temporary variables are marked reachable
* root references (constants) are marked reachable in line56https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line56
* root references (constants) are marked reachable:  {{simgear source|path=simgear/nasal/gc.c|line=56}}
* foreach Nasal data type:
* foreach Nasal data type:
** unreachable objects are then collected in line 62: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line62
** unreachable objects are then collected: {{simgear source|path=simgear/nasal/gc.c|line=63}}
* dead blocks are freed and new memory is allocated in line: 66 https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line66
* dead blocks are freed and new memory is allocated: {{simgear source|path=simgear/nasal/gc.c|line=67}}
* the needGC flag is set to 0
* the needGC flag is set to 0


Line 547: Line 932:
[[File:NasalMarkCallgraph.png|thumb|400px|Callgraph of mark()]]
[[File:NasalMarkCallgraph.png|thumb|400px|Callgraph of mark()]]


All reachable objects are recursively marked using tagged naRef structures [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/data.h#line50], so that unreachable objects can be collected (i.e. because the reftag bit is not set) and freed.
All reachable objects are recursively marked using tagged naRef structures {{simgear source|path=simgear/nasal/data.h|line=50}}, so that unreachable objects can be collected (i.e. because the reftag bit is not set) and freed.


TOOD: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line223
TODO: {{simgear source|path=simgear/nasal/gc.c|line=223}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">


Line 605: Line 990:
[[File:NasalFreeelemCallgraph.png|thumb|Callgraph of freeelem()]]
[[File:NasalFreeelemCallgraph.png|thumb|Callgraph of freeelem()]]


Frees individual elements and adds freed pointers to the globals free list, implemented in gc.c: https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line137
Frees individual elements and adds freed pointers to the globals free list, implemented in gc.c: {{simgear source|path=simgear/nasal/gc.c|line=153}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
static void freeelem(struct naPool* p, struct naObj* o)
static void freeelem(struct naPool* p, struct naObj* o)
Line 624: Line 1,009:
Will be called by freelem(), implemented for all native Nasal types:
Will be called by freelem(), implemented for all native Nasal types:


* strings (T_STR):  naStr_gcclean [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/string.c#line113]
* strings (T_STR):  naStr_gcclean {{simgear source|path=simgear/nasal/string.c|line=113}}
* vectors (T_VEC):  naVec_gcclean [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/vector.c#line22]
* vectors (T_VEC):  naVec_gcclean {{simgear source|path=simgear/nasal/vector.c|line=22}}
* hashes (T_HASH):  naiGCHashClean [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/hash.c#line221]
* hashes (T_HASH):  naiGCHashClean {{simgear source|path=simgear/nasal/hash.c|line=221}}
* code  (T_CODE):  naCode_gcclean [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line126]
* code  (T_CODE):  naCode_gcclean {{simgear source|path=simgear/nasal/gc.c|line=136}}
* ghosts (T_GHOST): naGhost_gcclean [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/gc.c#line131]
* ghosts (T_GHOST): naGhost_gcclean {{simgear source|path=simgear/nasal/gc.c|line=147}}


= Contexts =
= Contexts =
{{cquote|<nowiki>New nasal objects are added to a temporary
bin when they are created, because further allocation might cause a
garbage collection to happen before the code that created the object
can store a reference to it where the garbage collector can find it.
For performance and simplicity, this list is stored per-context.  When
the context next executes code, it clears this list.
Here's the problem: we do a lot of our naNewXX() calls in FlightGear
using the old "global context" object that is created at startup.  But
this context is no longer used to execute scripts* at runtime, so as
Csaba discovered, it's temporaries are never flushed.  That
essentially causes a resource leak: those allocations (mostly listener
nodes) will never be freed.  And all the extra "reachable" Nasal data
floating around causes the garbage collector to take longer and longer
to do a full collection as time goes on, causing "stutter".  And
scripts that use listeners extensively (the cmdarg() they use was one
of the affected allocations) will see the problem more seriously.
(That's a feature, not a bug.  Once listeners were added, scripts
could be recursive: (script A sets property X which causes listener
L to fire and cause script B to run) We need two or more contexts on
the stack to handle that; a single global one won't work.)
I didn't like the fix though.  Exposing the temporary bin as part of
the Nasal public API is ugly; it's an internal design feature, not
something users should tune.  Instead, I just hacked at the FlightGear
code to reinitialize this context every frame, thus cleaning it up.  A
"proper" fix would be to remove the global context entirely, but that
would touch a bunch of code.
</nowiki><ref>{{cite web |url=http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg13028.html|title=<nowiki>[Flightgear-devel] Stutter/Nasal issue resolved (was: FlightGear/Plib periodic stutter notes)</nowiki>|author=<nowiki>Andy Ross</nowiki>|date=<nowiki>Wed, 24 Oct 2007 11:33:36 -0700</nowiki>}}</ref>|<nowiki>Andy Ross</nowiki>}}
<references/>
Also see: {{flightgear source|path=src/Scripting/NasalSys.cxx}} (in FGNasalSys::update)
<syntaxhighlight lang="c">
    // The global context is a legacy thing.  We use dynamically
    // created contexts for naCall() now, so that we can call them
    // recursively.  But there are still spots that want to use it for
    // naNew*() calls, which end up leaking memory because the context
    // only clears out its temporary vector when it's *used*.  So just
    // junk it and fetch a new/reinitialized one every frame.  This is
    // clumsy: the right solution would use the dynamic context in all
    // cases and eliminate _context entirely.  But that's more work,
    // and this works fine (yes, they say "New" and "Free", but
    // they're very fast, just trust me). -Andy
</syntaxhighlight>
'''Consider using coccinelle here to rewrite naContext handling in all places at once'''
[[Category:Developer Plans]]


[[File:NasalContextStruct.png|thumb|500px|Nasal Context structure]]
[[File:NasalContextStruct.png|thumb|500px|Nasal Context structure]]


https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/code.h#line71
{{simgear source|path=simgear/nasal/code.h|line=76}}
<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
struct Context {
struct Context {
Line 704: Line 1,143:


= Stack frames =
= Stack frames =
https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/code.h#line32
{{simgear source|path=simgear/nasal/code.h|line=33}}


= Tagged structures =
= Tagged structures =
https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/data.h#line84
{{simgear source|path=simgear/nasal/data.h|line=84}}


= Core allocations =
= Core allocations =
All core memory allocation takes place via wrappers in [https://gitorious.org/fg/simgear/blobs/next/simgear/nasal/misc.c#line8 misc.c]:
All core memory allocation takes place via wrappers in {{simgear source|path=simgear/nasal/misc.c|line=8|text=misc.c}}:


<syntaxhighlight lang="C">
<syntaxhighlight lang="C">
Line 719: Line 1,158:
</syntaxhighlight>
</syntaxhighlight>


<references/>


[[Category:Core developer documentation]]  
[[Category:Core developer documentation]]
[[Category:Core development projects]]  
[[Category:Nasal]]
[[Category:Nasal]]

Revision as of 12:02, 18 February 2018

High Level Architecture

1rightarrow.png See High-Level Architecture for the main article about this subject.

Cquote1.png Regarding Nasal as a scripting language and AW, I'm hopeful that the work I'm doing on HLA will allow us to run Nasal in a separate thread from the FDM and display, so Nasal GC no-longer can impact frame-rates. It would also allow for writing a weather simulation completely external to the FlightGear instance, which could be quite neat.
— stuart (Nov 24th, 2015). Re: the real cost of the Nasal Garbage Collector.
(powered by Instant-Cquotes)
Cquote2.png
Cquote1.png I do like the idea of a stand-alone script interpreter communicating via HLA or some similar IPC, but we still have to consider fragments of script code embedded in models to create or enhance fancy animations. Not all script fragments make sense to run externally.
Cquote2.png
Cquote1.png if scripting is a major performance hog, off-load it to a different thread. I don't think we have many situation where it is, but that's what HLA is going to do.
Cquote2.png
Cquote1.png haven't made too many tests with Nasal active as anyone can do that (adjust paths to have no terrain if necessary). But most of this becomes less relevant with the HLA changes that Stuart's working on.
— Richard (Nov 24th, 2015). Re: the real cost of the Nasal Garbage Collector.
(powered by Instant-Cquotes)
Cquote2.png

Status (02/2016)

The main Nasal repository (maintained by Andy Ross) can also be found at github: https://github.com/andyross/nasal

The latest version in FlightGear is maintained in the SimGear repository.

The garbage collector in Nasal is pretty self-contained and entirely implemented in the file gc.c.

Cquote1.png Nasal's garbage collector requires that no other Nasal threads are

running, and achieves this by keeping a count of active Nasal threads
(nThreads, simgear/nasal/code.h), and having the thread that needs GC
ask all other threads to stop and wait until nThreads-1 say they have
(bottleneck(), simgear/nasal/gc.c). This will hang if there are any
threads that Nasal thinks are active but actually aren't, as they will
never reply to the stop request.


Cquote2.png



Cquote1.png Such things may look simple at first. Easy to convert a simple "hello world". But it's very complex when supporting all the features of an interpreted script language. And the funny thing is: you'd still need to worry about automatic garbage collection and count references (though that'd be a lesser issue compared to many others then). So, time wake up...
— ThorstenB (Mar 26th, 2011). Re: [Flightgear-devel] Stuttering at 1 Hz rate.
(powered by Instant-Cquotes)
Cquote2.png

Patches

Cquote1.png there is still the issue that you can't execute Nasal while the GC is running (for that you'd need a concurrent GC which is most likely not so easily implemented). The implication is that if the time required for a collection exceeds the frame dealy you'll still notice it even when the GC runs in a different thread. I have updated my experimental patch from last year, which should also work on Windows now (and it has even been tested there): Again: It doesn't solve the bigger problem of the current Nasal GC needing to run uninterrupted and in mutual exclusion - but I see very few GC runs in the main loop here (the patch prints "** Nasal GC in main thread **" in the console when that happens) except a few during startup.
— Anders Gidenstam (May 23rd, 2012). Re: [Flightgear-devel] Nasal performance.
(powered by Instant-Cquotes)
Cquote2.png
Cquote1.png There are algorithms for incremental and/or concurrent and/or parallel garbage collection out there. They most likely not easy to implement and as far as I have seen so far would require (at least for concurrent and /or parallel GC) all writes of pointers to the Nasal heap (and possibly reads) to be redirected via wrapper functions (also known as (GC) read/write barriers). This will not be an easy task but in my opinion it would be a promising option. It might be possible to use a GC module from a GPL:d Java vm or similar. Btw, just running the normal (mutually exclusive) Nasal GC in another thread than the main loop is not hard - but since it is mutually exclusive to executing Nasal functions it doesn't help much when it comes to reducing the worst case latency. The small changes needed to add a separate GC thread are available here: Cquote2.png
Cquote1.png There were some serious issues with my patch - namely that the destruction of some C++ side FG objects kept alive by Nasal references must not be invoked on a different thread. I didn't see the crashes, but they appeared for people using Canvas heavy aircraft.

James did some work to fix these problems but I'm not sure how completely he managed. I still run the patch with his updates, however, and for the aircraft (I don't think I use any canvases, though) I use it works.

Running the Nasal GC in a different thread could certainly be revisited, and if the GC updated to an incremental one it could be done without impacting the main loop much.
— AndersG (Nov 24th, 2015). Re: the real cost of the Nasal Garbage Collector.
(powered by Instant-Cquotes)
Cquote2.png
Cquote1.png Running the Nasal GC outside the main thread is already (seemingly) possible but code for it is not in the next branch. I think James fixed the remaining issue a few years back, namely that Nasal/C++ ghost objects must be handed back to the main thread for destruction since they invoke C++-side destructors that touch shared FG state. However, since the Nasal GC (currently) is mutual exclusive to running any Nasal context (using the same heap) this is still not a satisfactory solution once the time needed for a GC approaches the frame duration as the main thread will be blocked waiting for the GC to finish. I've been told this frequently happens for more Nasal heavy use cases, though I have not experienced it on my own ancient (by far GPU-bound) system in my use cases. There are some script use cases that require script code to run in sync with the viewer, e.g. script-based fly by, walk and MP/AI views. For such tightly coupled script code Lua (which, IIRC, offers "real" threads and an incremental GC) or an overhauled incremental and/or concurrent Nasal GC (not a negligible amount of work, though) would seem like options for this type of script code.
Cquote2.png


Cquote1.png The built-in osgviewer stats can be extended with custom stats, that works by subclassing osg::StatsHandler, this is already done in $FG_SRC/Viewer/FGEventHandler.?xx

The class can be extended to add your own stats via osgViewer::StatsHandler::addUserStatsLine()

You can even register totally custom stats via osg::Stats

The main suspects in this case are probably 1) scenery, 2) cockpit, 3) Nasal (GC - bottleneck() function in simgear/nasal )
For the Nasal stats, you'll probably want to register them in FGNasalSys::FGNasalSys() by accessing globals->viewer->
That should give you a better idea about what's going on there, and it's been suggested by core developers, too:

http://www.mail-archive.com/flightgear- ... 37823.html

Cquote1.png Another goal is to add more node bits (and a GUI dialog to control them) so

various categories of objects can be disabled during the update pass. This will
mean the direct hit of, say, AI models vs particles vs random trees can be
measured. Of course it won't account for resources (memory, textures) burned by
such things, but would still help different people identify slowness on their
setups.

Cquote2.png

— Hooray (Sun Feb 23). Re: New Aircraft: the Extra500.
(powered by Instant-Cquotes)
Cquote2.png

Problem

Cquote1.png we now know that regular and unpleasant stuttering is caused by Nasals garbage collector
Cquote2.png
Cquote1.png the garbage collection pass must be done in a single atomic step, otherwise it would leave the heap in an inconsistent state and adversely affect the scripts.
Cquote2.png
Cquote1.png There are algorithms for incremental and/or concurrent and/or parallel garbage collection out there. They most likely not easy to implement and as far as I have seen so far would require (at least for concurrent and /or parallel GC) all writes of pointers to the Nasal heap (and possibly reads) to be redirected via wrapper functions (also known as (GC) read/write barriers).
Cquote2.png
Cquote1.png This will not be an easy task but in my opinion it would be a promising option. It might be possible to use a GC module from a GPL:d Java vm or similar.
Cquote2.png
Cquote1.png just running the normal (mutually exclusive) Nasal GC in another thread than the main loop is not hard - but since it is mutually exclusive to executing Nasal functions it doesn't help much when it comes to reducing the worst case latency.

The small changes needed to add a separate GC thread are available here:* http://www.gidenstam.org/FlightGear/misc/test/sg-gc-2.diff


Cquote2.png
Cquote1.png the result is very similar to the standard FG since even if the GC is run on a different thread it still takes the same amount of time and while it is running the main thread will be blocked waiting for it as soon as it tries to run some Nasal code.

Note that there is a spurious change in simgear/nasal/code.c that makes variable creation without the var keyword illegal. I use that to make sure my Nasal scripts are clean in this regard but there are plenty of scripts in fgdata that do not pass that test (and, well, they are still valid Nasal). I'll update the patch file ASAP but please revert that file to the standard one in the meanwhile.


Cquote2.png
Cquote1.png there is still the issue that you can't execute Nasal while the GC is running (for that you'd need a concurrent GC which is most likely not so easily implemented). The implication is that if the time required for a collection exceeds the frame dealy you'll still notice it even when the GC runs in a different thread.

I have updated my experimental patch from last year, which should also work on Windows now (and it has even been tested there):


— Anders Gidenstam (2012-05-23). Re: [Flightgear-devel] Nasal performance.
(powered by Instant-Cquotes)
Cquote2.png
Cquote1.png It doesn't solve the bigger problem of the current Nasal GC needing to run uninterrupted and in mutual exclusion - but I see very few GC runs in the main loop here (the patch prints "** Nasal GC in main thread **" in the console when that happens) except a few during startup.
— Anders Gidenstam (2012-05-23). Re: [Flightgear-devel] Nasal performance.
(powered by Instant-Cquotes)
Cquote2.png

The Nasal language is a dynamic programming language, where memory is not manually allocated and freed by the programmer, but instead the Nasal virtual machine automatically allocates memory from so called "memory pools", every so often these memory pools must be checked, to see how many references (i.e. objects like variables) are still reachable, so that the unreachable ones can be removed from the memory pool and/or so that memory blocks in each pool can be allocated or freed. This is called "garbage management" or "garbage collection".

The current Nasal garbage collector (GC) is known to affect the frame rate in FlightGear and occasionally causes stutter (increased frame spacing) under certain circumstances (i.e. under heavy load, due to complex scripts - such as the Bombable addon or the local weather system.

This is largely attributed to the GC being a fairly simple sequential mark/sweep collector, which needs to stop all operations, recursively mark all reachable objects and free unreachable objects in an exclusive, single-threaded, fashion.

In other words, the time spent in the GC is proportial to the number of active objects and references held (i.e. lines of code), even for objects which may not be running or active - just loaded. For additional background information, please see Improving Nasal.

Cquote1.png allocating objects into generations and only mark/reaping from the most recent one on most iterations is a straightforward optimization and will definitely make things faster.
— (Andy Ross) FlightGear Devel mailing list[1]
Cquote2.png

Objective

Provide a basic introduction to garbage collection and document how the existing Nasal GC scheme works, so that it can be fully understood, before alternatives (fixes/improvements) or additional implementations are added (feedback and contributions are encouraged and appreciated!).

Also, come up with an approach to generalize the Nasal GC API such that additional GC schemes can be easily implemented next to the native Nasal GC, providing an option to select a custom/experimental GC implementation at startup. Ideally, this should be possible by using a switch property during fgfs startup - such as --prop:/sim/nasal-gc=custom

One of the most straightforward changes would be turning the existing mark/sweep GC into a generational GC using 2-3 generations, where past GC survivors are promoted to the next generation, the GC manager naGC_get() would then by default only request new memory in generation 0, whereas resize() events in gen0 would result in the next higher generation being subject to GC (and so on). The most senior data structures and objects would over time end up in the "eden" generation.

Some of the first steps would include:

  • turning naGC_get() into a context-aware void function, so that callers do not need to be aware of the return type
  • moving GC specific data structures from the globals struct to a GC struct, and only keep a void* GC pointers in globals
  • move GC initialization to gc.c and allow different GCs to register their own init callbacks
  • move GC specific declarations from data.h to gc.h (naPool etc)
  • move GC specific stuff from misc.c to gc.c (none of the GC internals should be exposed to the rest of the code)
  • customize the existing GC to support multiple generations

Suggested readings

The wikipedia article on garbage collection contains some pretty good general GC information: http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)

In addition, the following online presentations provide a basic introduction to garbage collection:


Mark/Sweep

When using a mark-and-sweep collector, unreachable objects are not immediately reclaimed. Instead, garbage is allowed to accumulate until all available memory has been exhausted. When that happens, the execution of the program is suspended temporarily while the mark-and-sweep algorithm collects all the garbage. Once all unreferenced objects have been reclaimed, the normal execution of the program can resume.

Obviously, the main disadvantage of the mark-and-sweep approach is the fact that that normal program execution is suspended while the garbage collection algorithm runs. In particular, this can be a problem in a program that must satisfy real-time execution constraints (like a flight simulator). For example, an interactive application that uses mark-and-sweep garbage collection becomes unresponsive periodically.

The mark-and-sweep algorithm is called a tracing garbage collector because is traces out the entire collection of objects that are directly or indirectly accessible by the program. The objects that a program can access directly are those objects which are referenced by local variables on the processor stack as well as by any static variables that refer to objects. In the context of garbage collection, these variables are called the roots . An object is indirectly accessible if it is referenced by a field in some other (directly or indirectly) accessible object. An accessible object is said to be live . Conversely, an object which is not live is garbage.

The mark-and-sweep algorithm consists of two phases: In the first phase, it finds and marks all accessible objects. The first phase is called the mark phase. In the second phase, the garbage collection algorithm scans through the heap and reclaims all the unmarked objects. The second phase is called the sweep phase.

The mark/sweep algorithm can also be easily expressed using pseudo code, i.e. in Nasal itself:

 
 var garbageCollect = func (roots) {
 foreach(var root; roots) {
   mark(root);
  }
  sweep();
 }

The mark() function would just recursively check (i.e. calling itself) if the "marked" flag is set or not, and make sure to set the mark bit for all referenced objects:

 
 var mark = func (obj) {
  if (!obj.marked) {
   obj.marked=1;
   foreach( var referenced; referenced_by(obj) )
     mark(referenced);
  }
 }

Finally, the sweep() function (reap() in the Nasal GC), will free all unreachable objects and reclaim the memory:

 
 var sweep = func {
  foreach(var obj; allocated_objects) {
   if (obj.marked) obj.marked=0;
   else reclaim(obj);
  }
 }

Nasal's implementation of sweep() ( called reap() ) works such that it is always executed for a handful of different type-specific memory pools. In addition, it also makes sure to allocate new memory if required.

Pool storage

A memory pool is basically a preallocated region of memory, which is dynamically resized. The Nasal GC works such that it manages a handful of global memory pools for all native Nasal types (strings, functions, vectors, hashes etc). At the moment, the hard coded defaults ensure that 25-50% of additional object "slots" (memory blocks) are kept available during each execution of reap() [1].

Whenever new memory is requested to create a new type (such as a vector or a string), the available memory in the corresponding pool will be checked, reachable objects will be marked, and dead objects will be removed from all pools using a mark/sweep collector, new memory blocks will be allocated if necessary. All of this happens atomically, i.e. single-threaded, in a "stop-the-world" fashion.

Memory blocks

Nasal Block structure


A Nasal memory pool consists of memory blocks.

A memory block is a singly linked list: flightgear/simgear/next/simgear/nasal/gc.c#l10.

Callgraph for newBlock()

Each block contains a field to store its size, a pointer to the allocated memory, and another pointer to the next block.

struct Block {
    int   size; // size of the block 
    char* block; // pointer to the memory region
    struct Block* next;	// pointer to the next block
};

New memory blocks are allocated using newBlock() and the memory is initialized with 0: flightgear/simgear/next/simgear/nasal/gc.c#l167

static void newBlock(struct naPool* p, int need)	
{
    int i;
    struct Block* newb;
    if(need < MIN_BLOCK_SIZE) need = MIN_BLOCK_SIZE;
    // allocate a new Block
    newb = naAlloc(sizeof(struct Block));
    // initialize the Block
    newb->block = naAlloc(need * p->elemsz); // number of elements * size of element
    newb->size = need; // set block size 
    // memory blocks are circular linked lists: 
    newb->next = p->blocks;
    p->blocks = newb;
    naBZero(newb->block, need * p->elemsz);

    if(need > p->freesz - p->freetop) need = p->freesz - p->freetop;
    p->nfree = 0;
    p->free = p->free0 + p->freetop;
    // mark all new blocks as unreachable and add them to the pool's free list
    for(i=0; i < need; i++) {
        // initialize each new new memory blocks by setting up an naRef (the container for all Nasal references) 
        struct naObj* o = (struct naObj*)(newb->block + i*p->elemsz);
        o->mark = 0;
        p->free[p->nfree++] = o;	
    }
    p->freetop += need;	
}

Nasal memory pools

For each of the 7 Nasal data types, there is a separate storage pool available, declared as part of the "Globals" structure.

As can be seen, each storage pool is addressed by its enum index (0..6): flightgear/simgear/next/simgear/nasal/data.h#l65

enum { T_STR, T_VEC, T_HASH, T_CODE, T_FUNC, T_CCODE, T_GHOST, NUM_NASAL_TYPES }; // V. important that this come last!

For example, globals->pools[T_VEC] refers to the storage pools for vectors:

globals->pools[T_VEC];

The 7 storage pools are declared in code.h as part of the "Globals" structure: flightgear/simgear/next/simgear/nasal/code.h#l40

The relevant part is:

struct Globals {	
    // Garbage collecting allocators:
    struct naPool pools[NUM_NASAL_TYPES];
    int allocCount;
    // Dead blocks waiting to be freed when it is safe
    void** deadBlocks;
    int deadsz;
    int ndead;
//...
};

So, we have these pools:

  • globals->pools[T_STR]
  • globals->pools[T_VEC]
  • globals->pools[T_HASH]
  • globals->pools[T_CODE]
  • globals->pools[T_FUNC]
  • globals->pools[T_CCODE]
  • globals->pools[T_GHOST]

The Globals structure

Nasal Globals structure

The complete Globals structure looks like this:

struct Globals {	
    // Garbage collecting allocators:
    struct naPool pools[NUM_NASAL_TYPES];
    int allocCount;
    // Dead blocks waiting to be freed when it is safe
    void** deadBlocks;
    int deadsz;
    int ndead;

    // Threading stuff
	
    int nThreads;
	
    int waitCount;
	
    int needGC;
	
    int bottleneck;
	
    void* sem;
	
    void* lock;
	
	
    // Constants
	
    naRef meRef;
	
    naRef argRef;

    naRef parentsRef;

    // A hash of symbol names
    naRef symbols;
    naRef save;
    struct Context* freeContexts;
    struct Context* allContexts;	
};

The naPool structure

Nasal naPool structure

The structure keeping all pool-related information is named "naPool", the layout of the naPool structure can be seen in data.h: flightgear/simgear/next/simgear/nasal/data.h#l181

struct naPool {
    int           type; // one of T_STR, T_VEC, T_HASH etc
    int           elemsz; // size of each element - determined via naTypeSize(T_*)
    struct Block* blocks;
    void**   free0; // pointer to the alloced buffer
    int     freesz; // size of the alloced buffer
    void**    free; // current "free frame"
    int      nfree; // down-counting index within the free frame
    int    freetop; // curr. top of the free list
};

Each memory pool is initialized using naGC_init(): flightgear/simgear/next/simgear/nasal/gc.c#l192

void naGC_init(struct naPool* p, int type)
{
    p->type = type;
    p->elemsz = naTypeSize(type);
    p->blocks = 0;
    p->free0 = p->free = 0;
    p->nfree = p->freesz = p->freetop = 0;
    reap(p);	
}

As can be seen, "elemsz" just stores the size of each type-specific structure, so that it can be used for new allocations.

naGC_init is called in initGlobals() in code.c for each Nasal type (T_STR, T_VEC etc): flightgear/simgear/next/simgear/nasal/code.c#l172

for(i=0; i<NUM_NASAL_TYPES; i++)
        naGC_init(&(globals->pools[i]), i);


The size of a memory pool is determined using poolsize(): flightgear/simgear/next/simgear/nasal/gc.c#l203

static int poolsize(struct naPool* p)
{
    int total = 0;
    struct Block* b = p->blocks;
    while(b) { total += b->size; b = b->next; }
    return total;	
}

The function poolsize() just iterates over all memory blocks in the pool, and computes the total size of the pool by adding up the "size" field of all blocks in the linked list.

Allocating Nasal types

All Nasal types (scalars, vectors, hashes, funcs, ghosts) are allocated via wrappers in misc.c.

These wrappers are:

All of these wrappers are really just helpers on top of naNew(), they just set up type-specific information and initialize each Nasal type properly. For instance, first naNew() is called for the given context, along with the type info (to set up the size properly)- and then each type is initialized:

// The naNew* helpers are just wrappers to set up type-specific structures, after allocating the memory using naNew()
naRef naNewString(struct Context* c)	
{
    naRef s = naNew(c, T_STR);
    PTR(s).str->emblen = 0;
    PTR(s).str->data.ref.len = 0;
    PTR(s).str->data.ref.ptr = 0;
    PTR(s).str->hashcode = 0;
    return s;
}

naNew()

As can be seen, the key allocator function is naNew(). All memory allocations done via the naNew*() helpers will first of all set up memory using naNew().

naNew() is really just a fancy malloc() replacement, i.e. it's just getting passed the Nasal execution context, and the nasal type (which is mapped to the size of the requested memory region using naTypeSize()).

Cquote1.png New nasal objects are added to a temporary bin when they are created, because further allocation might cause a garbage collection to happen before the code that created the object can store a reference to it where the garbage collector can find it. For performance and simplicity, this list is stored per-context. When the context next executes code, it clears this list. Here's the problem: we do a lot of our naNewXX() calls in FlightGear using the old "global context" object that is created at startup. But this context is no longer used to execute scripts* at runtime, so as Csaba discovered, it's temporaries are never flushed. That essentially causes a resource leak: those allocations (mostly listener nodes) will never be freed. And all the extra "reachable" Nasal data floating around causes the garbage collector to take longer and longer to do a full collection as time goes on, causing "stutter". And scripts that use listeners extensively (the cmdarg() they use was one of the affected allocations) will see the problem more seriously.
Cquote2.png
Cquote1.png Once listeners were added, scripts could be recursive: (script A sets property X which causes listener L to fire and cause script B to run) We need two or more contexts on the stack to handle that; a single global one won't work. I didn't like the fix though. Exposing the temporary bin as part of the Nasal public API is ugly; it's an internal design feature, not something users should tune. Instead, I just hacked at the FlightGear code to reinitialie this context every frame, thus cleaning it up. A "proper" fix would be to remove the global context entirely, but that would touch a bunch of code.
Cquote2.png

The naNew() function is also implemented in misc.c, see flightgear/simgear/next/simgear/nasal/misc.c#l66:

naRef naNew(struct Context* c, int type)	
{
    naRef result;
    if(c->nfree[type] == 0)
        c->free[type] = naGC_get(&globals->pools[type],
                                 OBJ_CACHE_SZ, &c->nfree[type]);
    result = naObj(type, c->free[type][--c->nfree[type]]);
    naTempSave(c, result);
    return result;
}

First, the conditional checks if there's any free memory in the Nasal storage pools for the requested Nasal data type, if there isn't any free memory left, new memory is assigned to the contexts->free[type] pool via naGC_get().

Note: naGC_get() is the only function which really manages all memory pools and which triggers the GC. Allocations via naNew()* will end up calling naNew() and naNew() ends up calling naGC_get() always.

For the time being, OBJ_CACHE_SZ can be ignored, it's defined in code.h: flightgear/simgear/next/simgear/nasal/code.h#l12

// Number of objects (per pool per thread) asked for using naGC_get().
// The idea is that contexts can "cache" allocations to prevent thread
// contention on the global pools.  But in practice this interacts
// very badly with small subcontext calls, which grab huge numbers of
// cached objects and don't use them, causing far more collections
// than necessary.  Just leave it at 1 pending a rework of the
// collector synchronization.
#define OBJ_CACHE_SZ 1

The final argument to the naGC_get() function is a pointer to to the context's nfree[type] member which is updated by the function.

Next, a new naRef structure is set up using the allocated naObj memory and the type info:

result = naObj(type, c->free[type][--c->nfree[type]]);


naRef naObj(int type, struct naObj* o)	
{
    naRef r;
    SETPTR(r, o);
    o->type = type;
    return r;	
}

The GC interface

Nasal GC


TODO: flightgear/simgear/next/simgear/nasal/data.h#l204

  • void naGC_init(struct naPool* p, int type);
  • struct naObj** naGC_get(struct naPool* p, int n, int* nout);
  • void naGC_swapfree(void** target, void* val); (used for vector/hash resizing)
  • void naGC_freedead();
  • void naiGCMark(naRef r);
  • void naiGCMarkHash(naRef h);

The pool manager

Callgraph for naGC_get()

All high level type allocators (naNewString, naNewVector, naNewHash etc) make use of the naNew() call introduced earlier.

naNew() doesn't directly allocate new memory from the heap using naAlloc(), but instead uses the naGC_get() helper which manages memory from the global pools (i.e. preallocated memory for each type).

For each native Nasal data type (scalars, vectors, hashes, funcs etc) , there are separate memory pools in globals->pools[n]. The index (n) is one of T_VEC, T_HASH, T_FUNC etc.

The naGC_get() pool "manager" is located in gc.nas: flightgear/simgear/next/simgear/nasal/gc.c#l211

struct naObj** naGC_get(struct naPool* p, int n, int* nout)
{
    struct naObj** result;
    naCheckBottleneck();
    LOCK();
    while(globals->allocCount < 0 || (p->nfree == 0 && p->freetop >= p->freesz)) {
        globals->needGC = 1;
        bottleneck();	
    }
    if(p->nfree == 0)
        newBlock(p, poolsize(p)/8);
    n = p->nfree < n ? p->nfree : n;
    *nout = n;
    p->nfree -= n;
    globals->allocCount -= n;
    result = (struct naObj**)(p->free + p->nfree);
    UNLOCK();
    return result;	
}

As can be seen, the garbage collector itself is triggered by setting the global "needGC" flag to 1 in the globals structure and then calling the bottleneck() function: flightgear/simgear/next/simgear/nasal/gc.c#l105.

The memory region from the global pool is cast back to a pointer (struct naObj**): flightgear/simgear/next/simgear/nasal/gc.c#l226 the "nout" parameter is just there to update c->nfree[type] with the amount of free memory.

In other words, whenever new Nasal types are allocated, new memory from one of the 7 Nasal pools is requested. All of these allocation requests are channeled through naNew() and finally naGC_get(). naGC_get() is the de facto memory manager in Nasal, all memory management is triggered here - new memory block allocations for each pool, as well as garbage collection via the bottleneck() call.

This also means that we basically just need to provide an alternate naGC_get() function, next to the existing one, to add support for alternate GC schemes.

Supporting additional GC implementations

It would probably make sense to change the pool manager interface/signature such that additional implementations can be easily provided and registered, so that merely a GC_CALLBACK pointer must be set during initialization in order to use a different GC scheme:

diff --git a/simgear/nasal/data.h b/simgear/nasal/data.h
index deb4fbd..f3fcaa1 100644
--- a/simgear/nasal/data.h
+++ b/simgear/nasal/data.h
@@ -202,7 +202,7 @@ int naiHash_sym(struct naHash* h, struct naStr* sym, naRef* out);
 void naiHash_newsym(struct naHash* h, naRef* sym, naRef* val);
 
 void naGC_init(struct naPool* p, int type);
-struct naObj** naGC_get(struct naPool* p, int n, int* nout);
+void naGC_alloc(struct Context* c, int type, int n);
 void naGC_swapfree(void** target, void* val);
 void naGC_freedead();
 void naiGCMark(naRef r);
diff --git a/simgear/nasal/gc.c b/simgear/nasal/gc.c
index e9d9b7b..3bae313 100644
--- a/simgear/nasal/gc.c
+++ b/simgear/nasal/gc.c
@@ -191,8 +191,10 @@ static int poolsize(struct naPool* p)
     return total;
 }
 
-struct naObj** naGC_get(struct naPool* p, int n, int* nout)
+void naGC_alloc(struct Context* c, int type, int n)
 {
+    struct naPool* p = &globals->pools[type]; // type-specific memory pool
+    int* nout = &c->nfree[type];		// 
     struct naObj** result;
     naCheckBottleneck();
     LOCK();
@@ -208,7 +210,7 @@ struct naObj** naGC_get(struct naPool* p, int n, int* nout)
     globals->allocCount -= n;
     result = (struct naObj**)(p->free + p->nfree);
     UNLOCK();
-    return result;
+    c->free[type]=result;
 }
 
 static void markvec(naRef r)
diff --git a/simgear/nasal/misc.c b/simgear/nasal/misc.c
index dae0bfe..5caeb76 100644
--- a/simgear/nasal/misc.c
+++ b/simgear/nasal/misc.c
@@ -66,9 +66,7 @@ naRef naStringValue(naContext c, naRef r)
 naRef naNew(struct Context* c, int type)
 {
     naRef result;
-    if(c->nfree[type] == 0)
-        c->free[type] = naGC_get(&globals->pools[type],
-                                 OBJ_CACHE_SZ, &c->nfree[type]);
+    if(! c->nfree[type] ) naGC_alloc(c, type, OBJ_CACHE_SZ);
     result = naObj(type, c->free[type][--c->nfree[type]]);
     naTempSave(c, result);
     return result;

Disabling the current GC

The overhead caused by managing the memory pools and constantly scanning the heap (mark/reap) can be examined by modifying the naGC_get() function such that it no longer uses any pool memory, but directly allocates via malloc()/naAlloc() - without ever freeing any memory (i.e. leaking). While the fgfs process will eventually get killed this way, the GC will be basically switched off.

struct naObj** naGC_get(struct naPool* p, int n, int* nout)
{
    struct naObj** result;
    naCheckBottleneck();
    LOCK();
#define LEAKY 
#ifndef LEAKY
    while(globals->allocCount < 0 || (p->nfree == 0 && p->freetop >= p->freesz)) {
        globals->needGC = 1;
        bottleneck();	
    }
    if(p->nfree == 0)
        newBlock(p, poolsize(p)/8);
    n = p->nfree < n ? p->nfree : n;
    *nout = n;
    p->nfree -= n;
    globals->allocCount -= n;
    result = (struct naObj**)(p->free + p->nfree);
#else
    result = (struct naObj**) naAlloc( naTypeSize(n) );
#endif
    UNLOCK();
    return result;	
}

bottleneck()

Callergraph of naCheckBottleneck()
Callgraph of naCheckBottleneck()

Whenever new memory from one of the global pools is being requested via naGC_get(), naCheckBottleneck() will also be invoked:

flightgear/simgear/next/simgear/nasal/gc.c#l131

void naCheckBottleneck()	
{
    if(globals->bottleneck) { LOCK(); bottleneck(); UNLOCK(); }	
}

This checks if globals->bottleneck is true, and if it is, it makes sure that the bottleneck() function is only invoked after acquiring a lock.

Callgraph of bottleneck()

flightgear/simgear/next/simgear/nasal/gc.c#l100

// Must be called with the main lock.  Engages the "bottleneck", where
// all threads will block so that one (the last one to call this
// function) can run alone.  This is done for GC, and also to free the
// list of "dead" blocks when it gets full (which is part of GC, if
// you think about it).
static void bottleneck()	
{
    struct Globals* g = globals;
    g->bottleneck = 1;
    while(g->bottleneck && g->waitCount < g->nThreads - 1) {
        g->waitCount++;
        UNLOCK(); naSemDown(g->sem); LOCK();
        g->waitCount--;	
    }
    if(g->waitCount >= g->nThreads - 1) {
        freeDead();
        if(g->needGC) garbageCollect();
        if(g->waitCount) naSemUp(g->sem, g->waitCount);
        g->bottleneck = 0;	
    }	
}

The actual garbage collection takes place in line 114 where dead objects from the memory pool are freed, and the garbageCollect() function is called if globals.needGC is true.

garbageCollect()

Callgraph of garbageCollect()

This is the function that does the mark/sweep part using mark() to recursively mark reachable objects, and reap() to collect unreachable objects.

flightgear/simgear/next/simgear/nasal/gc.c#l35

// Must be called with the big lock!
static void garbageCollect()	
{
    int i;
    struct Context* c;
    globals->allocCount = 0;
    c = globals->allContexts;
    while(c) {
        for(i=0; i<NUM_NASAL_TYPES; i++)
            c->nfree[i] = 0;
        for(i=0; i < c->fTop; i++) {
            mark(c->fStack[i].func);
            mark(c->fStack[i].locals);
        }
        for(i=0; i < c->opTop; i++)
            mark(c->opStack[i]);
        mark(c->dieArg);
        marktemps(c);
        c = c->nextAll;	
    }
    mark(globals->save);
    mark(globals->symbols);
    mark(globals->meRef);
    mark(globals->argRef);
    mark(globals->parentsRef);
    // Finally collect all the freed objects
    for(i=0; i<NUM_NASAL_TYPES; i++)
        reap(&(globals->pools[i]));
    // Make enough space for the dead blocks we need to free during
    // execution.  This works out to 1 spot for every 2 live objects,
    // which should be limit the number of bottleneck operations
    // without imposing an undue burden of extra "freeable" memory.
    if(globals->deadsz < globals->allocCount) {
        globals->deadsz = globals->allocCount;
        if(globals->deadsz < 256) globals->deadsz = 256;
        naFree(globals->deadBlocks);
        globals->deadBlocks = naAlloc(sizeof(void*) * globals->deadsz);	
    }
    globals->needGC = 0;	
}

For each invocation of garbageCollect(), the following takes place:

mark()

Callgraph of mark()

All reachable objects are recursively marked using tagged naRef structures flightgear/simgear/next/simgear/nasal/data.h#l50, so that unreachable objects can be collected (i.e. because the reftag bit is not set) and freed.

TODO: flightgear/simgear/next/simgear/nasal/gc.c#l223

reap()

Callgraph for reap()

This is the collector function called by garbageCollect() which also frees elements and allocates new blocks for the specified pool:

// Collects all the unreachable objects into a free list, and
// allocates more space if needed.
static void reap(struct naPool* p)
{
    struct Block* b;
    int elem, freesz, total = poolsize(p);
    freesz = total < MIN_BLOCK_SIZE ? MIN_BLOCK_SIZE : total;
    freesz = (3 * freesz / 2) + (globals->nThreads * OBJ_CACHE_SZ);
    if(p->freesz < freesz) {
        naFree(p->free0);
        p->freesz = freesz;
        p->free = p->free0 = naAlloc(sizeof(void*) * p->freesz);
    }
    p->nfree = 0;
    p->free = p->free0;
    for(b = p->blocks; b; b = b->next)
        for(elem=0; elem < b->size; elem++) {
            struct naObj* o = (struct naObj*)(b->block + elem * p->elemsz);
            if(o->mark == 0)
                freeelem(p, o);
            o->mark = 0;
        }
    p->freetop = p->nfree;
    // allocs of this type until the next collection
    globals->allocCount += total/2;
    // Allocate more if necessary (try to keep 25-50% of the objects
    // available)
    if(p->nfree < total/4) {
        int used = total - p->nfree;
        int avail = total - used;
        int need = used/2 - avail;
        if(need > 0)
            newBlock(p, need);
    }	
}

swapfree()

Callgraph of swapfree()

freeelem()

Callgraph of freeelem()

Frees individual elements and adds freed pointers to the globals free list, implemented in gc.c: flightgear/simgear/next/simgear/nasal/gc.c#l153

static void freeelem(struct naPool* p, struct naObj* o)
{
    // Clean up any intrinsic storage the object might have...
    switch(p->type) {
    case T_STR:   naStr_gcclean  ((struct naStr*)  o); break;
    case T_VEC:   naVec_gcclean  ((struct naVec*)  o); break;
    case T_HASH:  naiGCHashClean ((struct naHash*) o); break;
    case T_CODE:  naCode_gcclean ((struct naCode*) o); break;
    case T_GHOST: naGhost_gcclean((struct naGhost*)o); break;	
    }
    p->free[p->nfree++] = o;  // ...and add it to the free list	
}

Type destructors

Will be called by freelem(), implemented for all native Nasal types:

Contexts

Cquote1.png New nasal objects are added to a temporary bin when they are created, because further allocation might cause a garbage collection to happen before the code that created the object can store a reference to it where the garbage collector can find it. For performance and simplicity, this list is stored per-context. When the context next executes code, it clears this list. Here's the problem: we do a lot of our naNewXX() calls in FlightGear using the old "global context" object that is created at startup. But this context is no longer used to execute scripts* at runtime, so as Csaba discovered, it's temporaries are never flushed. That essentially causes a resource leak: those allocations (mostly listener nodes) will never be freed. And all the extra "reachable" Nasal data floating around causes the garbage collector to take longer and longer to do a full collection as time goes on, causing "stutter". And scripts that use listeners extensively (the cmdarg() they use was one of the affected allocations) will see the problem more seriously. (That's a feature, not a bug. Once listeners were added, scripts could be recursive: (script A sets property X which causes listener L to fire and cause script B to run) We need two or more contexts on the stack to handle that; a single global one won't work.) I didn't like the fix though. Exposing the temporary bin as part of the Nasal public API is ugly; it's an internal design feature, not something users should tune. Instead, I just hacked at the FlightGear code to reinitialize this context every frame, thus cleaning it up. A "proper" fix would be to remove the global context entirely, but that would touch a bunch of code. [2]
— Andy Ross
Cquote2.png
  1. Andy Ross (22 May 2012). Nasal performance.
  2. Andy Ross (Wed, 24 Oct 2007 11:33:36 -0700). [Flightgear-devel] Stutter/Nasal issue resolved (was: FlightGear/Plib periodic stutter notes).

Also see: flightgear/flightgear/next/src/Scripting/NasalSys.cxx (in FGNasalSys::update)

    // The global context is a legacy thing.  We use dynamically
    // created contexts for naCall() now, so that we can call them
    // recursively.  But there are still spots that want to use it for
    // naNew*() calls, which end up leaking memory because the context
    // only clears out its temporary vector when it's *used*.  So just
    // junk it and fetch a new/reinitialized one every frame.  This is
    // clumsy: the right solution would use the dynamic context in all
    // cases and eliminate _context entirely.  But that's more work,
    // and this works fine (yes, they say "New" and "Free", but
    // they're very fast, just trust me). -Andy

Consider using coccinelle here to rewrite naContext handling in all places at once


Nasal Context structure

flightgear/simgear/next/simgear/nasal/code.h#l76

struct Context {
	
    // Stack(s)
	
    struct Frame fStack[MAX_RECURSION];
	
    int fTop;
	
    naRef opStack[MAX_STACK_DEPTH];
	
    int opFrame; // like Frame::bp, but for C functions
	
    int opTop;
	
    int markStack[MAX_MARK_DEPTH];
	
    int markTop;
	
	
    // Free object lists, cached from the global GC
	
    struct naObj** free[NUM_NASAL_TYPES];
	
    int nfree[NUM_NASAL_TYPES];
	
	
    // GC-findable reference point for objects that may live on the
	
    // processor ("real") stack during execution.  naNew() places them
	
    // here, and clears the array each instruction
	
    struct naObj** temps;
	
    int ntemps;
	
    int tempsz;
	
	
    // Error handling
	
    jmp_buf jumpHandle;
	
    char error[128];
	
    naRef dieArg;
	
	
    // Sub-call lists
	
    struct Context* callParent;
	
    struct Context* callChild;
	
	
    // Linked list pointers in globals
	
    struct Context* nextFree;
	
    struct Context* nextAll;
	
	
    void* userData;
	
};

Stack frames

flightgear/simgear/next/simgear/nasal/code.h#l33

Tagged structures

flightgear/simgear/next/simgear/nasal/data.h#l84

Core allocations

All core memory allocation takes place via wrappers in misc.c:

void naFree(void* m) { free(m); }	
void* naAlloc(int n) { return malloc(n); }
void* naRealloc(void* b, int n) { return realloc(b, n); }
void naBZero(void* m, int n) { memset(m, 0, n); }