How the Nasal GC works

From FlightGear wiki
Jump to navigation Jump to search

This page explains how the Nasal GC works.

High Level Architecture

1rightarrow.png See High-Level Architecture for the main article about this subject.

Cquote1.png Regarding Nasal as a scripting language and AW, I'm hopeful that the work I'm doing on HLA will allow us to run Nasal in a separate thread from the FDM and display, so Nasal GC no-longer can impact frame-rates. It would also allow for writing a weather simulation completely external to the FlightGear instance, which could be quite neat.
— stuart (Nov 24th, 2015). Re: the real cost of the Nasal Garbage Collector.
(powered by Instant-Cquotes)
Cquote2.png
Cquote1.png I do like the idea of a stand-alone script interpreter communicating via HLA or some similar IPC, but we still have to consider fragments of script code embedded in models to create or enhance fancy animations. Not all script fragments make sense to run externally.
Cquote2.png
Cquote1.png if scripting is a major performance hog, off-load it to a different thread. I don't think we have many situation where it is, but that's what HLA is going to do.
Cquote2.png
Cquote1.png haven't made too many tests with Nasal active as anyone can do that (adjust paths to have no terrain if necessary). But most of this becomes less relevant with the HLA changes that Stuart's working on.
— Richard (Nov 24th, 2015). Re: the real cost of the Nasal Garbage Collector.
(powered by Instant-Cquotes)
Cquote2.png

Mark/Sweep

When using a mark-and-sweep collector, unreachable objects are not immediately reclaimed. Instead, garbage is allowed to accumulate until all available memory has been exhausted. When that happens, the execution of the program is suspended temporarily while the mark-and-sweep algorithm collects all the garbage. Once all unreferenced objects have been reclaimed, the normal execution of the program can resume.

Obviously, the main disadvantage of the mark-and-sweep approach is the fact that that normal program execution is suspended while the garbage collection algorithm runs. In particular, this can be a problem in a program that must satisfy real-time execution constraints (like a flight simulator). For example, an interactive application that uses mark-and-sweep garbage collection becomes unresponsive periodically.

The mark-and-sweep algorithm is called a tracing garbage collector because is traces out the entire collection of objects that are directly or indirectly accessible by the program. The objects that a program can access directly are those objects which are referenced by local variables on the processor stack as well as by any static variables that refer to objects. In the context of garbage collection, these variables are called the roots . An object is indirectly accessible if it is referenced by a field in some other (directly or indirectly) accessible object. An accessible object is said to be live . Conversely, an object which is not live is garbage.

The mark-and-sweep algorithm consists of two phases: In the first phase, it finds and marks all accessible objects. The first phase is called the mark phase. In the second phase, the garbage collection algorithm scans through the heap and reclaims all the unmarked objects. The second phase is called the sweep phase.

The mark/sweep algorithm can also be easily expressed using pseudo code, i.e. in Nasal itself:

 
 var garbageCollect = func (roots) {
 foreach(var root; roots) {
   mark(root);
  }
  sweep();
 }

The mark() function would just recursively check (i.e. calling itself) if the "marked" flag is set or not, and make sure to set the mark bit for all referenced objects:

 
 var mark = func (obj) {
  if (!obj.marked) {
   obj.marked=1;
   foreach( var referenced; referenced_by(obj) )
     mark(referenced);
  }
 }

Finally, the sweep() function (reap() in the Nasal GC), will free all unreachable objects and reclaim the memory:

 
 var sweep = func {
  foreach(var obj; allocated_objects) {
   if (obj.marked) obj.marked=0;
   else reclaim(obj);
  }
 }

Nasal's implementation of sweep() ( called reap() ) works such that it is always executed for a handful of different type-specific memory pools. In addition, it also makes sure to allocate new memory if required.

Pool storage

A memory pool is basically a preallocated region of memory, which is dynamically resized. The Nasal GC works such that it manages a handful of global memory pools for all native Nasal types (strings, functions, vectors, hashes etc). At the moment, the hard coded defaults ensure that 25-50% of additional object "slots" (memory blocks) are kept available during each execution of reap() [1].

Whenever new memory is requested to create a new type (such as a vector or a string), the available memory in the corresponding pool will be checked, reachable objects will be marked, and dead objects will be removed from all pools using a mark/sweep collector, new memory blocks will be allocated if necessary. All of this happens atomically, i.e. single-threaded, in a "stop-the-world" fashion.

Memory blocks

Nasal Block structure

A Nasal memory pool consists of memory blocks.

A memory block is a singly linked list: [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L10.

Callgraph for newBlock()

Each block contains a field to store its size, a pointer to the allocated memory, and another pointer to the next block.

struct Block {
    int   size; // size of the block 
    char* block; // pointer to the memory region
    struct Block* next;	// pointer to the next block
};

New memory blocks are allocated using newBlock() and the memory is initialized with 0: [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L167

static void newBlock(struct naPool* p, int need)	
{
    int i;
    struct Block* newb;
    if(need < MIN_BLOCK_SIZE) need = MIN_BLOCK_SIZE;
    // allocate a new Block
    newb = naAlloc(sizeof(struct Block));
    // initialize the Block
    newb->block = naAlloc(need * p->elemsz); // number of elements * size of element
    newb->size = need; // set block size 
    // memory blocks are circular linked lists: 
    newb->next = p->blocks;
    p->blocks = newb;
    naBZero(newb->block, need * p->elemsz);

    if(need > p->freesz - p->freetop) need = p->freesz - p->freetop;
    p->nfree = 0;
    p->free = p->free0 + p->freetop;
    // mark all new blocks as unreachable and add them to the pool's free list
    for(i=0; i < need; i++) {
        // initialize each new new memory blocks by setting up an naRef (the container for all Nasal references) 
        struct naObj* o = (struct naObj*)(newb->block + i*p->elemsz);
        o->mark = 0;
        p->free[p->nfree++] = o;	
    }
    p->freetop += need;	
}

Nasal memory pools

For each of the 7 Nasal data types, there is a separate storage pool available, declared as part of the "Globals" structure.

As can be seen, each storage pool is addressed by its enum index (0..6): [GitLab]/flightgear/simgear/next/simgear/nasal/data.h#L65

enum { T_STR, T_VEC, T_HASH, T_CODE, T_FUNC, T_CCODE, T_GHOST, NUM_NASAL_TYPES }; // V. important that this come last!

For example, globals->pools[T_VEC] refers to the storage pools for vectors:

globals->pools[T_VEC];

The 7 storage pools are declared in code.h as part of the "Globals" structure: [GitLab]/flightgear/simgear/next/simgear/nasal/code.h#L40

The relevant part is:

struct Globals {	
    // Garbage collecting allocators:
    struct naPool pools[NUM_NASAL_TYPES];
    int allocCount;
    // Dead blocks waiting to be freed when it is safe
    void** deadBlocks;
    int deadsz;
    int ndead;
//...
};

So, we have these pools:

  • globals->pools[T_STR]
  • globals->pools[T_VEC]
  • globals->pools[T_HASH]
  • globals->pools[T_CODE]
  • globals->pools[T_FUNC]
  • globals->pools[T_CCODE]
  • globals->pools[T_GHOST]

The Globals structure

Nasal Globals structure

The complete Globals structure looks like this:

struct Globals {	
    // Garbage collecting allocators:
    struct naPool pools[NUM_NASAL_TYPES];
    int allocCount;
    // Dead blocks waiting to be freed when it is safe
    void** deadBlocks;
    int deadsz;
    int ndead;

    // Threading stuff
	
    int nThreads;
	
    int waitCount;
	
    int needGC;
	
    int bottleneck;
	
    void* sem;
	
    void* lock;
	
	
    // Constants
	
    naRef meRef;
	
    naRef argRef;

    naRef parentsRef;

    // A hash of symbol names
    naRef symbols;
    naRef save;
    struct Context* freeContexts;
    struct Context* allContexts;	
};

The naPool structure

Nasal naPool structure

The structure keeping all pool-related information is named "naPool", the layout of the naPool structure can be seen in data.h: [GitLab]/flightgear/simgear/next/simgear/nasal/data.h#L181

struct naPool {
    int           type; // one of T_STR, T_VEC, T_HASH etc
    int           elemsz; // size of each element - determined via naTypeSize(T_*)
    struct Block* blocks;
    void**   free0; // pointer to the alloced buffer
    int     freesz; // size of the alloced buffer
    void**    free; // current "free frame"
    int      nfree; // down-counting index within the free frame
    int    freetop; // curr. top of the free list
};

Each memory pool is initialized using naGC_init(): [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L192

void naGC_init(struct naPool* p, int type)
{
    p->type = type;
    p->elemsz = naTypeSize(type);
    p->blocks = 0;
    p->free0 = p->free = 0;
    p->nfree = p->freesz = p->freetop = 0;
    reap(p);	
}

As can be seen, "elemsz" just stores the size of each type-specific structure, so that it can be used for new allocations.

naGC_init is called in initGlobals() in code.c for each Nasal type (T_STR, T_VEC etc): [GitLab]/flightgear/simgear/next/simgear/nasal/code.c#L172

for(i=0; i<NUM_NASAL_TYPES; i++)
        naGC_init(&(globals->pools[i]), i);


The size of a memory pool is determined using poolsize(): [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L203

static int poolsize(struct naPool* p)
{
    int total = 0;
    struct Block* b = p->blocks;
    while(b) { total += b->size; b = b->next; }
    return total;	
}

The function poolsize() just iterates over all memory blocks in the pool, and computes the total size of the pool by adding up the "size" field of all blocks in the linked list.

Allocating Nasal types

All Nasal types (scalars, vectors, hashes, funcs, ghosts) are allocated via wrappers in misc.c.

These wrappers are:

All of these wrappers are really just helpers on top of naNew(), they just set up type-specific information and initialize each Nasal type properly. For instance, first naNew() is called for the given context, along with the type info (to set up the size properly)- and then each type is initialized:

// The naNew* helpers are just wrappers to set up type-specific structures, after allocating the memory using naNew()
naRef naNewString(struct Context* c)	
{
    naRef s = naNew(c, T_STR);
    PTR(s).str->emblen = 0;
    PTR(s).str->data.ref.len = 0;
    PTR(s).str->data.ref.ptr = 0;
    PTR(s).str->hashcode = 0;
    return s;
}

naNew()

As can be seen, the key allocator function is naNew(). All memory allocations done via the naNew*() helpers will first of all set up memory using naNew().

naNew() is really just a fancy malloc() replacement, i.e. it's just getting passed the Nasal execution context, and the nasal type (which is mapped to the size of the requested memory region using naTypeSize()).

Cquote1.png New nasal objects are added to a temporary bin when they are created, because further allocation might cause a garbage collection to happen before the code that created the object can store a reference to it where the garbage collector can find it. For performance and simplicity, this list is stored per-context. When the context next executes code, it clears this list. Here's the problem: we do a lot of our naNewXX() calls in FlightGear using the old "global context" object that is created at startup. But this context is no longer used to execute scripts* at runtime, so as Csaba discovered, it's temporaries are never flushed. That essentially causes a resource leak: those allocations (mostly listener nodes) will never be freed. And all the extra "reachable" Nasal data floating around causes the garbage collector to take longer and longer to do a full collection as time goes on, causing "stutter". And scripts that use listeners extensively (the cmdarg() they use was one of the affected allocations) will see the problem more seriously.
Cquote2.png
Cquote1.png Once listeners were added, scripts could be recursive: (script A sets property X which causes listener L to fire and cause script B to run) We need two or more contexts on the stack to handle that; a single global one won't work. I didn't like the fix though. Exposing the temporary bin as part of the Nasal public API is ugly; it's an internal design feature, not something users should tune. Instead, I just hacked at the FlightGear code to reinitialie this context every frame, thus cleaning it up. A "proper" fix would be to remove the global context entirely, but that would touch a bunch of code.
Cquote2.png

The naNew() function is also implemented in misc.c, see [GitLab]/flightgear/simgear/next/simgear/nasal/misc.c#L66:

naRef naNew(struct Context* c, int type)	
{
    naRef result;
    if(c->nfree[type] == 0)
        c->free[type] = naGC_get(&globals->pools[type],
                                 OBJ_CACHE_SZ, &c->nfree[type]);
    result = naObj(type, c->free[type][--c->nfree[type]]);
    naTempSave(c, result);
    return result;
}

First, the conditional checks if there's any free memory in the Nasal storage pools for the requested Nasal data type, if there isn't any free memory left, new memory is assigned to the contexts->free[type] pool via naGC_get().

Note: naGC_get() is the only function which really manages all memory pools and which triggers the GC. Allocations via naNew()* will end up calling naNew() and naNew() ends up calling naGC_get() always.

For the time being, OBJ_CACHE_SZ can be ignored, it's defined in code.h: [GitLab]/flightgear/simgear/next/simgear/nasal/code.h#L12

// Number of objects (per pool per thread) asked for using naGC_get().
// The idea is that contexts can "cache" allocations to prevent thread
// contention on the global pools.  But in practice this interacts
// very badly with small subcontext calls, which grab huge numbers of
// cached objects and don't use them, causing far more collections
// than necessary.  Just leave it at 1 pending a rework of the
// collector synchronization.
#define OBJ_CACHE_SZ 1

The final argument to the naGC_get() function is a pointer to to the context's nfree[type] member which is updated by the function.

Next, a new naRef structure is set up using the allocated naObj memory and the type info:

result = naObj(type, c->free[type][--c->nfree[type]]);


naRef naObj(int type, struct naObj* o)	
{
    naRef r;
    SETPTR(r, o);
    o->type = type;
    return r;	
}

The GC interface

Nasal GC

TODO: [GitLab]/flightgear/simgear/next/simgear/nasal/data.h#L204

  • void naGC_init(struct naPool* p, int type);
  • struct naObj** naGC_get(struct naPool* p, int n, int* nout);
  • void naGC_swapfree(void** target, void* val); (used for vector/hash resizing)
  • void naGC_freedead();
  • void naiGCMark(naRef r);
  • void naiGCMarkHash(naRef h);

The pool manager

Callgraph for naGC_get()

All high level type allocators (naNewString, naNewVector, naNewHash etc) make use of the naNew() call introduced earlier.

naNew() doesn't directly allocate new memory from the heap using naAlloc(), but instead uses the naGC_get() helper which manages memory from the global pools (i.e. preallocated memory for each type).

For each native Nasal data type (scalars, vectors, hashes, funcs etc) , there are separate memory pools in globals->pools[n]. The index (n) is one of T_VEC, T_HASH, T_FUNC etc.

The naGC_get() pool "manager" is located in gc.nas: [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L211

struct naObj** naGC_get(struct naPool* p, int n, int* nout)
{
    struct naObj** result;
    naCheckBottleneck();
    LOCK();
    while(globals->allocCount < 0 || (p->nfree == 0 && p->freetop >= p->freesz)) {
        globals->needGC = 1;
        bottleneck();	
    }
    if(p->nfree == 0)
        newBlock(p, poolsize(p)/8);
    n = p->nfree < n ? p->nfree : n;
    *nout = n;
    p->nfree -= n;
    globals->allocCount -= n;
    result = (struct naObj**)(p->free + p->nfree);
    UNLOCK();
    return result;	
}

As can be seen, the garbage collector itself is triggered by setting the global "needGC" flag to 1 in the globals structure and then calling the bottleneck() function: [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L105.

The memory region from the global pool is cast back to a pointer (struct naObj**): [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L226 the "nout" parameter is just there to update c->nfree[type] with the amount of free memory.

In other words, whenever new Nasal types are allocated, new memory from one of the 7 Nasal pools is requested. All of these allocation requests are channeled through naNew() and finally naGC_get(). naGC_get() is the de facto memory manager in Nasal, all memory management is triggered here - new memory block allocations for each pool, as well as garbage collection via the bottleneck() call.

This also means that we basically just need to provide an alternate naGC_get() function, next to the existing one, to add support for alternate GC schemes.

Supporting additional GC implementations

It would probably make sense to change the pool manager interface/signature such that additional implementations can be easily provided and registered, so that merely a GC_CALLBACK pointer must be set during initialization in order to use a different GC scheme:

diff --git a/simgear/nasal/data.h b/simgear/nasal/data.h
index deb4fbd..f3fcaa1 100644
--- a/simgear/nasal/data.h
+++ b/simgear/nasal/data.h
@@ -202,7 +202,7 @@ int naiHash_sym(struct naHash* h, struct naStr* sym, naRef* out);
 void naiHash_newsym(struct naHash* h, naRef* sym, naRef* val);
 
 void naGC_init(struct naPool* p, int type);
-struct naObj** naGC_get(struct naPool* p, int n, int* nout);
+void naGC_alloc(struct Context* c, int type, int n);
 void naGC_swapfree(void** target, void* val);
 void naGC_freedead();
 void naiGCMark(naRef r);
diff --git a/simgear/nasal/gc.c b/simgear/nasal/gc.c
index e9d9b7b..3bae313 100644
--- a/simgear/nasal/gc.c
+++ b/simgear/nasal/gc.c
@@ -191,8 +191,10 @@ static int poolsize(struct naPool* p)
     return total;
 }
 
-struct naObj** naGC_get(struct naPool* p, int n, int* nout)
+void naGC_alloc(struct Context* c, int type, int n)
 {
+    struct naPool* p = &globals->pools[type]; // type-specific memory pool
+    int* nout = &c->nfree[type];		// 
     struct naObj** result;
     naCheckBottleneck();
     LOCK();
@@ -208,7 +210,7 @@ struct naObj** naGC_get(struct naPool* p, int n, int* nout)
     globals->allocCount -= n;
     result = (struct naObj**)(p->free + p->nfree);
     UNLOCK();
-    return result;
+    c->free[type]=result;
 }
 
 static void markvec(naRef r)
diff --git a/simgear/nasal/misc.c b/simgear/nasal/misc.c
index dae0bfe..5caeb76 100644
--- a/simgear/nasal/misc.c
+++ b/simgear/nasal/misc.c
@@ -66,9 +66,7 @@ naRef naStringValue(naContext c, naRef r)
 naRef naNew(struct Context* c, int type)
 {
     naRef result;
-    if(c->nfree[type] == 0)
-        c->free[type] = naGC_get(&globals->pools[type],
-                                 OBJ_CACHE_SZ, &c->nfree[type]);
+    if(! c->nfree[type] ) naGC_alloc(c, type, OBJ_CACHE_SZ);
     result = naObj(type, c->free[type][--c->nfree[type]]);
     naTempSave(c, result);
     return result;

Disabling the current GC

The overhead caused by managing the memory pools and constantly scanning the heap (mark/reap) can be examined by modifying the naGC_get() function such that it no longer uses any pool memory, but directly allocates via malloc()/naAlloc() - without ever freeing any memory (i.e. leaking). While the fgfs process will eventually get killed this way, the GC will be basically switched off.

struct naObj** naGC_get(struct naPool* p, int n, int* nout)
{
    struct naObj** result;
    naCheckBottleneck();
    LOCK();
#define LEAKY 
#ifndef LEAKY
    while(globals->allocCount < 0 || (p->nfree == 0 && p->freetop >= p->freesz)) {
        globals->needGC = 1;
        bottleneck();	
    }
    if(p->nfree == 0)
        newBlock(p, poolsize(p)/8);
    n = p->nfree < n ? p->nfree : n;
    *nout = n;
    p->nfree -= n;
    globals->allocCount -= n;
    result = (struct naObj**)(p->free + p->nfree);
#else
    result = (struct naObj**) naAlloc( naTypeSize(n) );
#endif
    UNLOCK();
    return result;	
}

bottleneck()

Callergraph of naCheckBottleneck()
Callgraph of naCheckBottleneck()

Whenever new memory from one of the global pools is being requested via naGC_get(), naCheckBottleneck() will also be invoked:

[GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L131

void naCheckBottleneck()	
{
    if(globals->bottleneck) { LOCK(); bottleneck(); UNLOCK(); }	
}

This checks if globals->bottleneck is true, and if it is, it makes sure that the bottleneck() function is only invoked after acquiring a lock.

Callgraph of bottleneck()

[GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L100

// Must be called with the main lock.  Engages the "bottleneck", where
// all threads will block so that one (the last one to call this
// function) can run alone.  This is done for GC, and also to free the
// list of "dead" blocks when it gets full (which is part of GC, if
// you think about it).
static void bottleneck()	
{
    struct Globals* g = globals;
    g->bottleneck = 1;
    while(g->bottleneck && g->waitCount < g->nThreads - 1) {
        g->waitCount++;
        UNLOCK(); naSemDown(g->sem); LOCK();
        g->waitCount--;	
    }
    if(g->waitCount >= g->nThreads - 1) {
        freeDead();
        if(g->needGC) garbageCollect();
        if(g->waitCount) naSemUp(g->sem, g->waitCount);
        g->bottleneck = 0;	
    }	
}

The actual garbage collection takes place in line 114 where dead objects from the memory pool are freed, and the garbageCollect() function is called if globals.needGC is true.

garbageCollect()

Callgraph of garbageCollect()

This is the function that does the mark/sweep part using mark() to recursively mark reachable objects, and reap() to collect unreachable objects.

[GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L35

// Must be called with the big lock!
static void garbageCollect()	
{
    int i;
    struct Context* c;
    globals->allocCount = 0;
    c = globals->allContexts;
    while(c) {
        for(i=0; i<NUM_NASAL_TYPES; i++)
            c->nfree[i] = 0;
        for(i=0; i < c->fTop; i++) {
            mark(c->fStack[i].func);
            mark(c->fStack[i].locals);
        }
        for(i=0; i < c->opTop; i++)
            mark(c->opStack[i]);
        mark(c->dieArg);
        marktemps(c);
        c = c->nextAll;	
    }
    mark(globals->save);
    mark(globals->symbols);
    mark(globals->meRef);
    mark(globals->argRef);
    mark(globals->parentsRef);
    // Finally collect all the freed objects
    for(i=0; i<NUM_NASAL_TYPES; i++)
        reap(&(globals->pools[i]));
    // Make enough space for the dead blocks we need to free during
    // execution.  This works out to 1 spot for every 2 live objects,
    // which should be limit the number of bottleneck operations
    // without imposing an undue burden of extra "freeable" memory.
    if(globals->deadsz < globals->allocCount) {
        globals->deadsz = globals->allocCount;
        if(globals->deadsz < 256) globals->deadsz = 256;
        naFree(globals->deadBlocks);
        globals->deadBlocks = naAlloc(sizeof(void*) * globals->deadsz);	
    }
    globals->needGC = 0;	
}

For each invocation of garbageCollect(), the following takes place:

mark()

Callgraph of mark()

All reachable objects are recursively marked using tagged naRef structures [GitLab]/flightgear/simgear/next/simgear/nasal/data.h#L50, so that unreachable objects can be collected (i.e. because the reftag bit is not set) and freed.

TODO: [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L223

reap()

Callgraph for reap()

This is the collector function called by garbageCollect() which also frees elements and allocates new blocks for the specified pool:

// Collects all the unreachable objects into a free list, and
// allocates more space if needed.
static void reap(struct naPool* p)
{
    struct Block* b;
    int elem, freesz, total = poolsize(p);
    freesz = total < MIN_BLOCK_SIZE ? MIN_BLOCK_SIZE : total;
    freesz = (3 * freesz / 2) + (globals->nThreads * OBJ_CACHE_SZ);
    if(p->freesz < freesz) {
        naFree(p->free0);
        p->freesz = freesz;
        p->free = p->free0 = naAlloc(sizeof(void*) * p->freesz);
    }
    p->nfree = 0;
    p->free = p->free0;
    for(b = p->blocks; b; b = b->next)
        for(elem=0; elem < b->size; elem++) {
            struct naObj* o = (struct naObj*)(b->block + elem * p->elemsz);
            if(o->mark == 0)
                freeelem(p, o);
            o->mark = 0;
        }
    p->freetop = p->nfree;
    // allocs of this type until the next collection
    globals->allocCount += total/2;
    // Allocate more if necessary (try to keep 25-50% of the objects
    // available)
    if(p->nfree < total/4) {
        int used = total - p->nfree;
        int avail = total - used;
        int need = used/2 - avail;
        if(need > 0)
            newBlock(p, need);
    }	
}

swapfree()

Callgraph of swapfree()

freeelem()

Callgraph of freeelem()

Frees individual elements and adds freed pointers to the globals free list, implemented in gc.c: [GitLab]/flightgear/simgear/next/simgear/nasal/gc.c#L153

static void freeelem(struct naPool* p, struct naObj* o)
{
    // Clean up any intrinsic storage the object might have...
    switch(p->type) {
    case T_STR:   naStr_gcclean  ((struct naStr*)  o); break;
    case T_VEC:   naVec_gcclean  ((struct naVec*)  o); break;
    case T_HASH:  naiGCHashClean ((struct naHash*) o); break;
    case T_CODE:  naCode_gcclean ((struct naCode*) o); break;
    case T_GHOST: naGhost_gcclean((struct naGhost*)o); break;	
    }
    p->free[p->nfree++] = o;  // ...and add it to the free list	
}

Type destructors

Will be called by freelem(), implemented for all native Nasal types:

Contexts

Cquote1.png New nasal objects are added to a temporary bin when they are created, because further allocation might cause a garbage collection to happen before the code that created the object can store a reference to it where the garbage collector can find it. For performance and simplicity, this list is stored per-context. When the context next executes code, it clears this list. Here's the problem: we do a lot of our naNewXX() calls in FlightGear using the old "global context" object that is created at startup. But this context is no longer used to execute scripts* at runtime, so as Csaba discovered, it's temporaries are never flushed. That essentially causes a resource leak: those allocations (mostly listener nodes) will never be freed. And all the extra "reachable" Nasal data floating around causes the garbage collector to take longer and longer to do a full collection as time goes on, causing "stutter". And scripts that use listeners extensively (the cmdarg() they use was one of the affected allocations) will see the problem more seriously. (That's a feature, not a bug. Once listeners were added, scripts could be recursive: (script A sets property X which causes listener L to fire and cause script B to run) We need two or more contexts on the stack to handle that; a single global one won't work.) I didn't like the fix though. Exposing the temporary bin as part of the Nasal public API is ugly; it's an internal design feature, not something users should tune. Instead, I just hacked at the FlightGear code to reinitialize this context every frame, thus cleaning it up. A "proper" fix would be to remove the global context entirely, but that would touch a bunch of code. [1]
— Andy Ross
Cquote2.png


Cquote1.png The problem is the fact that Nasal recycles context objects - it keeps a pool of freed contexts around. At the point we shutdown Nasal, all contexts should be inactive, i.e in the free list (and they are, phew). Free-ed contexts are still considered for GC however, and there seems to be an issue that the opStack (part of the stack machine which implements the core bytecode interpreter of Nasal) doesn’t end up empty in some situations. So my work around is to clear the opStack (and frameStack) to zero, inside naFreeContext. This then matches the state for a new (non-recycled) context, and ensures that on the next GC cycle, items referenced by the opStack aren’t marked as in-use. (At least to me it makes sense, that a freed context - which is therefore inaccessible - shouldn’t be retaining items in its opStack into its next re-use) The ideal fix would be to find out *why* the opStack doesn’t return to empty ‘naturally’ before the context is freed [2]
— James Turner
Cquote2.png


Also see: [GitLab]/flightgear/flightgear/next/src/Scripting/NasalSys.cxx (in FGNasalSys::update)

    // The global context is a legacy thing.  We use dynamically
    // created contexts for naCall() now, so that we can call them
    // recursively.  But there are still spots that want to use it for
    // naNew*() calls, which end up leaking memory because the context
    // only clears out its temporary vector when it's *used*.  So just
    // junk it and fetch a new/reinitialized one every frame.  This is
    // clumsy: the right solution would use the dynamic context in all
    // cases and eliminate _context entirely.  But that's more work,
    // and this works fine (yes, they say "New" and "Free", but
    // they're very fast, just trust me). -Andy

Consider using coccinelle here to rewrite naContext handling in all places at once


Nasal Context structure

[GitLab]/flightgear/simgear/next/simgear/nasal/code.h#L76

struct Context {
	
    // Stack(s)
	
    struct Frame fStack[MAX_RECURSION];
	
    int fTop;
	
    naRef opStack[MAX_STACK_DEPTH];
	
    int opFrame; // like Frame::bp, but for C functions
	
    int opTop;
	
    int markStack[MAX_MARK_DEPTH];
	
    int markTop;
	
	
    // Free object lists, cached from the global GC
	
    struct naObj** free[NUM_NASAL_TYPES];
	
    int nfree[NUM_NASAL_TYPES];
	
	
    // GC-findable reference point for objects that may live on the
	
    // processor ("real") stack during execution.  naNew() places them
	
    // here, and clears the array each instruction
	
    struct naObj** temps;
	
    int ntemps;
	
    int tempsz;
	
	
    // Error handling
	
    jmp_buf jumpHandle;
	
    char error[128];
	
    naRef dieArg;
	
	
    // Sub-call lists
	
    struct Context* callParent;
	
    struct Context* callChild;
	
	
    // Linked list pointers in globals
	
    struct Context* nextFree;
	
    struct Context* nextAll;

    void* userData;
	
};

Stack frames

[GitLab]/flightgear/simgear/next/simgear/nasal/code.h#L33

Tagged structures

[GitLab]/flightgear/simgear/next/simgear/nasal/data.h#L84

Core allocations

All core memory allocation takes place via wrappers in misc.c:

void naFree(void* m) { free(m); }	
void* naAlloc(int n) { return malloc(n); }
void* naRealloc(void* b, int n) { return realloc(b, n); }
void naBZero(void* m, int n) { memset(m, 0, n); }