Understanding Nasal Performance

From FlightGear wiki
This article is a stub. You can help the wiki by expanding it.
Nvidia Nsight profiler showing fgfs (Minimal Startup Profile, no scenery, all draw masks set) running just the FG1000, with PUI disabled.

Most Nasal code runs inside the FlightGear main loop, which typically runs at 30-60 Hz (frames per second). That leaves roughly 17-33 ms per frame for everything, Nasal included. Several things are known to affect frame rate considerably:

  • file I/O (e.g. reading dozens of SVG/XML files to populate/initialize a Canvas)
  • property tree accesses (getting/setting properties, e.g. to update a Canvas)
  • listener invocations (SGPropertyChangeListener)
  • timer invocations (settimer/maketimer via events)
  • context switches (switching between Nasal and native code, code calling itself recursively e.g. svg.nas)
  • GC invocations (GC pressure, number of references/objects added/removed per frame, naPool resizing)


File I/O

File I/O can be moved off the main loop using Nasal's threading support. However, some care must be taken, or things can go badly wrong. Ideally, threaded file I/O should be strictly opt-in, so that people can easily check whether a problem is related to threading or not.
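The opt-in idea can be illustrated with a short, language-neutral sketch. It is written in Python rather than Nasal purely for illustration; `read_file`, `load_files` and the `use_threads` flag are hypothetical names, not part of any FlightGear or Nasal API. Threading is off by default, so the threaded path can be ruled out simply by not enabling it.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_file(path):
    """Blocking read; this is the work we may want to keep off the main loop."""
    return Path(path).read_text(encoding="utf-8")


def load_files(paths, use_threads=False, max_workers=4):
    """Read all files and return their contents in input order.

    Threading is opt-in (use_threads defaults to False), so a suspected
    threading problem can be checked by leaving the flag off.
    """
    if not use_threads:
        return [read_file(p) for p in paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, whatever the completion order.
        return list(pool.map(read_file, paths))
```

Because both paths return the same list in the same order, toggling `use_threads` is enough to compare the threaded and non-threaded behavior.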

Property Tree I/O

While a Canvas can only be modified by writing to its property tree location in the main tree, the state it needs can usually be provided by a "data provider" abstraction. This, in turn, makes it possible to provide the data provider with a read-only copy of the relevant property tree state each frame.
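A minimal sketch of such a data provider, under stated assumptions: it is written in Python rather than Nasal, a plain dictionary stands in for the property tree, and `DataProvider`, `update` and `get` are hypothetical names rather than an existing FlightGear API. The property paths are only examples.

```python
from types import MappingProxyType


class DataProvider:
    """Hands display code a read-only copy of selected properties."""

    def __init__(self, tree, paths):
        self._tree = tree            # stand-in for the live property tree (a dict here)
        self._paths = tuple(paths)   # the properties the display actually needs
        self._snapshot = MappingProxyType({})

    def update(self):
        """Call once per frame: copy the relevant values out of the live tree."""
        copy = {p: self._tree.get(p) for p in self._paths}
        self._snapshot = MappingProxyType(copy)  # read-only view of the copy

    @property
    def snapshot(self):
        return self._snapshot

    def get(self, path):
        """Read from the per-frame snapshot, never from the live tree."""
        return self._snapshot[path]


# Example use with made-up values.
tree = {"/velocities/airspeed-kt": 120.0, "/position/altitude-ft": 3500.0}
provider = DataProvider(tree, ["/velocities/airspeed-kt"])
provider.update()
tree["/velocities/airspeed-kt"] = 125.0          # live tree changes mid-frame
airspeed = provider.get("/velocities/airspeed-kt")  # still 120.0 until the next update()
```

The display code only ever sees the snapshot, so it cannot write to the tree by accident, and all of its reads within one frame are consistent with each other.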

Listeners

See Listeners for the main article about this subject.

Timers

See Timers for the main article about this subject.

Context switches

Whenever a timer or a listener is triggered, a "context switch" takes place: either a subcontext is created because Nasal code triggers other Nasal code, or control passes between C++ and Nasal. This can happen many times per frame. For instance, svg.nas implements the parsesvg API, which means that multiple context switches are triggered for each relevant tag.

Garbage Collection

See How the Nasal GC works for the main article about this subject.

gnuplot diagram showing "GC pressure" (number of freed dead blocks) plotted against GC invocation frequency, illustrating how Nasal's memory pools grow on demand
Diagram showing how a naive threaded approach to freeing dead blocks can reduce the total workload for the garbage collector

At present, Nasal often creates a large amount of garbage and typically needs to free about 100k dead blocks during each invocation of the garbage collector. As soon as complex cockpits or multiple Canvas displays are involved, Nasal's primary memory pools may grow to hold 50k-60k references/objects.

This behavior seems rather wasteful and should probably be investigated. In the meantime, it is possible to reduce the GC's workload a little by using a thread pool to free dead blocks in separate threads.
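The partitioning behind such a thread pool can be sketched as follows. This is an illustration in Python, not the actual Nasal GC code: `chunked`, `free_blocks` and `free_dead_blocks` are hypothetical names, and "freeing" a block is reduced to counting it.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split items into at most n_chunks contiguous slices of near-equal size."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def free_blocks(chunk):
    """Stand-in for releasing dead blocks; returns how many were released."""
    return len(chunk)


def free_dead_blocks(dead_blocks, workers=4):
    """Hand each worker one slice of the dead-block list; return the total freed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(free_blocks, chunked(dead_blocks, workers)))


# With 100k dead blocks and 4 workers, each worker handles 25k blocks.
total = free_dead_blocks(list(range(100_000)), workers=4)
```

In standard CPython builds the global interpreter lock prevents a real speed-up for this kind of work, so the sketch only shows how the dead-block list is divided among workers. A native implementation would also have to make sure that no two workers touch the same block.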

Looking at the numbers, destroying and re-creating 100k memory pool blocks during each GC invocation looks like a bug. Something is probably resetting Nasal contexts and recreating them afresh, so that the whole memory pool setup is effectively torn down and rebuilt.