Canvas Threading

From FlightGear wiki

Revision as of 12:00, 28 July 2020

This article is a stub. You can help the wiki by expanding it.


Status

RFC (03/2020)

Motivation

once it [Canvas] is in simgear It should be really multi-viewer/threading capable. Everything that is not, might be changed at some time to match this criterion.

Such a change often comes with changes in the behavior that are not strictly needed but where people started relying on at some time. So better think about that at the first time.


— Mathias Fröhlich (2012-10-22). Re: [Flightgear-devel] Canvas reuse/restructuring.

Background

This is a collection of ideas, discussions and patches with the goal of moving certain types of Canvas code out of the main loop into a dedicated background thread/process.

FlightGear does use multiple threads; however, Nasal scripting does not run in any of them, for the reasons that Thorsten outlined. It is trivial to run Nasal in another thread, and even to thread out algorithms using Nasal. Nasal itself was designed with thread-safety in mind, by an enormously talented software engineer with a massive track record doing this kind of thing (background in embedded engineering at the time). FlightGear, however, was never "designed", as Thorsten alluded to; rather, its architecture "happened" through the work of dozens of people over the course of almost two decades.

The bottleneck when it comes to threading in Nasal is indeed FlightGear: the very instant you access any non-native Nasal APIs, i.e. anything that is FlightGear-specific (property tree, extension functions, fgcommands, canvas), the whole thing is no longer easy to make work correctly without re-architecting the corresponding component (think Canvas).

In the case of Canvas, it would be relatively straightforward to do just that, by introducing a new canvas mode where each canvas (texture) gets its own private property tree node (SGPropertyNode) that is part of simgear::canvas. At that point, you can also add a dedicated FGNasalSys instance (Nasal interpreter) to each canvas texture, and that could be threaded out using either Nasal's threading support or simgear's SGThread API.

Obviously, there would remain synchronization points, where this "canvas process" (thread) would fetch data from FlightGear (properties) and also send back its output to FlightGear (aka the final texture).

Other than that, it really is surprisingly straightforward to come up with a thread-safe version of the Canvas system by making these two major changes. The FGNasalSys interpreter would then no longer have access to the global namespace or any of the standard extension functions; it could only manipulate its own canvas property tree. All I/O between the canvas texture thread (Nasal) and the main loop (thread) would have to take place using a well-defined I/O mechanism, in its simplest form a simple network protocol (even telnet/props or Torsten's AJAX/mongoose layer would work "as is"). More likely, this would evolve into something like Richard's Emesary system.

Like Thorsten said already, you cannot "simply" thread out "all nasal" without either changing all existing Nasal code or without re-architecting FlightGear along the way.

Based on my own understanding of FlightGear, its main loop and the scripting layer, the most promising way forward would indeed be to tinker with a new addon mode where scripts could be run inside such a sandboxed environment, using a background thread. This would be akin to Firefox web extensions, which basically hit the same restrictions because of the proliferation of JavaScript in browsers. So this kind of model has been demonstrated to work: one background thread for the work, and main loop scripts for the interaction with the rest of the environment.

This kind of thing can be worked on without breaking things, and it is largely facilitated by bugman's unit testing work, i.e. being able to start independent instances of the Nasal interpreter and test these outside the sim.

Once you are able to do just that, you can also easily take FGNasalSys and come up with a stripped-down version to remove all the stuff that makes such an instance thread-unsafe, and re-add what's useful later on. Probably, using some kind of RPC/IPC mechanism - socket I/O for starters should do.

The very moment you see bugman making reports about testing Nasal standalone in conjunction with certain FG APIs, all the building blocks will be in place. A new addon mode/version could be added to support threaded addons, which is a no-brainer to do, because it cannot break anything, since we don't have any threaded addons yet. And at that point, it would also be trivial to tinker with a new canvas mode, that has its own private property tree and its own private Nasal instance.

This is really low-hanging fruit, to be honest, and it's a straightforward path to provide FlightGear with better threading support, so that anything involving new Nasal work can be made to live inside separate threads, i.e. using such addons or canvas textures that are updated asynchronously and only synchronized at certain time steps.

In addition, from a canvas standpoint this would provide for an excellent mechanism to bring unit testing to canvas-based avionics, because those can then trivially be executed outside the main loop, so that we could even run a batch job on the build server to create screen shots of avionics (say a PFD or ND) purely based on hooking them up to a pre-recorded flight or some other stored state vector containing all the properties/data.
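The batch-testing idea above can be sketched in a few lines. This is only an illustration under stated assumptions: `State`, `headingReadout` and `replay` are hypothetical names, a recorded sample is modelled as a plain map from property path to value, and the "display" is reduced to the text of a heading readout rather than a rendered texture.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One recorded sample of the state vector: property path -> value.
// Hypothetical stand-in; a real recording would come from a stored flight.
using State = std::map<std::string, double>;

// Hypothetical display logic: the three-digit heading readout of a PFD.
// Assumes the recorded heading lies in [0, 360].
inline std::string headingReadout(const State& s) {
    const double heading = s.at("/orientation/heading-deg");
    assert(heading >= 0.0 && heading <= 360.0);
    const int rounded = static_cast<int>(heading + 0.5) % 360;
    const std::string digits = std::to_string(rounded);
    return std::string(3 - digits.size(), '0') + digits;  // zero-pad to 3 digits
}

// Feed every recorded sample to the display logic, without any main loop.
inline std::vector<std::string> replay(const std::vector<State>& recording) {
    std::vector<std::string> frames;
    for (const State& sample : recording) frames.push_back(headingReadout(sample));
    return frames;
}
```

A build-server job could compare the output of `replay` against stored reference frames; with a real Canvas the comparison would be against screenshots instead of strings.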

Just running "all of Nasal" outside the main loop is going to be much more work than being smart about it, i.e. preparing the hooks to thread out the interesting stuff and providing an infrastructure to port/implement new features in the future.

Such a modified/modernized Canvas system would then contain its own private property tree for each instance and its own scripting interpreter (context), which would mean that it could even be compiled into a standalone executable, and even be executed in a headless fashion.

Problem

Originally, the whole Canvas idea started out as a property-driven 2D drawing system, but admittedly, what we ended up with is a system that has unfortunately become tightly coupled to Nasal. Indeed, there are some things where you definitely need to use Nasal to set up/initialize things. But under the hood, 99% still is pure property I/O, which is also why the property tree is becoming a bottleneck.

In general, Nasal is not the problem here; rather, it is the way the Canvas system is designed and the way Nasal and Canvas are integrated. It's a single-threaded setup, i.e. we are inevitably adding framerate-limited scripted code that runs at <= 60 Hz to the main loop in order to update rendering-related state. This is a bit problematic, but it's not a hard problem to fix.

It would be a problem to fix up existing Canvas code (think NavDisplay, PFDs, EICAS etc.), but with a few minor tweaks, we could come up with a dedicated Canvas mode where scripts updating a canvas property tree run outside the main loop. This would mean that they could not access any of the main loop APIs, but apart from that, it's actually a no-brainer, i.e. a straightforward thing to do.

Out of the box, OSG comes with support for creating and updating textures asynchronously; we just aren't using this currently. For obvious reasons, coding such a Canvas texture would be a different thing, but the hooks required to make this happen are fairly straightforward.


We have more and more aircraft that feature comparatively complex avionics, implemented on top of the Canvas stack via Nasal. Depending on the number of simulated displays/avionics, there is a fair share of property I/O going on, including a fair amount of redundant I/O, because many avionics/display instances share certain I/O requirements (think access to /position, /orientation etc.)
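One way to picture the redundant I/O is to count reads. The following is a minimal sketch, not FlightGear code: `CountingTree`, `Snapshot`, `takeSnapshot` and `displayHeading` are hypothetical names, and the tree is a plain map that counts how often it is queried. The point is that shared values can be read once per frame and handed to every display.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the global property tree that counts reads.
struct CountingTree {
    std::map<std::string, double> values;
    int reads = 0;

    double get(const std::string& path) {
        ++reads;
        auto it = values.find(path);
        return it == values.end() ? 0.0 : it->second;
    }
};

// Values that several displays need during the same frame.
using Snapshot = std::map<std::string, double>;

// Read every shared property exactly once per frame.
inline Snapshot takeSnapshot(CountingTree& tree, const std::vector<std::string>& paths) {
    Snapshot snap;
    for (const auto& p : paths) snap[p] = tree.get(p);
    return snap;
}

// Hypothetical display: reads from the snapshot instead of the tree.
inline double displayHeading(const Snapshot& snap) {
    return snap.at("/orientation/heading-deg");
}
```

With eight displays and two shared properties, the tree is read twice per frame instead of sixteen times. Such a snapshot is also a natural unit to hand over to a background thread.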

Many modern aircraft will feature between 6 and 8 Canvas-based MFDs that may be shown/updated at the same time.

For the time being, the free-form nature of Canvas/Nasal based avionics means that most avionics don't use any dedicated frameworks or standard patterns to formalize if/how and when crucial state is updated.

This includes property tree state, as well as other state retrieved via FlightGear extension functions (think Nasal/cppbind).

Thus, a number of complex cockpits have been demonstrated to be affected by the number of Nasal/Canvas based displays. Often, this is due to the structure of existing legacy code.

Goal

This article is intended to provide a comprehensive summary of the various discussions and proposals we have seen in the context of adapting the Canvas system to come up with a new execution mode/model, with the ultimate goal of improving run-time performance - which may include, but isn't restricted to, optionally moving certain aspects of a Canvas-based display out of the main loop into dedicated background threads.

Furthermore, the goal is to come up with an execution model that is backwards compatible, and strictly "opt-in" for any functionality that cannot be provided in a safe fashion.


Canvas Architecture

Work in progress
This article or section will be worked on in the upcoming hours or days.
See history for the latest developments.

The Canvas system is primarily implemented in C++. It is a listener-based subsystem that watches the global property tree for relevant updates/changes; specifically, accesses to /canvas are monitored.

Under the hood, each Canvas is implemented as an owner-drawn gauge (OD_Gauge). Canvas textures are positioned in the scene using a texture visitor (OSG), replacing static textures as needed.

Each Canvas texture is then composed of so-called "elements", the lowest-level element being the "group", which is primarily used to logically structure/organize a texture into a hierarchy of lower-level building blocks. Therefore, each Canvas texture always has a "root" node, which is a group.

In turn, each group may consist of specific "element" implementations, i.e. to render certain types of content, such as:

  • text
  • paths
  • raster images

(and any combination of these)

In addition, there are higher-level helpers implemented in scripting space, e.g. a "window" class implemented on top of the image element, or support for SVG graphics, implemented on top of the OpenVG-based path handling support. Also, there is a special group type that specifically handles geographic projections, for mapping/charting purposes.
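The hierarchy described in this section can be modelled as a small tree. This is a hypothetical model for illustration only: `Element`, `add` and `countLeaves` are not the simgear::canvas classes, and element types are reduced to plain strings.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the element hierarchy; not SimGear's classes.
struct Element {
    std::string type;                                // "group", "text", "path", "image"
    std::vector<std::unique_ptr<Element>> children;  // only populated for groups

    explicit Element(std::string t) : type(std::move(t)) {}

    // Append a child element and return a reference to it.
    Element& add(const std::string& childType) {
        children.push_back(std::make_unique<Element>(childType));
        return *children.back();
    }
};

// Count the drawable (non-group) elements below an element.
inline int countLeaves(const Element& e) {
    if (e.type != "group") return 1;
    int n = 0;
    for (const auto& child : e.children) n += countLeaves(*child);
    return n;
}
```

A texture would start from a root `Element("group")`, with nested groups for layers and text, path and image elements as leaves.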

Approach

One starting point would be changing the assumption that all canvas texture PROPERTIES live in the global property tree; instead, each Canvas texture would get its own SGPropertyNode, which isn't accessible from anywhere else.

At that point, you have a Canvas/OD_Gauge context that can be updated by changing said PRIVATE property tree. As long as this property tree is only ever updated from a single place (thread), multi-threading things becomes possible, because you only need to serialize access whenever you want to fetch/display the updated texture. But apart from that, the update/redraw mechanism could be running in a background thread.
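The single-writer idea above can be sketched as follows. This is a minimal illustration, assuming a plain map in place of the private SGPropertyNode tree and a small struct in place of the rendered texture; `PrivateCanvas`, `Frame` and `renderInBackground` are hypothetical names, not SimGear APIs.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a private SGPropertyNode tree: path -> value.
using PrivateTree = std::map<std::string, double>;

// Hypothetical stand-in for the rendered texture handed to the main loop.
struct Frame {
    int revision = 0;
    double heading_deg = 0.0;
};

class PrivateCanvas {
public:
    // Called only from the canvas worker thread. No lock is needed because
    // no other thread can reach tree_.
    void setProperty(const std::string& path, double value) { tree_[path] = value; }

    // Worker thread: "redraw" from the private tree, then publish the result.
    void redraw() {
        Frame next;
        next.heading_deg = tree_["heading-deg"];
        std::lock_guard<std::mutex> lock(frame_mutex_);  // the only sync point
        next.revision = published_.revision + 1;
        published_ = next;
    }

    // Main loop: fetch a copy of the latest published frame.
    Frame latestFrame() const {
        std::lock_guard<std::mutex> lock(frame_mutex_);
        return published_;
    }

private:
    PrivateTree tree_;                // touched by the worker thread only
    mutable std::mutex frame_mutex_;  // guards published_ only
    Frame published_;
};

// Runs a few updates on a background thread and returns the final frame.
inline Frame renderInBackground(PrivateCanvas& canvas, int updates) {
    std::thread worker([&canvas, updates] {
        for (int i = 1; i <= updates; ++i) {
            canvas.setProperty("heading-deg", 10.0 * i);
            canvas.redraw();
        }
    });
    worker.join();
    return canvas.latestFrame();
}
```

The private tree itself is never locked; only the hand-over of the finished frame is serialized, which mirrors the "fetch/display" synchronization point described above.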

From a Canvas perspective, one obvious issue is dealing with Canvas textures that fetch data/imagery from other textures, because that, too, would require synchronization.

But other than that, you would end up with a Canvas system whose textures can be asynchronously updated by a background thread. Scripts doing so would look a bit different, because they would lack access to 90% of the common FG APIs (think geodinfo and friends), since those cannot be considered to be thread-safe.

As you can probably tell, this is something that we once discussed behind the scenes, and it would nicely align with the original idea of "remote properties", i.e. sync'ing and replicating properties between property trees from different threads/processes. The main thing needed to do this is a subscribe/publish mechanism that works over sockets (or some other IPC): http://wiki.flightgear.org/Remote_Properties
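A subscribe/publish mechanism of this kind can be sketched in-process. The sketch assumes a hypothetical `Tree` class and `replicate` helper, neither of which is SimGear API; in a real setup, the callback would serialize each update onto a socket or another IPC channel instead of writing into the replica directly.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a property tree; not the SGPropertyNode API.
class Tree {
public:
    using Subscriber = std::function<void(const std::string&, double)>;

    // Register interest in every property whose path starts with `prefix`.
    void subscribe(const std::string& prefix, Subscriber cb) {
        subscribers_.push_back({prefix, std::move(cb)});
    }

    // Store the value and notify all matching subscribers.
    void set(const std::string& path, double value) {
        values_[path] = value;
        for (const auto& s : subscribers_) {
            if (path.compare(0, s.prefix.size(), s.prefix) == 0) s.callback(path, value);
        }
    }

    double get(const std::string& path, double fallback = 0.0) const {
        auto it = values_.find(path);
        return it == values_.end() ? fallback : it->second;
    }

private:
    struct Entry {
        std::string prefix;
        Subscriber callback;
    };
    std::map<std::string, double> values_;
    std::vector<Entry> subscribers_;
};

// Mirror everything below `prefix` from `source` into `replica`.
// `replica` must outlive `source`, since the callback keeps a reference.
inline void replicate(Tree& source, Tree& replica, const std::string& prefix) {
    source.subscribe(prefix, [&replica](const std::string& path, double value) {
        replica.set(path, value);
    });
}
```

A canvas thread would subscribe only to the branches it needs (say /orientation/), so the rest of the global tree never crosses the thread boundary.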

This is something where Richard's Emesary work could become highly useful, because the cost of adapting the Canvas system to optionally support an out-of-mainloop mode would be marginal - further, bugman's ongoing work on unit-testing and unit-testing Nasal in particular, should come in very handy, because it would become much easier to start up dedicated FGNasalSys instances (our in-sim Nasal interpreter) that may not run inside the main loop, i.e. lacking most standard FG APIs.

Now, when it comes to using Canvas without Nasal, that's actually a valid use-case, and I find it important to keep that use-case in mind, because over time we've seen more and more attempts at coming up with frameworks that basically shield back-end code from changes to front-end code (and vice versa). This is why it is important to primarily work through the property tree, and not rely on dedicated Nasal bindings (cppbind).

It would be a good thing to keep this in mind, because doing so means that multi-instance setups supporting Canvas would become much easier, i.e. there is no problem using Nasal at all, as long as it happens through well-defined interfaces that basically hide the scripting aspect.

Furthermore, a number of core devs have been thinking about using the Canvas system for scenery-related runtime drawing, which would also require Canvas to become thread-safe, i.e. using a dedicated/private property tree instance to isolate all access to the property tree that is used to update/redraw such textures. This would mean that anything involving OSM2City, photo-scenery, and even random buildings could be enormously boosted by making the Canvas system available accordingly.


Threading out all of Nasal is not trivial at all; however, modifying a handful of subsystems to allow future features to run outside the main loop would be a relatively self-contained task.

If you have ever done any C++ programming for FlightGear, you realize that there is a thing called the global property tree, and that there is a single global scripting interpreter. The bottleneck when it comes to Nasal and Canvas is unnecessary, because the property tree merely serves as an encapsulation mechanism. Strictly speaking, we're abusing the FlightGear property tree to use listeners that are mapped to events, which in turn are mapped to lower-level OSG/OpenGL calls. That is to say, this bottleneck would not exist if a different property tree instance were used.

This, in turn, is easy to change - because during the creation of each canvas, the global property tree _root is set, which could also be a private tree instead. Quite literally, this means changing 5 lines of C++ code to use an instance-specific SGPropertyNode_ptr instead of the global one.
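The shape of that change can be sketched with stand-in types. Everything here is hypothetical: `Node`, `NodePtr`, `globalRoot` and this `Canvas` class only play the roles of SGPropertyNode, SGPropertyNode_ptr, the global tree and the real canvas, and they do not reproduce the actual SimGear constructor.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for SGPropertyNode; the real class lives in SimGear.
struct Node {
    std::map<std::string, std::string> values;
};
using NodePtr = std::shared_ptr<Node>;  // plays the role of SGPropertyNode_ptr

// Stand-in for the single global property tree.
inline NodePtr globalRoot() {
    static NodePtr root = std::make_shared<Node>();
    return root;
}

class Canvas {
public:
    // Default argument: today's behaviour, the canvas lives in the global tree.
    // Passing a fresh node instead gives the canvas a private tree.
    explicit Canvas(NodePtr root = globalRoot()) : root_(std::move(root)) {
        assert(root_ != nullptr);
    }

    void setText(const std::string& text) { root_->values["text"] = text; }
    bool usesGlobalTree() const { return root_ == globalRoot(); }

private:
    NodePtr root_;  // instance-specific root instead of a hard-coded global
};
```

Existing callers keep working through the default argument, while `Canvas isolated(std::make_shared<Node>());` yields a canvas whose writes never touch the global tree, which is the opt-in behaviour the Goal section asks for.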

At that point, you have a canvas that is inaccessible from the main thread (which sounds dumb, but once you think about it, that's exactly the point). So, the next step is to provide this canvas instance with a way to access its property tree, which boils down to adding a FGNasalSys instance to each Canvas. That way, each canvas texture would get its own instance of SGPropertyNode + FGNasalSys.

Anybody who's ever done any avionics coding will quickly realize that you still need a way to fetch properties from the main loop (think /fdm, /position, /orientation), but that's really easy to do using the existing infrastructure. You could really use any of the existing I/O protocols (think Torsten's ajax stuff), and you'd end up with Nasal/Canvas running outside the main loop.

The final step is obviously making the updated texture available to the main loop, but other than that, it's much easier to fix up the current infrastructure than to fix up all the legacy code.

Telling the canvas system to use another property tree (SGPropertyNode instance) is really straightforward, but at that point, it's no longer accessible to the rest of the sim. You can easily try it for yourself, and just add a "text" element to that private canvas. The interesting part is making that show up again (i.e. via placements). Once you are able to tell a placement to use such a private property tree, you can synchronize access by using a separate thread for each canvas texture (property tree). But again, it would be a static property tree until you provide /some/ access to it, so that it can be modified at runtime, and given what we have already, hooking up FGNasalSys is the most convenient method. But all of the canvas bindings/APIs we have already would need to be reviewed to get rid of the hard-coded assumption that there is only a single canvas tree in use.

Like you said, changing fgfs to operate on a hidden/private property tree is the easy part, interacting with that property tree is the interesting part.

Also, it would be a very different way of coding: we would need to use some kind of dedicated scheduling mechanism, or such background threads might "busy wait" unnecessarily.
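One common way to avoid busy waiting is a blocking queue built on a condition variable. The sketch below uses only the C++ standard library; `Update` and `UpdateQueue` are hypothetical names, and a SimGear-based version might use its own threading primitives instead.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical update message sent from the main loop to a canvas thread.
struct Update {
    std::string path;
    double value;
};

class UpdateQueue {
public:
    // Main loop side: enqueue an update and wake the worker.
    void push(Update u) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push_back(std::move(u));
        }
        ready_.notify_one();
    }

    // Ask the worker to stop once the queue has drained.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Worker side: sleeps until an update arrives or the queue is closed.
    // Returns false only when the queue is closed and empty.
    bool waitAndPop(Update& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !pending_.empty(); });
        if (pending_.empty()) return false;
        out = std::move(pending_.front());
        pending_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Update> pending_;
    bool closed_ = false;
};
```

A canvas worker would loop on `while (queue.waitAndPop(u)) { ... }`, so it consumes no CPU while its display has nothing to redraw.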

If you know how to build sg/fg from source (git) and how to apply patches, I can provide the corresponding pointers to get you started experimenting with such an adapted Canvas system. We experimented with it a couple of years ago, and there should still be patches somewhere on the forum or the wiki.

References
