Multi-Threading in FlightGear

From FlightGear wiki
This article is a stub. You can help the wiki by expanding it.
“My sense is that most of our crashes seem to come from newer code or threading issues.”


“the main issue that we're trying to address by looking at multi-threading and HLA is to get nice, consistent, and ideally faster frame-rates. HLA directly addresses part of this if we split off the rendering from everything else. So the viewer will run at (say) 60fps irrespective of what the FDM etc. is running at.”

Objective

Document how threading is used in FlightGear, how that usage has evolved recently, and which upcoming developments aim at better multi-core support.

Original Design

“I would say that we were quite aware of threads when we built flightgear. We are also quite aware of the severe problems you can quickly create for yourself in a real world application once you go beyond the simple producer/consumer example in your text book.

FlightGear spawns one thread to do as much of the tile paging work as possible in a separate thread. This is painful though because tile paging is complicated, and you can't do anything opengl related outside the main render thread. Model loading invokes texture loading which invokes opengl calls. <boom> application blows up. So FlightGear was written with threading in mind, and we have avoided it as much as possible.”


— Curtis L. Olson (2006-08-29). Re: [Flightgear-users] [Plib-users] Multithreading in FG.
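The division of labour described in this quote can be sketched as a producer/consumer queue: a worker thread does the part of tile loading that needs no graphics context, and only the main thread performs the step where OpenGL work would happen. This is a minimal illustration, not FlightGear's pager. `LoadedTile`, `TileQueue`, `loadTileFromDisk` and `pageTiles` are hypothetical names, and the GPU upload is only marked by a comment.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical tile record; not a FlightGear or SimGear type.
struct LoadedTile {
    std::string name;
    std::thread::id loadedOn;  // thread that did the disk and parse work
};

// Minimal thread-safe queue that hands finished tiles to the main thread.
class TileQueue {
public:
    void push(LoadedTile tile) {
        std::lock_guard<std::mutex> lock(mutex_);
        tiles_.push(std::move(tile));
    }

    // Non-blocking, so the main loop never stalls waiting for the pager.
    bool tryPop(LoadedTile& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tiles_.empty()) return false;
        out = std::move(tiles_.front());
        tiles_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<LoadedTile> tiles_;
};

// Stand-in for disk I/O and parsing. It makes no graphics calls, which is
// what makes it safe to run off the main thread.
LoadedTile loadTileFromDisk(const std::string& name) {
    return LoadedTile{name, std::this_thread::get_id()};
}

// Loads the requested tiles on a worker thread and finalizes them on the
// calling thread, which plays the role of the render thread here.
std::vector<std::string> pageTiles(const std::vector<std::string>& wanted) {
    const std::thread::id renderThread = std::this_thread::get_id();
    TileQueue ready;

    std::thread pager([&wanted, &ready] {
        for (const std::string& name : wanted) ready.push(loadTileFromDisk(name));
    });

    std::vector<std::string> finalized;
    LoadedTile tile;
    while (finalized.size() < wanted.size()) {
        if (ready.tryPop(tile)) {
            // Texture upload and other OpenGL work would go here, and only here.
            assert(tile.loadedOn != renderThread);
            finalized.push_back(tile.name);
        } else {
            std::this_thread::yield();  // a real main loop would draw a frame
        }
    }
    pager.join();
    return finalized;
}
```

Calling `pageTiles` with a list of tile names returns them in request order, because a single worker feeds a FIFO queue. The sketch sidesteps the hard part the quote points at: in a real loader, model loading pulls in texture loading, so the boundary between "safe off-thread" and "render thread only" is much harder to draw.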


“threads do impose a lot of extra complexity, they can hide bugs that are very difficult to reproduce and track down, very hard to spot by just reading the code, etc.”
— Curtis Olson (2009-08-03). Re: [Flightgear-devel] Multithreading support.
“I design flight simulators for a living - and the fight to keep latency down is at least as important as the fight to keep frame rates up.”

Issues

“the critical thing is actually [...] if we could be sure that rendering (OSG + our pieces of drawing) were in a separate thread from ‘everything else’, we’d use two CPU cores at 100% (ish) and get better framerates [...] OSG threading in theory does the above *already*, but the way we integrate that with the property system causes race conditions and crashes already, and dynamic scene elements reduce the amount of parallelisation significantly [...] fixing existing crashes/races when we use OSG threading aggressively - and I think the easiest solution to do that is to decouple the OSG side from the simulation using a shadow property tree. The fact it would give us points 2 &4 at the same time is of course very desirable.”
“One thing we need to be aware of with FlightGear is that we have people from all ranges of backgrounds diddling in the code and submitting patches. Things like threading architectures fall down *very* quickly when someone touches them who isn't skilled with threads and also very familiar with our particular threaded architecture and all the nuances and interactions between data values and timing issues ... and these are *very* complex in an application of the scope of FlightGear.”
— Curtis Olson (2009-08-03). Re: [Flightgear-devel] Multithreading support.
“To what extent is FG thread-safe? At one extreme, is this merely an aspiration for the distant future? At the other extreme, is this an established fact and requirement? Are there perhaps bits that are known to be thread-safe, and can be threaded, so long as we stay away from the naughty bits? Is the answer different for MSVC versus Linux?”
— John Denker (2010-02-08). [Flightgear-devel] scheduling and threading.
“The question of asynchronous IO and thread safety must be dealt with.”
— John Denker (2010-03-04). [Flightgear-devel] network IO issues.
“This is a fundamental discussion that often separates the home simulation use and professional simulators. So far concentration has gone into the first I guess, but the second would not exclude the first (while the first might exclude the second).”
— Erik Hofman (2010-02-08). Re: [Flightgear-devel] scheduling and threading.
“The "tree" part of the property tree is not thread safe at all. If a node is added to the property tree in one thread while another thread is parsing a path in the property tree to find a value, the second thread can see an outdated / corrupted version of the property tree. New property tree nodes are probably not added that often at runtime, and some code (but not all) keeps pointers directly to the property nodes it cares about.

Getting and setting property values are not necessarily atomic operations, particularly where string values are concerned. So all sorts of races are possible there too. There's no global lock on the property tree and no local locks on nodes in the tree either.”


— Tim Moore (2010-03-04). Re: [Flightgear-devel] network IO issues.
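The second point in this quote, that string reads and writes are not atomic and that nodes carry no lock, can be illustrated with a node that owns a mutex. This is a sketch of what per-node locking could look like, not a description of `SGPropertyNode`; `LockedStringNode` and `countTornReads` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical string-valued node with a per-node lock.
class LockedStringNode {
public:
    explicit LockedStringNode(std::string value) : value_(std::move(value)) {}

    void set(const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = value;
    }

    std::string get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;  // the copy is made while the lock is held
    }

private:
    mutable std::mutex mutex_;
    std::string value_;
};

// A writer thread flips the node between two values while the calling thread
// reads it. Returns how many reads saw something other than either value.
int countTornReads(int iterations) {
    assert(iterations >= 0);
    const std::string a(64, 'a');
    const std::string b(64, 'b');
    LockedStringNode node(a);

    std::thread writer([&node, &a, &b, iterations] {
        for (int i = 0; i < iterations; ++i) node.set(i % 2 ? a : b);
    });

    int torn = 0;
    for (int i = 0; i < iterations; ++i) {
        const std::string seen = node.get();
        if (seen != a && seen != b) ++torn;
    }
    writer.join();
    return torn;
}
```

With the lock in place `countTornReads` returns 0. Removing the two `lock_guard` lines turns the same program into a data race on `std::string`, which is undefined behaviour in C++. A per-node lock like this still leaves the first problem in the quote untouched: it protects one value, not the tree structure that path lookups walk through.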
“When working on the sound code I noticed it uses SGPropertyChangeListener to call a callback function when a property is changed. This may sound very useful, but since properties are inherently thread unsafe this could cause race conditions. Now FlightGear itself may be non-threaded but I'm not so sure if there is any guarantee that Nasal is not interfering with the main program. Or that it holds a property over two instances of running the main loop. Or anything like that. Are we sure that callback functions on properties are safe without any form of locking?”
— Erik Hofman (Dec 12th, 2015). [Flightgear-devel] SGPropertyChangeListener.

Status

“we appear to be single-thread-CPU bound (and if we are on my machine, we probably are on most)”
— Rebecca Palmer (2014-09-03). [Flightgear-devel] Performance tests.
“OpenGL's thread safety is really neither here nor there. OpenGL calls are restricted to a few places, and certainly can't happen in a general property listener. OpenGL calls could end up running in several different possible threads or even simultaneously if there is more than one graphics context.”


— Tim Moore (2010-03-04). Re: [Flightgear-devel] network IO issues.
“Some of us have thrown around the idea of using the property system as general inter-thread communications mechanism, but there's nothing concrete. The approach I've thought of is to double-buffer the values in the property tree, so that readers see a consistent value for each property. The challenge is doing this without duplicating the entire property tree every frame, and also supporting property tree writes by more than one thread.”


— Tim Moore (2010-03-04). Re: [Flightgear-devel] network IO issues.
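One way to picture the double-buffering idea in this quote is a store where writers only touch a back buffer of pending changes and readers only see values published at a frame boundary. The sketch below is an illustration of that idea under simplifying assumptions (flat paths, `double` values only, one coarse mutex); it is not an existing FlightGear design. `DoubleBufferedProps` and `demoFrameBoundary` are hypothetical names, and the property path is used purely as an example key.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical store: readers see the front buffer, writers fill a back
// buffer that holds only the values changed since the last publish().
class DoubleBufferedProps {
public:
    // May be called from any thread; never visible to readers until publish().
    void write(const std::string& path, double value) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_[path] = value;
    }

    // Returns the last published value, or fallback if none was published.
    double read(const std::string& path, double fallback = 0.0) const {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = front_.find(path);
        return it == front_.end() ? fallback : it->second;
    }

    // Called once per frame. Copies only changed entries, not the whole
    // store, and returns how many entries were copied.
    std::size_t publish() {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t changed = pending_.size();
        for (const auto& entry : pending_) front_[entry.first] = entry.second;
        pending_.clear();
        return changed;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, double> front_;
    std::map<std::string, double> pending_;
};

// Shows what a reader sees just before and just after a frame boundary.
std::pair<double, double> demoFrameBoundary() {
    DoubleBufferedProps props;
    props.write("/position/altitude-ft", 1000.0);
    props.publish();
    props.write("/position/altitude-ft", 1200.0);  // not yet visible
    const double before = props.read("/position/altitude-ft");
    props.publish();
    const double after = props.read("/position/altitude-ft");
    return std::make_pair(before, after);
}
```

Tracking only changed entries addresses the "without duplicating the entire property tree every frame" concern in a crude way, and the mutex allows more than one writer. The cost is that readers and writers contend on a single lock, and when two threads write the same path in one frame the last write simply wins, so this is a starting point for discussion rather than a solution.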

Background

“You can see for yourself:

[Figure: Fg-3.2-RAM-utilization-in-minimal-mode.png]

as you can see, there are a few helper threads being used by fgfs - those are "hard-coded" for some systems using background/worker threads (think sound system) - apart from that, there are some OSG level directives to influence the osgviewer threading mode - however, that code is generally considered to be "broken" (or leaking) for the time being (as per the wiki link posted previously), and it is specific to rendering only.

the fgfs main loop itself is single-threaded primarily. While Nasal does provide support for threading, the majority of the FG-level interface is not thread-safe at all, including most extension functions and other subsystems. The majority of subsystems (i.e. those not using SGThread) are also single-threaded, which includes recent additions like Canvas.”


— Hooray (Thu Apr 02). Re: FG 64bit & Linux dependencies.
“The FDM itself is also a singleton by design while also being strictly single-threaded, with the fixed assumption that there's only ever a single FDM being used.

Subsystems are structured in a "cooperative multi-tasking (time sliced)" fashion, meaning that one subsystem lagging behind would slow down all the others accordingly - e.g. the Nasal garbage collector is being run in the main thread, which makes frame rate/spacing non-deterministic for the other subsystems that are not FDM-interleaved.

To learn more, see: viewtopic.php?f=18&t=15755&p=153298”


— Hooray (Thu Apr 02). Re: FG 64bit & Linux dependencies.
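The cooperative, time-sliced structure described here means a frame costs the sum of what every subsystem spends in its update call. The sketch below makes that arithmetic concrete. It is an illustration only: `Subsystem` here is a hypothetical struct rather than SimGear's subsystem class, and the per-subsystem costs (including the garbage-collection pause) are invented numbers, simulated instead of measured.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical subsystem: update() returns the time it used, in milliseconds.
// The cost is simulated so the example stays deterministic.
struct Subsystem {
    std::string name;
    std::function<double()> update;
};

// One iteration of a single-threaded main loop: every subsystem runs to
// completion before the next one starts, so the costs simply add up.
double runFrame(const std::vector<Subsystem>& subsystems) {
    double frameMs = 0.0;
    for (const Subsystem& s : subsystems) {
        const double costMs = s.update();
        assert(costMs >= 0.0);
        frameMs += costMs;
    }
    return frameMs;
}

// Frame time with and without a (made-up) 40 ms garbage-collection pause
// inside the scripting subsystem.
double frameTimeMs(bool gcRunsThisFrame) {
    const std::vector<Subsystem> subsystems = {
        {"fdm", [] { return 2.0; }},
        {"nasal", [gcRunsThisFrame] { return gcRunsThisFrame ? 43.0 : 3.0; }},
        {"canvas", [] { return 1.0; }},
    };
    return runFrame(subsystems);
}
```

With these invented costs a normal frame takes 6 ms, and a frame in which the collector runs takes 46 ms. Nothing else in the loop changed, yet every subsystem scheduled after the slow one starts late, which is the non-deterministic frame spacing the quote refers to.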
“In general, FG still is CPU-bound (=limited) on most platforms these days, but you will still see idle cores - i.e. cores not being properly/fully utilized by fgfs. For details, I suggest to refer to the mailing list and postings made by Rebecca Palmer, Frederic Bouvier, Tim Moore and Mathias Frohlich (all of whom have independently checked and confirmed that FG is generally CPU-limited for most use-cases)”
— Hooray (Thu Apr 02). Re: FG 64bit & Linux dependencies.
“Better multi-core support is being worked on - primarily by working towards adopting HLA: FlightGear_high-level_architecture_support”
— Hooray (Thu Apr 02). Re: FG 64bit & Linux dependencies.

Debate

“I would try do completely separating the rendering task from the simulation task and only let them interact through state variables in shared memory. In that way one could run the simulation task and input processing at a high and stable rate, while the frame rate produced by the renderer could be allowed to vary considerably with the complexity of the scene.”
— Anders Gidenstam (2006-08-28). Re: [Flightgear-users] Multithreading in FG.
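The proposal in this quote, simulation and rendering as separate tasks that interact only through shared state variables, can be sketched with two threads and one lock-protected snapshot. This is a minimal illustration, not FlightGear code: `SimState`, `SharedState`, `RunResult` and `runDecoupled` are hypothetical names, and the simulation thread runs a fixed number of steps as fast as it can, where a real one would pace itself to a fixed rate.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical snapshot of simulation state. altitudeFt is derived from step
// so that a reader can detect a half-written snapshot.
struct SimState {
    long step = 0;
    double altitudeFt = 0.0;
};

// The only point of contact between the two tasks.
class SharedState {
public:
    void store(const SimState& s) {
        std::lock_guard<std::mutex> lock(mutex_);
        state_ = s;
    }
    SimState load() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return state_;
    }

private:
    mutable std::mutex mutex_;
    SimState state_;
};

struct RunResult {
    long finalStep;
    long framesRendered;
    bool allConsistent;
};

// Runs simSteps simulation steps on one thread while the calling thread
// "renders" frames from whatever snapshot is current.
RunResult runDecoupled(long simSteps) {
    assert(simSteps >= 0);
    SharedState shared;
    std::atomic<bool> simDone{false};

    std::thread sim([&shared, &simDone, simSteps] {
        SimState s;
        for (long i = 1; i <= simSteps; ++i) {
            s.step = i;
            s.altitudeFt = 0.5 * static_cast<double>(i);
            shared.store(s);
        }
        simDone.store(true);
    });

    RunResult result{0, 0, true};
    bool finished = false;
    while (!finished) {
        // Read the flag before the snapshot so the last frame sees the end state.
        finished = simDone.load();
        const SimState snap = shared.load();
        if (snap.altitudeFt != 0.5 * static_cast<double>(snap.step)) {
            result.allConsistent = false;
        }
        result.finalStep = snap.step;
        ++result.framesRendered;
    }
    sim.join();
    return result;
}
```

The number of rendered frames varies from run to run, which is the point: the renderer's pace does not affect how many simulation steps are taken, and every frame is drawn from an internally consistent snapshot. Copying a whole snapshot under one lock works for a two-field struct; doing the same for the full property tree is the much harder problem discussed elsewhere on this page.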

Scripting

“With the new FGPythonSys, I could help work on making core infrastructure thread-safe right now without affecting the operation of the rest of the program. Placing FGPythonSys in its own subsystem thread would be the ultimate tool for hardening the core - it would highlight all the points in FlightGear that are prone to racing and locking.”

Components

Scripting (Nasal)

Property Tree

Canvas

References