Multi threaded programming in Nasal

From FlightGear wiki
Revision as of 02:22, 20 September 2010 by Hooray

This article is meant to become an introduction to multi-threading for Nasal users. It is targeted at non-developers: people who have no background in software engineering but who want to understand how to use threading in Nasal.

This will mostly apply to aircraft or scenery developers, but also to people developing completely new subsystems in Nasal, such as the local weather system or the bombable addon.

While this is written by software developers, we are trying hard not to use any complicated terminology, or at least to explain important terms where necessary. If you feel that something lacks clarity or could be improved, please let us know (using either the forum or the wiki). Of course, you are also invited to improve this article directly!

This article is currently a work in progress.

At the moment, I am planning to cover the following topics:

  • reentrancy
  • race conditions
  • deadlocks
  • locking granularity
  • mutexes vs. semaphores
  • worker threads
  • common threading patterns (e.g. producer/consumer)
  • job queues
  • mutable vs. immutable state
  • source code examples in Nasal


What is a thread

Normally, most code in FlightGear runs sequentially (usually within the FlightGear main loop): instructions are executed one after another, in a predefined order.

The FlightGear "main loop" will make calls to each FlightGear component or subsystem and give each system time to do some work (for updating things).

Every program has at least one main "thread" of execution: the thread that is used to run the program itself. The main loop runs in this main thread.

If you are not too familiar with programming or computer science, you may think of a thread as a running "task". A task can be imagined as a process of work being done. A worker thread is a thread that works in addition to the main thread.

"Multi-threading" is about having several of those "threads of execution" that may run in parallel, concurrently (at the same time). Multi-threading makes programs significantly more complex, but it also provides certain advantages, as you'll see below.

The FlightGear main loop

In the FlightGear main loop, components are updated separately: each subsystem does its work and then hands over control to the next component, until all components have been updated and the main loop starts all over again.

For a very simple example, consider the following main loop:

Main loop:
 - update flight dynamics
 - update autopilot
 - update sound system
 - update visuals
 (start over again)

This applies to much of the FlightGear C++ code, but also to the Nasal source code.

Now, imagine that the main loop is being run for a fixed amount of time (say 10 seconds).

The time that each subsystem gets now depends on how much time the other systems require. In other words, if one subsystem requires a lot of time, the other subsystems may not get sufficient time to do all of their own work. So subsystems like the FDM, the autopilot, sound or graphics may lag behind.

For FlightGear, this means that doing a lot of work in the main loop also has a strong effect on the frame rate of the simulator, because subsystems may literally block each other.

If you now need to do a lot of work (e.g. calculations) in Nasal, it is not necessarily a good idea to do all of it in the main loop (and main thread), because it may severely affect FlightGear performance.
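To make the effect on the frame rate concrete, here is a minimal sketch in plain Nasal. The subsystem names follow the simple main loop above, but the per-frame costs are made-up numbers chosen for illustration, not measurements, and frame_time is a hypothetical helper. Because the subsystems run one after another, the time for one pass through the loop is simply the sum of their costs.

```nasal
# Hypothetical per-frame costs in seconds (illustrative numbers only).
var subsystems = [
    { name: "flight dynamics", cost: 0.005 },
    { name: "autopilot",       cost: 0.002 },
    { name: "sound system",    cost: 0.003 },
    { name: "visuals",         cost: 0.010 }
];

# One pass through the main loop takes as long as all subsystems together.
var frame_time = func(systems) {
    var total = 0;
    foreach (var s; systems) {
        total += s.cost;
    }
    return total;
};

print("frame time: ", frame_time(subsystems), " s");

# Adding one slow job to the same loop delays every other subsystem.
append(subsystems, { name: "slow Nasal job", cost: 0.080 });
print("frame time with slow job: ", frame_time(subsystems), " s");
```

With these assumed numbers the single slow job accounts for most of the frame time, which is why long calculations are candidates for a separate thread.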

Multithreading

Nasal provides built-in support for multi-threading through its "thread" module. This allows you to spawn (create) a new thread and run a Nasal function in it. The details are covered at http://plausible.org/nasal/lib.html

You can imagine multi-threading like many "processes" (or "people") working at the same time, preferably cooperating in a well-defined fashion, where each thread processes its own piece of code.
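As a rough sketch of what spawning a thread can look like, the following uses thread.newthread from the Nasal library documentation linked above, and FlightGear's settimer to check for the result from the main thread. The names heavy_calculation, result and check_result are made up for this example, and the calculation itself is only a placeholder for real work.

```nasal
# Placeholder for a long-running calculation (pure Nasal, no FlightGear APIs).
var heavy_calculation = func {
    var sum = 0;
    for (var i = 1; i <= 1000000; i += 1) {
        sum += i * i;
    }
    return sum;
};

var result = nil;   # stays nil until the worker thread has finished

# Run the calculation in a separate (worker) thread.
thread.newthread(func {
    result = heavy_calculation();
});

# Main thread: look at the result every half second instead of waiting for it.
var check_result = func {
    if (result == nil) {
        settimer(check_result, 0.5);
    } else {
        print("worker finished: ", result);
    }
};
check_result();
```

The worker only writes one finished value and the main thread only reads it, which keeps this example simple. As soon as several threads modify the same data, access has to be coordinated.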

The real problem is now coordinating this cooperation, because some resources (data, functions) cannot be used at the same time by multiple users. This is something that you also see in real life:

Access to certain resources (think of a toilet or a doctor) needs to be coordinated, i.e. you want to ensure that a toilet (or a toothbrush) can only ever be used by one single person at a time.

In real life we use "locking" (locking the toilet door). This process of synchronizing access to a shared resource is called serialization, because access to the resource is serialized: users take turns, one after another. Locks are also used in computer science, in the form of a so-called mutex or semaphore (see Wikipedia).

If access to a shared resource is not properly synchronized for multi-threaded use, very subtle bugs may appear and even the whole program may crash.
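The "toilet door" idea can be sketched with the lock and semaphore functions of the Nasal thread module (newlock, lock, unlock, newsem, semup, semdown). The variable and function names are invented for this example, and it assumes that a new semaphore starts at zero, so that each semdown waits for one semup.

```nasal
var counter = 0;                      # the shared resource
var counter_lock = thread.newlock();  # the "door lock" protecting it
var finished = thread.newsem();       # used to signal "this worker is done"

var add_many = func {
    for (var i = 0; i < 10000; i += 1) {
        thread.lock(counter_lock);    # only one thread may pass at a time
        counter += 1;
        thread.unlock(counter_lock);  # let the next thread in
    }
    thread.semup(finished);
};

# Two workers modify the same counter concurrently.
thread.newthread(add_many);
thread.newthread(add_many);

# Wait once per worker, then read the final value.
thread.semdown(finished);
thread.semdown(finished);
print("counter: ", counter);
```

With the lock in place, both workers' increments should all be counted. Without it, two threads could read the same old value and one increment would be lost. Note that thread.semdown blocks the calling thread, so inside FlightGear's main thread it would pause the simulator until the workers finish; a real script would rather poll from a timer.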

Thread safety

In computer science, if access to a shared resource is properly synchronized to be usable from multiple threads of execution, we refer to this as being "thread safe". In other words, a piece of code is "thread safe" if it can be safely used from multiple threads, running at the same time.

So while Nasal itself is designed to be thread safe, the FlightGear APIs (the extension functions) are usually not (yet) thread safe at all. If they happen to be thread safe, it is by "accident", and it is not safe to rely on this.

This is worth keeping in mind, because even a valid and thread-safe Nasal program may crash FlightGear once it uses FlightGear APIs, simply because there is currently no synchronization being done for the Nasal extension functions.

So multi-threading is all about cooperation, coordination, synchronization and serialization, simply because there is no longer just one sequential thread of execution, but multiple concurrent ones, which may need to access the same data or code at some point. That's where you need to coordinate things.
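One possible way to respect this restriction is to let the worker thread do only pure Nasal work and leave every FlightGear API call to the main thread. The sketch below assumes this division of labour; the property path /sim/demo/worker-result and all variable names are hypothetical, while setprop and settimer are the usual FlightGear extension functions.

```nasal
var results = [];                      # shared between worker and main thread
var results_lock = thread.newlock();

# Worker thread: pure Nasal only, no FlightGear extension functions.
thread.newthread(func {
    for (var i = 1; i <= 5; i += 1) {
        var value = i * i;             # stands in for a real calculation
        thread.lock(results_lock);
        append(results, value);
        thread.unlock(results_lock);
    }
});

# Main thread: fetch finished results and only then touch FlightGear APIs.
var publish = func {
    thread.lock(results_lock);
    var pending = results;             # take everything produced so far
    results = [];
    thread.unlock(results_lock);

    foreach (var value; pending) {
        setprop("/sim/demo/worker-result", value);   # hypothetical property
    }
    settimer(publish, 1.0);            # check again in one second
};
publish();
```

The lock is held only while the list is swapped, not while properties are written, so the worker is never kept waiting for long. As written, the timer keeps rescheduling itself; a real script would stop once the worker is known to be done.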

Example

A group of n processes/people using e.g. the "yellow pages" to look up a telephone number is not problematic as long as the shared data is only accessed in a "read only" fashion and no person/process actually changes anything that is shared with other users.

This is further simplified if all processes/users can work with copies of the original data, instead of a "shared set" of data that would require synchronization.

The "yellow pages" example above is trivially parallelizable: if you hand out n copies of the yellow pages to n people/users, they can all do lookups in parallel and do not depend on one single data source. The possible performance gain comes simply from increasing the number of users and passing each user a copy of the data.
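The "one copy per user" idea might look like this in Nasal. The phone book entries are fictional, and copy_hash and start_lookup are helper names made up for this sketch. Each worker receives its own copy of the data before it starts, so the lookups need no coordination; only the shared answers hash, which several threads write to, is protected by a lock.

```nasal
# Fictional "yellow pages".
var phone_book = { alice: "555-0101", bob: "555-0102", carol: "555-0103" };

# Make an independent copy of a hash.
var copy_hash = func(h) {
    var c = {};
    foreach (var k; keys(h)) {
        c[k] = h[k];
    }
    return c;
};

var answers = {};                      # shared and written to, so locked
var answers_lock = thread.newlock();

var start_lookup = func(name) {
    var my_book = copy_hash(phone_book);   # private copy for this worker
    thread.newthread(func {
        var number = my_book[name];        # no lock needed: nobody else uses my_book
        thread.lock(answers_lock);
        answers[name] = number;
        thread.unlock(answers_lock);
    });
};

foreach (var name; ["alice", "bob", "carol"]) {
    start_lookup(name);
}
```

Copying costs some memory and time up front, but it removes the need to lock during the actual work, which is usually where most of the time is spent.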