FlightGear and OpenGL Core Profile

From FlightGear wiki
This article or section contains out-of-date information

Please help improve this article by updating it. There may be additional information on the talk page.

This article is a stub. You can help the wiki by expanding it.
OpenGL ES compatible subset of FlightGear
Started in: 04/2020
Description: Identify, patch and build a subset of FlightGear compatible with GLES
Contributor(s): talks
Status: RFC

Some FG users are very interested in seeing how FlightGear runs on lower-end/embedded hardware (think RPi-style) using OpenGL ES.

Other contributors and core developers want to use a more recent version of OpenGL in order to make use of more modern OpenGL features (e.g. in effects and shaders). This entails porting FlightGear to the OpenGL Core Profile.

The bigger issue here is we need to ditch PUI (which is in progress) and some OpenGL 1.0 code (HUD, 2D panels especially - can be #ifdef for now) so we can enable Core profile on Mac - since Mac 4.x support (we only hit about 4.3 alas, but with some extensions to get in sight of 4.5) is Core profile only, no Compatibility mode.

I *believe* the new open-source Intel Mesa drivers on Linux (which are supposed to be of decent quality, and even fast) might be in the same situation.[1]

Probably to do it initially, GLES1 would have to be used - I think GLES2 has no fixed function - and simgear/flightgear would have to be patched to use only OSG calls (if possible), e.g. by excluding features/subsystems that still use legacy OpenGL code incompatible with GLES.

OSG can have GLES{1,2} support without windowing compiled in.

OpenGL 1 and 2 can be mixed, but OpenGL ES 1 and OpenGL ES 2 can't. There is a performance price to pay for this backward compatibility. For this reason, we should get rid of PLIB, because this (old) code is all OpenGL 1 and may slow down the whole rendering pipeline.

For historical reasons, FlightGear and SimGear are mixing OpenGL 1 (old code) and OpenGL 2 (more recent code).[2]

Background

See Howto:Optimizing FlightGear for mobile devices for the main article about this subject.

The OpenSceneGraph port initiated in 2006 has never been fully completed, so that there is a certain amount of code making use of legacy OpenGL calls, which complicates modernizing the renderer.

In particular, this means that supporting multiple renderers is unnecessarily complicated, and unifying the 2D rendering back-end is a long-standing challenge. Only very recently has support for different renderers been worked on, thanks to the Compositor effort.

But even then, phasing out or porting legacy code still needs to be addressed sooner or later.

Being able to run fgfs on such comparatively low-powered systems using OpenGL ES can actually be a good thing for fgfs as a whole: it can help us understand bottlenecks that are hardly visible on typical gaming/developer rigs, but that may still show up over time (think leaking listeners/memory). It can also be considered a prerequisite for people wanting to target/build/run fgfs on other embedded hardware, such as thin clients with integrated GPUs or even mobile phones/tablets (think Android).

We need to remove PUI and change the Canvas not to use Shiva, to be ES2 compatible or Core-profile compatible.[3]

In other words, if the right people were to team up to specifically target such hardware, this could also mean significant performance improvements for people on powerful gaming rigs.

Approach

For starters, we can try the FlightGear Headless option to build a fgfs version without showing any graphics at all.

The next step will be identifying and excluding problematic sources (those containing legacy/raw OpenGL code, e.g. using glBegin() or glEnable() respectively):

https://sourceforge.net/p/flightgear/flightgear/ci/next/tree/src/Cockpit/render_area_2d.cxx

void RenderArea2D::RenderQuad( const SGVec2f *p) {
    glBegin(GL_QUADS);
        glNormal3f(0.0f, 0.0f, 0.0f);
        glVertex2fv( p[0].data() );
        glVertex2fv( p[1].data() );
        glVertex2fv( p[2].data() );
        glVertex2fv( p[3].data() );
    glEnd();
}

Whenever FlightGear sources contain such or similar code, it is pretty safe to assume that we will need to port/exclude such modules from compilation to ensure that no legacy OpenGL code is executed at runtime.
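As a rough illustration of what porting such a function involves: glBegin()/glEnd() and GL_QUADS are not available in the Core profile or in OpenGL ES, so a quad is typically expressed as two indexed triangles whose data is uploaded to a vertex buffer. The following is only a minimal, GL-free sketch of the CPU-side part; Vec2f, QuadMesh and makeQuadMesh are hypothetical names, not existing FlightGear/SimGear code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SGVec2f; only the two float components matter here.
struct Vec2f { float x, y; };

// CPU-side mesh data that could later be uploaded to a vertex buffer object.
struct QuadMesh {
    std::vector<Vec2f> vertices;         // the 4 corner positions
    std::vector<std::uint16_t> indices;  // 6 indices, i.e. 2 triangles
};

// Convert one quad (corners in the order they were passed to glVertex2fv)
// into an indexed triangle list: triangles (0,1,2) and (0,2,3).
QuadMesh makeQuadMesh(const std::array<Vec2f, 4>& p) {
    QuadMesh mesh;
    mesh.vertices.assign(p.begin(), p.end());
    mesh.indices = {0, 1, 2, 0, 2, 3};
    assert(mesh.indices.size() == 6);
    return mesh;
}
```

In an OSG-based port, the same data would more likely end up in an osg::Geometry than in hand-written buffer code, which is in line with the idea of using only OSG calls.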

Instrumentation/HUD/HUD_runway.cxx
Instrumentation/HUD/HUD_tbi.cxx
Instrumentation/HUD/HUD.hxx
Instrumentation/HUD/HUD_instrument.cxx
Instrumentation/HUD/HUD_ladder.cxx
Instrumentation/HUD/HUD_dial.cxx
Instrumentation/HUD/HUD_tape.cxx
Cockpit/render_area_2d.cxx
Cockpit/panel.cxx
GUI/WaypointList.cxx
GUI/CanvasWidget.cxx
GUI/MapWidget.cxx

This means, we'll need to review/disable the compilation of the following folders in $FG_SRC and patch up all hard-coded references to these systems:

  • Instrumentation/HUD
  • Cockpit
  • GUI

In addition, there's Viewer/PUICamera.cxx, which references glEnable().

Example: Disabling PUI

See Developing using CMake for the main article about this subject.

We should add separate build options to explicitly disable certain features individually.

The bigger issue here is we need to ditch PUI (which is in progress) and some OpenGL 1.0 code (HUD, 2D panels especially - can be #ifdef for now) so we can enable Core profile on Mac - since Mac 4.x support (we only hit about 4.3 alas, but with some extensions to get in sight of 4.5) is Core profile only, no Compatibility mode.[4]

Knowing that PUI contains legacy OpenGL code and knowing it's scheduled to be removed anyway because it isn't compatible with modern OpenGL, we will disable it completely without reviewing/porting individual PUI files. This means opening $FG_SRC/CMakeLists.txt to add a new option to disable PUI.


Successfully disabling a feature means primarily:

  • adding a corresponding new build option to the top-level CMakeLists.txt (e.g. DISABLE_PUI)
  • opening fg_init.cxx and navigating to the lines where the feature/subsystem is initialized
  • wrapping the corresponding code in between #ifdef...#endif blocks
  • locating any remaining references to the subsystem in question and repeating the last step there to ensure that removed subsystems are not accessed at runtime

https://sourceforge.net/p/flightgear/flightgear/ci/next/tree/CMakeLists.txt

option(SYSTEM_CPPUNIT    "Set to ON to build Flightgear with the system's CppUnit library")
option(DISABLE_PUI    "Set to ON to build Flightgear without PUI support")

if(DISABLE_PUI)
      add_definitions(-DDISABLE_PUI)
endif(DISABLE_PUI)

We will be adding #ifdef macros to the sources in question and updating CMakeLists.txt accordingly, ideally in conjunction with a feature-specific build option to disable the corresponding feature (think PUI or the HUD).

Thus, after editing CMakeLists.txt, we need to open fg_init.cxx to prevent initialization of PUI.


https://sourceforge.net/p/flightgear/flightgear/ci/next/tree/src/Main/fg_init.cxx#l1043

    ////////////////////////////////////////////////////////////////////
    // Create and register the XML GUI.
    ////////////////////////////////////////////////////////////////////
#ifndef DISABLE_PUI
    globals->add_subsystem("gui", new NewGUI, SGSubsystemMgr::INIT);
#endif


So, after editing fg_init.cxx to prevent the PUI GUI from being initialized by FlightGear, we need to find the remaining hard-coded references to it and fix those up so that they can deal with PUI not being available. This means grepping $FG_SRC for any references to "pui" to locate remaining get_subsystem() calls (FIXME: new subsystem lookups use templates). This will include code in unrelated modules, e.g. fgcommands (think menu bindings) accessing the GUI via something like get_subsystem("gui"):

https://sourceforge.net/p/flightgear/flightgear/ci/next/tree/src/Main/fg_scene_commands.cxx#l277

    NewGUI * gui = (NewGUI *)globals->get_subsystem("gui");
    if (!gui) {
      return false;
    }
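To illustrate why this null check matters in a non-PUI build, here is a minimal, self-contained sketch of a by-name subsystem lookup that returns a null pointer when the "gui" subsystem was never registered. Subsystem, SubsystemRegistry and the stripped-down NewGUI below are hypothetical stand-ins written for this example; they are not the real SGSubsystemMgr API.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-ins for SGSubsystem and NewGUI.
struct Subsystem { virtual ~Subsystem() = default; };
struct NewGUI : Subsystem { bool redraw() { return true; } };

// Hypothetical registry: maps subsystem names to owned instances.
class SubsystemRegistry {
public:
    void add(const std::string& name, std::unique_ptr<Subsystem> s) {
        subsystems_[name] = std::move(s);
    }
    // Returns nullptr when nothing was registered under this name,
    // e.g. because registration was compiled out via DISABLE_PUI.
    Subsystem* get(const std::string& name) const {
        auto it = subsystems_.find(name);
        return it == subsystems_.end() ? nullptr : it->second.get();
    }
private:
    std::map<std::string, std::unique_ptr<Subsystem>> subsystems_;
};

// Mirrors the null check above: the command fails cleanly without a GUI.
bool do_gui_redraw(const SubsystemRegistry& registry) {
    auto* gui = dynamic_cast<NewGUI*>(registry.get("gui"));
    if (!gui) {
        return false;
    }
    return gui->redraw();
}
```

Call sites that already check the returned pointer degrade gracefully; call sites that dereference it unconditionally are the ones that need patching.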

Thus, it makes sense to intercept registration of these fgcommands at the bottom of the file by wrapping these inside ifdef macros:

https://sourceforge.net/p/flightgear/flightgear/ci/next/tree/src/Main/fg_scene_commands.cxx#l497

/**
 * Table of built-in commands.
 *
 * New commands do not have to be added here; any module in the application
 * can add a new command using globals->get_commands()->addCommand(...).
 */
static struct {
  const char * name;
  SGCommandMgr::command_t command;
} built_ins [] = {
    { "exit", do_exit },
    { "reset", do_reset },
    { "reposition", do_reposition },
    { "switch-aircraft", do_switch_aircraft },
    { "panel-load", do_panel_load },
    { "preferences-load", do_preferences_load },
    { "toggle-fullscreen", do_toggle_fullscreen },
    { "screen-capture", do_screen_capture },
    { "hires-screen-capture", do_hires_screen_capture },
    { "tile-cache-reload", do_tile_cache_reload },
#ifndef DISABLE_PUI
    { "dialog-new", do_dialog_new },
    { "dialog-show", do_dialog_show },
    { "dialog-close", do_dialog_close },
    { "dialog-update", do_dialog_update },
    { "dialog-apply", do_dialog_apply },
    { "open-browser", do_open_browser },
    { "gui-redraw", do_gui_redraw },
#endif
    { "add-model", do_add_model },
    { "presets-commit", do_presets_commit },
    { "press-cockpit-button", do_press_cockpit_button },
    { "release-cockpit-button", do_release_cockpit_button },
    { "dump-scenegraph", do_dump_scene_graph },
    { "dump-terrainbranch", do_dump_terrain_branch },
    { "print-visible-scene", do_print_visible_scene_info },
    { "reload-shaders", do_reload_shaders },
    { "reload-materials", do_materials_reload },
    { "open-launcher", do_open_launcher },
    { 0, 0 }			// zero-terminated
};

This will ensure that fgcommands that are PUI related won't be available in a non-PUI build.
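The effect of the guard can be reproduced in a small standalone sketch. It assumes the build was configured with DISABLE_PUI (simulated here with a #define), uses a shortened table, and adds a hypothetical findCommand() helper that is not part of fg_scene_commands.cxx.

```cpp
#include <cassert>
#include <cstring>

// Assumption for this sketch: simulate configuring the build with -DDISABLE_PUI.
#define DISABLE_PUI 1

using command_t = bool (*)();

static bool do_exit() { return true; }
#ifndef DISABLE_PUI
static bool do_dialog_show() { return true; }
#endif

// Shortened version of the zero-terminated built-in command table.
static const struct {
    const char* name;
    command_t command;
} built_ins[] = {
    { "exit", do_exit },
#ifndef DISABLE_PUI
    { "dialog-show", do_dialog_show },
#endif
    { nullptr, nullptr }  // zero-terminated
};

// Hypothetical helper: linear lookup; nullptr means "no such command".
command_t findCommand(const char* name) {
    for (int i = 0; built_ins[i].name != nullptr; ++i) {
        if (std::strcmp(built_ins[i].name, name) == 0) {
            return built_ins[i].command;
        }
    }
    return nullptr;
}
```

With the macro defined, looking up "dialog-show" yields no command, so callers of such bindings need to handle a missing command rather than assume it exists.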

We will also need to look for other files accessing the "NewGUI" subsystem, using grep -nr "NewGUI" -l:

Main/subsystemFactory.cxx
Main/fg_scene_commands.cxx
ATC/atcdialog.cxx
ATC/atcdialog.hxx
Autopilot/route_mgr.cxx

Since we have excluded $FG_SRC/GUI from the build and updated fg_init.cxx, we will only need to review/patch these files to ensure that there are no hard-coded references to PUI if it's not available, using the same ifdef based approach.

Obviously, fgdata level resources like menu bindings and/or Nasal code may still try to execute such bindings.

Finally, there may still be other PUI specific references in the source tree ($FG_SRC), so that it does help to check the other folders next: grep -nr "PUI" -l

Since we have excluded $FG_SRC/GUI from the compilation, we can safely ignore any references to that folder, with the remaining ones being:

Input/FGMouseInput.cxx
Main/fg_os.hxx
Main/bootstrap.cxx
Viewer/renderer_compositor.hxx
Viewer/PUICamera.cxx
Viewer/renderer_legacy.cxx
Viewer/CMakeLists.txt
Viewer/PUICamera.hxx
Viewer/renderer_compositor.cxx
Viewer/renderer_legacy.hxx
Viewer/GraphicsWindowQt5.cpp
Canvas/canvas_mgr.hxx

The renderer is basically just referencing the PUI camera so that the GUI can be drawn, whereas canvas_mgr merely accesses the PUI subsystem to be able to render canvas based textures. In other words, both references can be easily removed for testing purposes.

Affected features and sources

| Feature | Directory | Notes | Status |
|---|---|---|---|
| HUD | $FG_SRC/Instrumentation/HUD | One of the longterm development items we have is to replace the hardcoded HUD with one built on canvas. As well as allowing us to remove another piece of plib, it would allow simple overlays like this as well. So I think the answer here is to do that work, and then implement a canvas HUD for this overlay.[5] | 10% completed |
| PUI | $FG_SRC/GUI | Trivial to disable, can be replaced via Phi. Is in the process of being replaced by a Canvas based UI. | 70% completed |
| 2D Panels | $FG_SRC/Cockpit | See Populate /canvas property tree for the 2D panel. There's also a branch (a few years old) containing WIP on implementing 2D panels as canvas. It uses the same loading / updating logic in C++ (for 100% compatibility), but rather than building custom rendering, it builds up a Canvas element hierarchy in C++.[6][7] | 80% completed |
| Canvas Path (Shiva) | $SG_SRC/canvas | See Shiva Alternatives, and Scott's and James' comments. | 20% completed (02/2022) |
| Effects | $FG_ROOT/Effects | Will include Shaders and probably involve a custom Compositor pipeline; for current status refer to OpenGL#Status. | 70% completed |

In addition, SimGear needs special treatment, too [8]. Using the same heuristics as before (based on grepping $SG_SRC for glBegin and glEnable), we end up with the following sources (ignoring Canvas Path / ShivaVG, which is already dealt with above):

simgear/scene/tgdb/SGVasiDrawable.cxx:42:    glBegin(GL_POINTS);
simgear/scene/model/shadanim.cxx



Porting

OpenGL ES

Vertex buffer objects

Canvas.Path (OpenVG)

See Canvas Path for the main article about this subject.

Note  For the patch adding explicit Canvas.Path level synchronization, refer to this forum thread from 08/2021[9]


Will likely need to replace ShivaVG with an OpenGL 2.0 based implementation like NanoVG [10].

As part of the Core profile migration, we need to replace ShivaVG (which is the functional guts of Path.cxx) with a shader based implementation, ideally NanoVG, although Scott has indicated this might not be as easy as originally hoped. [11]

Furthermore, there are segfaults/race conditions that are reportedly related to Canvas Path: Based on what we've seen and discussed so far, I still think that it's primarily Canvas.Path (specifically ShivaVG) that we need to look at - aggressive OSG threading is ... aggressive. [12]

Which is to say, the Canvas.Path element is not even using native OSG code, and the flickering shown in the screen shots, suggests that it's something related to Canvas.Path handling.

Inside a gdb session, you'll probably see something related to $SG_SRC/canvas/elements/shiva - which is where the 3rd party sources related to ShivaVG reside:

https://github.com/ileben/ShivaVG/blob/master/README
No multi-threading support has been implemented yet.

This is also pointed out in the Qt 4 docs, from when shiva was used:

https://doc.qt.io/archives/qt-4.8/openvg.html
The paint engine is not yet thread-safe, so it is not recommended for use in threaded Qt applications that draw from multiple threads. Drawing should be limited to the main GUI thread.

I.e. the shiva code is called by C++ code from multiple threads - otherwise, seeing flickering specific to map elements implemented via OpenVG (Canvas.Path) would not make sense.

Shiva itself cannot be considered thread-safe, but may be invoked from multiple threads in fgfs once fgfs is being used in conjunction with CompositeViewer.

So, what the OP in 08/2021 was probably seeing is this:

  1. setting up multiple windows (using CV or not, probably irrelevant, since both options share the same CameraGroup back-end)
  2. using OSG multi-threading (i.e. no Singlethreaded)
  3. OSG will implicitly try to run some stuff asynchronously using worker threads
  4. this is why shiva gets called from multiple OSG threads, despite shiva itself not being thread-safe

This is neither specific to the Canvas ND, nor to the CompositeViewer mode.

As far as I can tell right now, it's due to ShivaVG - which some core devs have been wanting to replace/update for years anyway, see for example James' comments on "Skia": Canvas_news#Skia_talks.

Note  It's important to highlight that the issue is apparently not specific to Qt5, nor to the CompositeViewer or the Compositor - it's a bug that people can probably also trigger when using aggressive OSG threading and a single window, since the shiva back-end may get called from multiple OSG threads.

People using single-threaded mode, will still have the same issue built into the binary, but won't trigger it. People using multi-screen setups or the CompositeViewer mode, will use additional OSG threads which will run into the ShivaVG related issue.

For the time being, the workaround is using single-threaded. A "fix" would be porting/fixing Canvas.Path to get rid of Shiva or use a different back-end.

An interim solution would be using explicit synchronization (locks/mutexes) to tell OSG not to use threading for Canvas.Path based drawables - according to the docs, that should be possible by setting the data variance to osg::Object::DYNAMIC - but for the drawable itself that is already being done, thus CanvasPath.cxx will probably need a review to add explicit OSG/OpenThread based synchronization (which is probably something best discussed with Fernando, Richard and James on the devel list/issue tracker). [13]


It does make sense that the OP can trigger this issue reliably, because he happens to be creating 3 different windows (=threads) - whereas I have been testing with a single additional window, which is why the issue probably takes time to show up. With multiple concurrent threads, the shiva code may get called from different threads, so that it will probably segfault rather "reliably".

In the meantime, there is no "fix" per se - the workaround is to use single-threaded mode when using canvas/shiva related features.

Alternatively, we could look at reworking the CanvasPath drawable implementation so that it's using explicit synchronization, to ensure it never gets called from multiple OSG threads.[14]
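What "explicit synchronization" could look like in principle is sketched below: every call into a non-thread-safe back-end is funnelled through a single mutex. UnsafeBackend, SerializedBackend and drawFromThreads are hypothetical names made up for this example and do not correspond to CanvasPath.cxx or ShivaVG. Note also that a mutex only serializes access; it would not by itself address OpenGL context/thread affinity, so this is a sketch of the locking idea rather than a proposed fix.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a non-thread-safe rendering back-end:
// it mutates shared state without any internal locking.
struct UnsafeBackend {
    long drawCalls = 0;
    void draw() { ++drawCalls; }
};

// Wrapper that funnels every call through one mutex, so the back-end
// is never entered by two threads at the same time.
class SerializedBackend {
public:
    void draw() {
        std::lock_guard<std::mutex> lock(mutex_);
        backend_.draw();
    }
    long drawCalls() {
        std::lock_guard<std::mutex> lock(mutex_);
        return backend_.drawCalls;
    }
private:
    std::mutex mutex_;
    UnsafeBackend backend_;
};

// Drive the wrapper from several threads, loosely imitating multiple
// per-window draw threads, and return the total number of draw calls.
long drawFromThreads(int threads, int callsPerThread) {
    SerializedBackend backend;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&backend, callsPerThread] {
            for (int i = 0; i < callsPerThread; ++i) {
                backend.draw();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return backend.drawCalls();
}
```

Because every increment happens under the lock, no update is lost regardless of how the threads interleave; without the lock, the same loop would be a data race.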

The current analysis seems to be spot-on and in line with comments found in the osg-users archives:

https://groups.google.com/g/osg-users/c/nH-73NNFw4A

when using DrawThreadPerContext or CullThreadPerCameraDrawThreadPerContext threading models the StateSet and Drawable DataVariance is used to prevent dynamic leaves of the scene graph being updated and rendered at the same time - the draw traversal holds back the main thread till all the dynamic objects have been dispatched.


https://groups.google.com/g/osg-users/c ... CkzY9geHEJ
The thread safety provided by the OSG isn't quite what you are
assuming, just setting the DataVariance to DYNAMIC only affects
whether the update, event and cull traversals of the current frame can
be run multi-thread the draw traversal of the previous frame, and this
hint is only applicable to StateSet and Drawable and is used to
prevent multi-threading where objects are that are being modified by
the update or event traversals running concurrently with the draw
thread that is reading from them. This threading model is
light-weight in that it avoids the need to large numbers of mutex
locks or multi-buffering, but it doesn't provide an means for general
multi-threading. The OSG's multi-threading also can handle running
cull or draw threads on multiple contexts in parallel, and with
database paging, again this is done a light-weight manner than scales
well and has a small overhead. The design is very much geared towards
the needs of high performance graphics applications rather than
general purpose multi-threading.

So... you'll need to take a step back and work on how to work best
with the design of the OSG. The OSG is designed to allow single
threaded updates of the scene graph during the update and event
traversals. If you do wish to do some work multi-thread preparing new
scene graph elements these can be done as a separate subgraph in a
separate thread then merged with the main scene graph during the
update phase - this is how the osgDB::DatabasePager works.


https://groups.google.com/g/osg-users/c ... YJEx_jcAYJ
Modifying the scene graph outside of the frame call is safe in
SingleThreaded and CullDrawThreadPerCamera threading models as they
don't leave any threads active after the end of the
renderingTraversals() method (called from frame()).

With DrawThreadPerContext and CullThreadPerCameraDrawThreadPerContext
the draw threads will still be active on completion of the
renderingTraversals(), so if you modifying drawables and state that
the thread is still reading from in the draw traversal you will end up
with problems - and potential crashes. There is a standard mechanism
to deal with this issue - and that is the renderingTraversals() method
to block till all dynamic objects in the draw traversals have been
dispatched. The way you tell the draw traversal that an drawable or
stateset will be modified dynamically is to set its data variance to
DYNAMIC.

drawable->setDataVariance(osg::Object::DYNAMIC);
stateset->setDataVariance(osg::Object::DYNAMIC);

This is mentioned in the "Quick Start Guide" book, as well as many
times on the osg-users mailing list so have a look through the
archives if you want more background reading.

osg::Object::DYNAMIC is already being set on the drawable itself (via the interface of sc::Element), but probably not yet on the StateSet (?), so that might be worth trying next. Other than that, there's probably no "real fix" other than telling OSG not to execute shiva code from multiple threads, short of fixing Shiva itself, which seems unlikely since a number of core devs have been wanting to get rid of it anyway.

So far however everything points at canvas, and looking at canvas code, it seems that there's a lot of stuff going on in update and cull callbacks that looks to me like it shouldn't. I don't really understand the architecture well enough to tell for sure though, or to figure out how to fix it.[15]

From a quick eyeball just now, the code in CanvasPath.cxx looks /mostly/ correct to me:

  • all setXYZ methods set a dirty flag
  • there is an OSG updateCallback which updates the custom drawable
  • drawImplementation calls the rendering commands, with some extra work to synchronise the current OSG state with the Shiva use of state.

Importantly, the vgPath is created inside update, but this /should/ only touch CPU state. Whereas if the attributes are dirty for say fill or colour or opacity, the vgPaint is re-created, and this likely touches OpenGL commands inside Shiva, so it’s done inside drawImplementation.

So, I don’t see anything grossly incorrect here, with my understanding of what OSG expects, and what Shiva does. Importantly, I don’t see anything special related to culling at all. My recollection is that even in the most aggressive threading modes, OSG won’t update a node at the same time as drawing it, which is why there’s no internal locking: if this guarantee did not hold, I think almost every OSG Group/Node would need to internally lock its state, with enormous overhead.

(There is a complication here around STATIC vs DYNAMIC nodes, but the Path node is tagged DYNAMIC, so the above guarantee should still be correct)[16]
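The setter/update/draw split described above can be modelled without any OpenGL or OSG code. The sketch below is an assumption-laden illustration only: PathModel and its members are hypothetical, and the counters merely stand in for "rebuild the vgPath" and "re-create the vgPaint".

```cpp
#include <cassert>

// Hypothetical, GL-free model of the dirty-flag pattern described above.
class PathModel {
public:
    // Setters only record the new value and mark the relevant state dirty.
    void setStrokeWidth(float w) { strokeWidth_ = w; pathDirty_ = true; }
    void setFillColor(unsigned rgba) { fill_ = rgba; paintDirty_ = true; }

    // Update phase: rebuild CPU-side geometry only if something changed.
    void update() {
        if (pathDirty_) {
            ++pathRebuilds_;
            pathDirty_ = false;
        }
    }

    // Draw phase: re-create GPU-facing paint state only if it is dirty,
    // then issue the (here: counted) draw.
    void draw() {
        if (paintDirty_) {
            ++paintRebuilds_;
            paintDirty_ = false;
        }
        ++draws_;
    }

    int pathRebuilds() const { return pathRebuilds_; }
    int paintRebuilds() const { return paintRebuilds_; }
    int draws() const { return draws_; }

private:
    float strokeWidth_ = 1.0f;
    unsigned fill_ = 0;
    bool pathDirty_ = false;
    bool paintDirty_ = false;
    int pathRebuilds_ = 0;
    int paintRebuilds_ = 0;
    int draws_ = 0;
};
```

The point of the split is that work touching GL state happens only in the draw phase, while the update phase stays CPU-only; the model says nothing about which thread runs each phase.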

Community talks

References

Related content