Troubleshooting crashes

From FlightGear wiki
Revision as of 04:14, 28 April 2014 by Hooray (talk | contribs) (→‎Minimal Rembrandt Startup Profile: http://forum.flightgear.org/viewtopic.php?f=17&t=22853&p=207346#p207346)

Targeted FlightGear versions: 2.6, 2.8, 2.10, 2.12, 3.0+

Motivation

This article should be suitable for pretty much any OSG-based version beyond 2.6/2.8. The 2.10 disclaimer below only concerns a specific GLSL issue that was introduced in FG 2.10 and has been fixed in 2.11+. The point of this write-up is to provide a dedicated place for people to go with these sorts of problems, to reduce the support workload, and to give end-users some guidance so that they can experiment a little and provide more specific feedback. The idea is to keep it up-to-date for upcoming FG versions, too, including 2.12 and beyond. Any and all help is appreciated, so please feel free to get involved and contribute to the article by adding your own findings, suggestions and techniques.

If you should find yourself having to disable additional settings in order to further maximize FlightGear performance, please do add your findings to this article, so that others can benefit.

Our hope is that we can come up with a fairly safe subset of FlightGear settings that should work for 99% of our users, even on moderately old hardware. This would enable us to change the FlightGear defaults accordingly, and use these settings as a safe fallback alternative - so that FlightGear should no longer just crash for people without certain hardware/features.

There's a trend to expose more and more information via the property tree, to make the simulator increasingly runtime-configurable, and to allow subsystems to be re-initialized at runtime. Being able to start up FlightGear in some form of "safe" mode is therefore generally a good thing, and it is useful for anybody helping with troubleshooting on the forums, because it helps us exclude many potential problems. In the long run, such a profile could also be used as the foundation for a FlightGear Benchmark or some scripted regression suite to run FlightGear in a headless mode to help with release preparations.

The Issue with 2.10

Note: FlightGear 2.10 users seeing crashes during simulator startup (without seeing the actual aircraft/scenery) with very old graphics cards that lack GLSL (shader) support should know that there used to be a bug in 2.10 that broke support for graphics cards without any shader support. In other words, if you are able to run other OpenGL software, but cannot even display the about dialog in FG 2.10 using the minimal startup profile detailed below, it is almost certainly your graphics card's lack of shader support causing the issue, because FlightGear 2.10 tries to copy graphics card information to the property tree for debugging purposes, to allow end-users to provide better troubleshooting reports - ironically, this very feature introduced crashes for all 2.10 users without any GLSL support. This issue has been fixed in FlightGear 2.11+

Crashes in general

If FlightGear crashes (or looks/performs very badly/slowly), this can be due to a number of reasons, such as limited hardware resources (CPU, GPU, RAM etc) or insufficient driver support, so crashes are not necessarily due to a faulty program. FlightGear does work for hundreds of forum users, and usually without any form of sophisticated prior setup or configuration. So if something doesn't work "out of the box" it's usually some local or system-specific issue. In general, you are unlikely to encounter any major bugs when using a release. If you don't mind having to deal with bugs and technical issues, we'd like to invite you to help us test prereleases (so called "release candidates"/RCs) and nightly builds, or to build directly from git.

Performance Issues

This article or section requires updating due to being mostly based on a forum/mailing list conversation, its style is usually not suitable for the wiki/newsletter. For example, due to using first person speech or lacking proper cquotes.

Please help improve the article by updating it. There may be additional information on the talk page.

If working through this article doesn't solve the problem, you are probably facing a hardware/driver issue, which means that you may need to upgrade (or even downgrade) your drivers, and/or disable some settings on the driver side of things.

If however, the "minimal startup profile" stops the error from occurring, that would suggest that FG is using some "code paths" that are not supported by your current driver - this isn't all that uncommon, we have an increasing number of GLSL shaders, while most of the existing FG code was really only written with a fixed rendering pipeline in mind - and in fact, much of it even predates the OSG port, such as for example the GUI (PLIB/PUI).

It isn't that far-fetched to think that these factors may not be very well supported under certain circumstances. In general, nvidia drivers are often understood to perform better (despite being closed-source). It would also be interesting to know whether "modern" OSG code exhibits similar problems. To check that, use any of the Canvas-based GUI dialogs/instruments, for example, and please report back here.

In summary, our way of using OSG is not particularly optimized, and we're doing a lot of things that are known to be inefficient, such as having lots of GL state changes and using legacy GL code in conjunction with more modern code. All of these things carry performance penalties, and they also affect compatibility, especially because GLSL, unlike DirectX shader code, is not distributed as bytecode but compiled on-the-fly by your driver. In other words, each GPU vendor will typically have their own GLSL compiler implementation, and these are known to be fragile under certain circumstances. As an open source project, we do not have the resources to test each new or modified shader on all major hardware platforms. We therefore rely on end-user feedback, and also on end-users being able and willing to read up on troubleshooting such issues so that they can provide better informed feedback, e.g. by using tools like gDebugger or the corresponding ATI/AMD and nvidia equivalents.

Writing portable cross-platform code is tricky in and of itself, but that problem is already solved by FlightGear - however, supporting different GPU vendors and makes is basically an identical challenge these days, because hardware, drivers and GLSL compilers differ hugely when it comes to quality and performance.

Some people have been suggesting to our core/shaders developers to switch to AMD/ATI or even Intel hardware in order to get rid of certain problems.

But it's not as simple as that to be honest: FlightGear is a fairly old code base, and it also isn't particularly modern - these days, many parts are basically unmaintained, and haven't been touched in years, despite containing lots of legacy code.

OSG is much more powerful than you may think, but it cannot magically fix all the problems that FlightGear introduces, we have a ton of features that basically still date back to the pre-OSG days, i.e. when we were using purely PLIB and SDL. OSG itself is generally rock-solid and there are rarely any issues found with it. To see for yourself, just run osgviewer or any of the OSG examples.

As recently pointed out elsewhere, those OSG examples even support Intel GMA cards, often even shaders to some degree - but we have never really formalized the way OpenGL/OSG code is written/developed for FlightGear, including effects and shaders.

And as mentioned previously, writing portable shaders is made unnecessarily difficult due to the nature of GLSL itself, and due to the fact that we cannot easily develop/test things on different hardware. Most contributors really only have 1-2 computers - typically, with nvidia hardware, specifically purchased for running FG and other 3D software.

This is a volunteer-driven project, you cannot really expect people to spend thousands of dollars on hardware that they don't need. Some of our shader developers already spent more money on certain hardware than 99% of our users probably, including core developers. So it really isn't fair to suggest that all those shader problems are due to "bad coding habits".

First of all, you should really check whether the error is in any way affected by NOT using shaders at all. If the error persists with shaders disabled, it is obviously unrelated to them.

Then again, it is even possible that there are bugs in the C++/OSG code in FG, and that the nvidia drivers just are more forgiving, or even just have better/more lenient (or aggressive) optimization techniques.

Yeah, it is true that most commercial games will perform rather well on modern hardware where FG may typically show single digit frame rates, and even crash - but that is unlikely to be due to GLSL/shaders at all. There are more factors involved here, outside the reach of people doing primarily Nasal/GLSL development.

To see for yourself, just run osgviewer or even fgviewer and see if certain errors show up or not. That is a good thing to do troubleshooting-wise, and it makes it easier for us to check what MIGHT be going on.

I am not sure if this is just a "driver" issue - even if there's no problem on the driver side of things, it clearly is an incompatibility - regardless of the reason. Commercial projects typically have the manpower and resources to do lots of testing so that shaders and other code can be adjusted accordingly.

So far, this has not been a focus of the FlightGear project, and it is not reasonable to ask individual developers to handle this by suggesting that they should get certain hardware, and even pay for it ...

For example, I have access to 4 different computers, but I don't typically build/run/test FlightGear on all of them - even though I could do that, but there are more enjoyable things to do admittedly. Still, I try to provide the corresponding step-by-step instructions to tell people how to troubleshoot problems - but don't expect me to do all this on my own, let alone purchase additional hardware each month, just because someone may be running into certain issues.

Possible causes

It's just as well possible that the FlightGear default settings are not supported with your hardware/drivers, so that the process will be killed by the OS because it's trying to do something not supported by your computer. If you are experiencing full system crashes, you should see System Crashes instead.

By default, FlightGear 2.6+ assumes a certain runtime environment and certain GPU features (e.g. shader support) that may not be supported by older cards, such as GeForce 6/7 generation hardware and most hardware older than 4-5 years. FG will probably not work "out of the box" on such computers and may eventually get terminated by your operating system, so the default startup settings need to be customized to disable all unsupported default features. See Settings for slower graphics cards, Problematic Video Cards and FlightGear Hardware Recommendations for more details.

Also, make sure you've got the newest driver for your video card installed and running without errors. A suggestion for a thorough updating procedure can be found here.

Before proceeding any further, please make sure that you are actually able to run other OpenGL-based software/simulators/games, unrelated to FlightGear!

The startup profile detailed below disables many features that are known to have a massive impact on frame rate and overall performance, at the cost of making FG look plain and crude. As such, it gives you a rough idea of the maximum performance fgfs is able to achieve on your system. You can then re-enable individual settings one by one and watch their impact on frame rate, so that you can determine which settings are particularly problematic and put together a custom profile that meets a goal of 60-100 fps, for example. On a 2009-era laptop, you could hope for ~790 fps and 3 ms of frame spacing in the "ugly" mode; by hand-picking and tweaking settings as needed, it is possible to stay beyond 50 fps with 75-85% of the eye candy enabled, even on fairly moderate GPU hardware.

Running FlightGear with minimal settings

In such cases, it is important to start up FlightGear with minimum settings, and start re-enabling features step by step to see what settings are supported by your hardware. You may want to raise the FlightGear log level, so that you get to see where the program stops starting/working, and what it was doing before it got terminated.
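
Raising the log level can be done with a startup option. The following is a minimal Python sketch, assuming the fgfs option --log-level and its usual level names (bulk, debug, info, warn, alert); the helper name with_log_level is hypothetical and only illustrates how the option could be appended to an existing list of startup options.

```python
# Log levels assumed to be accepted by fgfs via --log-level=<level>.
LOG_LEVELS = ("bulk", "debug", "info", "warn", "alert")


def with_log_level(options, level="debug"):
    """Return a copy of the startup options with exactly one --log-level entry.

    'options' is a list of fgfs command-line options (strings).
    Any existing --log-level option is replaced.
    """
    if level not in LOG_LEVELS:
        raise ValueError("unknown log level: " + level)
    kept = [opt for opt in options if not opt.startswith("--log-level=")]
    return kept + ["--log-level=" + level]


if __name__ == "__main__":
    # Example: add debug logging to a two-option profile.
    print(with_log_level(["--aircraft=ufo", "--airport=ksfo"]))
```

The more verbose levels (debug, bulk) produce a lot of console output, so it is usually enough to enable them only while hunting for the point where startup stops.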

The following settings are really just intended to provide a minimal startup profile, with ALL eye candy disabled by default - so that FG doesn't crash due to lack of certain features (OpenGL, shaders etc) and you get a "bare bone" FlightGear window. They use the UFO, as it is the least complex aircraft (the default aircraft is the Cessna 172) and a suitable scenery location to rule out aircraft or scenery as a cause for crashes.

Once you have launched FlightGear with these minimal settings, you can enable all graphics options one by one. If an option incurs a huge performance hit, take into consideration whether it is necessary for your flying experience or not.

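The performance impact can be evaluated by using the frame rate and frame spacing displays, the performance monitor and the OSG on-screen stats (see Howto:Use the system monitor).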

It makes sense to keep using the ufo until you have optimized all settings accordingly and found out what's working and what isn't, i.e. don't even touch the aircraft settings or the location settings (airport, azimuth, offset) until you have really optimized everything else already.
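
The "re-enable one option at a time" procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name reenable_steps and the candidate table are hypothetical, and the --enable-* counterparts of the --disable-* options are assumed to exist in your FlightGear version. Each generated option list differs from the baseline in exactly one setting, so a change in frame rate can be attributed to that setting.

```python
def reenable_steps(baseline, candidates):
    """Yield (name, options) pairs for testing one feature at a time.

    baseline   -- list of fgfs options with everything disabled
    candidates -- list of (name, disabled_option, enabled_option) tuples
    Each yielded option list is the baseline with one feature switched on.
    """
    for name, disabled_opt, enabled_opt in candidates:
        options = [enabled_opt if opt == disabled_opt else opt
                   for opt in baseline]
        yield name, options


# Illustrative baseline and candidates (a small subset of the profile).
BASELINE = [
    "--aircraft=ufo",
    "--disable-clouds",
    "--disable-random-objects",
    "--prop:/sim/rendering/particles=0",
]
CANDIDATES = [
    ("clouds", "--disable-clouds", "--enable-clouds"),
    ("random objects", "--disable-random-objects", "--enable-random-objects"),
    ("particles", "--prop:/sim/rendering/particles=0",
     "--prop:/sim/rendering/particles=1"),
]

if __name__ == "__main__":
    for name, options in reenable_steps(BASELINE, CANDIDATES):
        print(name, "->", " ".join(options))
```

Note down the frame rate for each run; the aircraft and location options stay untouched throughout, as recommended above.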

Performance targets

Generally, try to optimize FlightGear for at least 30 fps before changing to a regular aircraft and/or location! Realistically, the ufo is not very representative, so it would be better to aim for something between 50 and 100 fps prior to switching to a more complex aircraft like the c172p, for example. Aircraft like the A380, IAR80, Concorde or 777 are known to require fairly powerful hardware.

Thus, if you are not getting frame rates much higher than 100 fps with the minimal startup profile (without using frame throttling, i.e. locking the refresh rate at a certain frequency, obviously), then using complex aircraft (777,787) or scenery (KSFO, KLSV etc) will be next to impossible with your system.

Minimal Startup Profile

Note  If you find any other settings causing a massive improvement of performance/compatibility, please do feel free to add those below, so that others can benefit from your findings. That's the whole point of keeping this on the wiki, so that everybody can contribute. Eventually, we're hoping to use these findings to come up with a reliable subset of FlightGear that can be used to start up FlightGear in some form of "compatibility mode", that can be assumed to "just work" for the majority of FlightGear setups and users. This is going to be particularly important for efforts like FlightGear Headless, FlightGear Benchmark and FGCanvas.

As mentioned, the point of the following settings is to disable everything that could have an effect on performance, compatibility and stability, to ensure that we get a working FlightGear window up and running with no eye candy at all, and maximum frame rate. Once that is working, features can be re-enabled step by step.

Depending on the hardware specs of the target platform, adjusting the threading mode of FG/OSG may also help: Howto:Activate multi core and multi GPU support.

You can put the following lines into a custom Fgfsrc file, pass them on the command line after "fgfs" (separated by spaces), or set the respective options in Fgrun or in one of FlightGear's XML configuration files.

Please make sure to rename/delete your autosave and fgfsrc files prior to trying these settings, or your local settings would possibly override these settings.
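
Renaming the existing settings files can be scripted. The following Python sketch is an assumption-laden illustration: the helper name backup_settings is hypothetical, and the actual locations of your autosave and fgfsrc files depend on your platform and FlightGear version, so the paths have to be supplied by you.

```python
import shutil
from pathlib import Path


def backup_settings(paths, suffix=".bak"):
    """Rename each existing settings file by appending 'suffix'.

    paths -- iterable of file paths (autosave and fgfsrc files)
    Returns the list of new paths; files that do not exist are skipped.
    """
    moved = []
    for path in map(Path, paths):
        if path.is_file():
            target = path.with_name(path.name + suffix)
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

For example, on a Linux system one might call backup_settings with the paths of the fgfsrc file in the home directory and the autosave file in the FlightGear home directory (both locations are assumptions; check where your installation keeps them). To restore the old settings later, simply rename the files back.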

# --ignore-autosave # uncomment this for FlightGear versions >= 2.99
--airport=ksfo
--offset-distance=4000
--offset-azimuth=90
--altitude=500
--heading=0
--model-hz=60
--disable-random-objects
--prop:/sim/rendering/texture-compression=off
--prop:/sim/rendering/quality-level=0
--prop:/sim/rendering/shaders/quality-level=0
--disable-ai-traffic
--prop:/sim/ai/enabled=0
--aircraft=ufo
--disable-sound
--prop:/sim/rendering/random-vegetation=0
--prop:/sim/rendering/random-buildings=0
--disable-specular-highlight
--disable-ai-models
--disable-clouds
--disable-clouds3d
# --disable-textures
--fog-fastest
--visibility=5000
--disable-distance-attenuation
--disable-enhanced-lighting
--disable-real-weather-fetch
--prop:/sim/rendering/particles=0
--prop:/sim/rendering/multi-sample-buffers=1
--prop:/sim/rendering/multi-samples=2
Note  Beginning with FlightGear 3.1+, you can also toggle individual scenegraph traversal masks on/off (these can be changed at runtime using the Property browser):
  • --prop:browser=/sim/rendering/draw-mask
  • --prop:/sim/rendering/draw-mask/terrain=0
  • --prop:/sim/rendering/draw-mask/aircraft=0
  • --prop:/sim/rendering/draw-mask/models=0
  • --prop:/sim/rendering/draw-mask/clouds=0
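
To avoid retyping the profile, the options can be kept in one place and turned into either fgfsrc content or a command line. The following Python sketch shows one way to do that; it uses a subset of the options listed above, the helper names are hypothetical, and the executable name fgfs is an assumption about your installation.

```python
import shlex

# Subset of the minimal startup profile from this article.
MINIMAL_PROFILE = [
    "--airport=ksfo",
    "--aircraft=ufo",
    "--disable-sound",
    "--disable-clouds",
    "--disable-clouds3d",
    "--disable-ai-traffic",
    "--disable-random-objects",
    "--prop:/sim/rendering/shaders/quality-level=0",
    "--prop:/sim/rendering/random-vegetation=0",
    "--prop:/sim/rendering/particles=0",
]


def to_fgfsrc(options):
    """Return fgfsrc file content: one option per line."""
    return "\n".join(options) + "\n"


def to_command_line(options, executable="fgfs"):
    """Return a shell command line with space-separated, quoted options."""
    return " ".join([executable] + [shlex.quote(opt) for opt in options])


if __name__ == "__main__":
    print(to_command_line(MINIMAL_PROFILE))
```

Writing the output of to_fgfsrc to your fgfsrc file, or pasting the output of to_command_line into a terminal, should be equivalent ways of applying the same profile.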

Using these settings, frame rates between 400-500 fps and a steady frame spacing (latency) of 3-11 ms, can be achieved on a notebook from 2007 (1024x768). Disabling wireframe mode lowers the framerate by about 50-80 fps. The window should be up and running (fully initialized) within 3-5 seconds, even on moderately old hardware (absent a navcache rebuild).

FlightGear 2.10 bare bones on 2006-era hardware (NVIDIA 7600GO)
FG 2.12 zero eyecandy mode on 2009-era hardware (NVIDIA 260M)
FG 2.12 minimal eye candy mode on 2009-era hardware (NVIDIA 260M)

As can be seen, additional resources can be freed by flying in areas without scenery (ocean tiles), by using simple aircraft like the ufo, and by using a HUD instead of a 2D/3D cockpit panel.


Further assistance

If you need help with the whole process, please get in touch via the FG forums.

When doing that, please fully document the whole process, i.e. including:

  • startup settings (and fgfsrc if used)
  • error messages and warnings (console output)
  • screen shots
  • screen shots showing frame rate, frame spacing, performance monitor, osg on-screen stats (see Howto:Use the system monitor )
  • use the about dialog to provide other important information

Reporting incompatible default settings

Once you have located and identified settings that seem to cause reproducible issues with your hardware, it would be a good idea to report these settings using the issue tracker, so that these can hopefully be made optional in the upcoming release: http://flightgear-bugs.googlecode.com/

Debugging Segfaults & Obtaining Backtraces

Note  This section is currently Linux/gdb specific. Any help in updating it to become less platform/OS/debugger-specific would be greatly appreciated!

A segfault (short for "segmentation fault") is a particular kind of crash, where the program is doing things that it wasn't designed to do, such as accessing invalid memory, so that the operating system terminates the process because its behavior is no longer valid/correct. These segfaults will typically happen due to coding errors, either in FlightGear or one of its dependencies (libraries like SimGear, plib, OpenSceneGraph or OpenAL).

A typical segfault may look like this:

Initializing splash screen
Splash screen progress init
Tungsten Graphics, Inc.
Mesa DRI R200 (RV250 4C66) x86/MMX/SSE2 TCL DRI2
1.3 Mesa 9.1.1

Program received signal SIGSEGV, Segmentation fault.
0xb71467e6 in __strlen_sse2_bsf () from /usr/lib/libc.so.6

The last two lines are basically showing the actual crash, anything shown earlier is normally just an indicator how far the code could proceed shortly before the crash actually occurred.
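
When going through long console logs, the crash lines can be picked out automatically. The following Python sketch assumes the gdb output format shown above; the function name find_crash and the returned dictionary keys are hypothetical.

```python
import re

# "Program received signal SIGSEGV, Segmentation fault."
SIGNAL_RE = re.compile(r"^Program received signal (\w+), (.+?)\.$")
# "0xb71467e6 in __strlen_sse2_bsf () from /usr/lib/libc.so.6"
CRASH_FRAME_RE = re.compile(r"^(0x[0-9a-f]+) in (\S+) \(.*\) from (\S+)$")


def find_crash(log_text):
    """Return details of the first crash found in gdb console output.

    The result contains the signal name, its description, the non-empty
    lines printed before the crash and, if present, the crashing function
    and library. Returns None if no signal line is found.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = SIGNAL_RE.match(line.strip())
        if not match:
            continue
        result = {
            "signal": match.group(1),
            "description": match.group(2),
            "preceding": [l for l in lines[:i] if l.strip()],
        }
        if i + 1 < len(lines):
            frame = CRASH_FRAME_RE.match(lines[i + 1].strip())
            if frame:
                result["function"] = frame.group(2)
                result["library"] = frame.group(3)
        return result
    return None


SAMPLE_LOG = """Initializing splash screen
Splash screen progress init

Program received signal SIGSEGV, Segmentation fault.
0xb71467e6 in __strlen_sse2_bsf () from /usr/lib/libc.so.6
"""

if __name__ == "__main__":
    print(find_crash(SAMPLE_LOG))
```

The "preceding" lines correspond to the progress indicators mentioned above, i.e. how far startup got before the crash.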

In order to see what was going on when the crash occurred, you'll need to build a binary with debugging symbols enabled/included, usually this includes FlightGear and SimGear - and possibly even OSG (OpenSceneGraph, normally only if your crash happens inside rendering related code).

A so called backtrace is basically pretty much like a flight data recorder tape/log that is used to conduct a post-crash analysis.

A binary with debugging symbols included will contain additional information that helps developers track down what was causing a certain crash, such as for example the file name, function, line number, variable that was accessed before the crash happened. This is analogous to how a FDR/CVR on an airplane records certain parameters like position, orientation, altitude, velocities, configuration, cockpit conversations etc. This can also be used by developers to see who is responsible for a certain segfault/bug, i.e. which commit introduced the bug.

Debugging FlightGear crashes is significantly simplified by having access to such runtime data, because we can basically go back in time and see what the FlightGear program was doing shortly before the crash.

Unfortunately, logging all this data by default has a certain run-time overhead and requires additional tools and knowledge, so it is usually too costly and not done. For the sake of better performance, FlightGear will by default be heavily optimized by your compiler and will not include code for logging important debugging/post-crash parameters. However, as developers, we do appreciate all efforts to provide meaningful bug reports. A reproducible bug including a good crash analysis, i.e. a so called backtrace, is as good as it gets, and the likelihood of a certain bug being examined and fixed is significantly higher when the bug report includes both a backtrace and a set of steps to reproduce the error reliably.

A good way to provide a reproducible test case is to use the Replay/flight recorder system to create a flight recorder tape that triggers the crash. Alternatively, consider providing a "flight plan" that triggers the crash, ideally without it being aircraft specific, i.e. reproducible using aircraft like the ufo.

Thus, you'll want to reconfigure your SimGear/FlightGear ($SG_SRC & $FG_SRC) source trees using the -DCMAKE_BUILD_TYPE=Debug switch; for details please refer to Building using CMake#Debug_Builds. It is a good idea not to touch your existing build trees for this, but instead to create an additional directory hierarchy for your debugging binary; please see Building using CMake#Multiple build directories for details.
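
As a rough illustration of the configure step, the following Python sketch assembles the cmake invocation for a debug build. CMake cache variables are set with the -D prefix; the helper name cmake_debug_command is hypothetical, and the source directory is whatever $SG_SRC or $FG_SRC points to on your system.

```python
def cmake_debug_command(source_dir, extra_defs=None):
    """Return the cmake argument list for configuring a debug build.

    source_dir -- path to the SimGear or FlightGear source tree
    extra_defs -- optional dict of additional -D definitions
    The command is meant to be run from inside a separate build directory.
    """
    command = ["cmake", "-DCMAKE_BUILD_TYPE=Debug"]
    for key, value in sorted((extra_defs or {}).items()):
        command.append("-D{}={}".format(key, value))
    command.append(source_dir)
    return command


if __name__ == "__main__":
    # Hypothetical source location, for illustration only.
    print(" ".join(cmake_debug_command("/path/to/flightgear-source")))
```

Running the resulting command in a fresh, empty directory keeps the debug build separate from your regular build tree.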

Once you have rebuilt and relinked SimGear and FlightGear, you'll want to use a debugger like gdb to run your new binary. It does help to have a good way to reproduce a crash, such as using certain startup/runtime settings. For the sake of simplicity it is usually a good idea to disable all unrelated features and subsystems/settings, this includes complex aircraft and complex scenery locations (airports) if possible. For details, refer to the minimal startup profile detailed in this article.

  1. For gdb to be available, you'll normally have to use your package manager (apt, yum, yast etc) and install the "gdb" package first.
  2. Next, navigate to the build directory where your fgfs debug binary is located.
  3. Then run gdb.
  4. In the gdb shell, specify the file to be used via file src/Main/fgfs
  5. This will preload the binary and its dependencies.
  6. To actually run the file, use the run command.
  7. You'll normally want to pass arguments right behind run, especially the mandatory ones (i.e. to specify $FG_ROOT).
  8. Creating a simple .fgfsrc file helps to keep the command line short.
  9. Once the segfault/crash occurs, run backtrace (with bt being the shorter equivalent).
  10. Only if that isn't conclusive, use thread apply all bt full instead to get a full backtrace for all threads (background tasks).
  11. Please use the issue tracker to post your backtrace.
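
The interactive steps above can also be run non-interactively. The following Python sketch builds a gdb command line using gdb's -batch, -ex and --args options, which run the listed commands in order and then exit; the helper name gdb_backtrace_command is hypothetical, and the binary path and fgfs arguments are placeholders for your own setup.

```python
def gdb_backtrace_command(binary, fgfs_args, all_threads=False):
    """Return an argument list that runs fgfs under gdb and prints a backtrace.

    binary      -- path to the fgfs debug binary, e.g. src/Main/fgfs
    fgfs_args   -- list of fgfs options (e.g. the minimal startup profile)
    all_threads -- if True, use 'thread apply all bt full' instead of 'bt'
    """
    backtrace = "thread apply all bt full" if all_threads else "bt"
    return (["gdb", "-batch", "-ex", "run", "-ex", backtrace,
             "--args", binary] + list(fgfs_args))


if __name__ == "__main__":
    # The data directory below is a placeholder, not a real location.
    command = gdb_backtrace_command(
        "src/Main/fgfs", ["--fg-root=/path/to/fgdata", "--aircraft=ufo"])
    print(command)
```

The list can be passed to subprocess.run, with the output redirected to a file that is then attached to the bug report.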

This is what a typical backtrace may look like:

(gdb) thread apply all bt

Thread 1 (Thread 0xb66c4740 (LWP 10908)):
#0  0xb71467e6 in __strlen_sse2_bsf () from /usr/lib/libc.so.6
#1  0x0886c080 in copy_string (s=s@entry=0x0)
    at /home/cesar/compilation/pkg/simgear/2.10.0/src/simgear-2.10.0/simgear/props/props.cxx:161
#2  0x08870550 in set_string (val=0x0, this=0x8dd55b8)
    at /home/cesar/compilation/pkg/simgear/2.10.0/src/simgear-2.10.0/simgear/props/props.cxx:524
#3  SGPropertyNode::setStringValue (this=0x8dd55b8, value=0x0)
    at /home/cesar/compilation/pkg/simgear/2.10.0/src/simgear-2.10.0/simgear/props/props.cxx:1587
#4  0x08326ae1 in (anonymous namespace)::GeneralInitOperation::run (this=0x8bdea48, gc=0x8be4310)
    at /home/cesar/compilation/pkg/flightgear/2.10.0/src/flightgear-2.10.0/src/GUI/gui.cxx:123
#5  0x085c2b7b in flightgear::GraphicsContextOperation::operator() (this=0x8bdea48, gc=0x8be4310)
    at /home/cesar/compilation/pkg/flightgear/2.10.0/src/flightgear-2.10.0/src/Viewer/WindowSystemAdapter.cxx:41
#6  0xb78ca246 in osg::GraphicsOperation::operator()(osg::Object*) () from /usr/lib/libosg.so.80
#7  0xb78c7784 in osg::GraphicsContext::runOperations() () from /usr/lib/libosg.so.80
#8  0xb7b29fe2 in osgViewer::ViewerBase::renderingTraversals() () from /usr/lib/libosgViewer.so.80
#9  0xb7b2707c in osgViewer::ViewerBase::frame(double) () from /usr/lib/libosgViewer.so.80
#10 0x085c4841 in fgOSMainLoop ()
    at /home/cesar/compilation/pkg/flightgear/2.10.0/src/flightgear-2.10.0/src/Viewer/fg_os_osgviewer.cxx:286
#11 0x082221fa in fgMainInit (argc=32, argv=0xbffff494)
    at /home/cesar/compilation/pkg/flightgear/2.10.0/src/flightgear-2.10.0/src/Main/main.cxx:339
#12 0x081e71ee in main (argc=32, argv=0xbffff494)
    at /home/cesar/compilation/pkg/flightgear/2.10.0/src/flightgear-2.10.0/src/Main/bootstrap.cxx:251

As can be seen, the first few lines show exactly what happened and where it happened, while all the subsequent lines show the call chain, i.e. how the piece of code got triggered. Reading from the bottom up: FlightGear started in bootstrap.cxx line 251, then called into main.cxx (line 339), at which point the osgviewer setup took place. Next, there are a few frames (#6 to #9) that do not contain a lot of interesting info, because those are all library calls into OSG. Things start to be interesting again around frame #5, where FlightGear specific code is getting called again. In this particular example, the actual segfault was triggered in a SimGear operation (frames #1 to #3), which was due to a NULL pointer (value=0x0).
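
Separating the frames with source information from plain library frames can be automated. The following Python sketch assumes the gdb backtrace layout shown above (a "#N" line, optionally followed by an indented "at file:line" line); the function names and dictionary keys are hypothetical.

```python
import re

# "#1  0x0886c080 in copy_string (s=...)" or "#3  SGPropertyNode::... (...)"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(.+?) \(")
LIBRARY_RE = re.compile(r" from (\S+)$")


def parse_frames(backtrace_text):
    """Parse gdb backtrace text into a list of frame dictionaries."""
    frames = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            library = LIBRARY_RE.search(line)
            frames.append({
                "index": int(match.group(1)),
                "function": match.group(2),
                "source": None,
                "library": library.group(1) if library else None,
            })
        elif frames and line.strip().startswith("at "):
            # Continuation line carrying "file:line" for the previous frame.
            frames[-1]["source"] = line.strip()[3:]
    return frames


def first_frame_with_source(frames):
    """Return the innermost frame that has file/line information, or None."""
    for frame in frames:
        if frame["source"]:
            return frame
    return None


SAMPLE_BACKTRACE = """#0  0xb71467e6 in __strlen_sse2_bsf () from /usr/lib/libc.so.6
#1  0x0886c080 in copy_string (s=s@entry=0x0)
    at /home/cesar/compilation/pkg/simgear/2.10.0/src/simgear-2.10.0/simgear/props/props.cxx:161
#6  0xb78ca246 in osg::GraphicsOperation::operator()(osg::Object*) () from /usr/lib/libosg.so.80
"""

if __name__ == "__main__":
    print(first_frame_with_source(parse_frames(SAMPLE_BACKTRACE)))
```

The innermost frame with source information is usually the best starting point when reading a backtrace, because it is the closest piece of code built with debugging symbols.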

At first, this may look a bit confusing and even intimidating, but it does make sense to people familiar with C++ and FlightGear/SimGear - so even if you cannot understand a single thing, just post such output on the issue tracker, so that others can have a look. We'll ask for additional information if necessary.

However, at times it may be nearly impossible to see what actually goes on in FG these days, simply because there are too many disparate parts involved. For example, in the rare instance that our Nasal scripting interpreter leads to a segfault, we could potentially see where we were in the C code using a debugger like gdb, but that wouldn't tell us where we were in the Nasal code.

And that similarly applies to many things, especially things that "jump" boundaries here. Another prime example is property tree listeners - Nasal listeners don't have a backtrace beyond the C code, even if it is triggered directly from a Nasal setprop()/.setValue(), because it passes through C and C++ stack frames and then back into Nasal.

Thus, valgrind/gdb/cachegrind/perftools may be too low-level to be meaningful in most cases, i.e. just look at what's going on in the scenery/aircraft department, here just identifying "hot spots" in terms of code/functions would be a red herring, because the real culprit is typically not at all in the run-time code, but the underlying scenegraph data, such as highly complex textures, 3D models or terrain/scenery.

This can now be easily verified by using Zakalawe's new draw-masks to switch off aircraft/scenery/clouds or AI model rendering.

Having access to gdb/valgrind etc is one thing; being able to actually run them and interpret the results is a completely different thing.

Thus, it is more promising -in the long term- to optionally expose certain info at run-time, so that users can access it to see how their aircraft/scenery/feature/configuration behaves in comparison to some other aircraft/scenery/feature/configuration.

Also, when it comes to using tools like gdb, and especially valgrind, the main challenge is that FlightGear is primarily a GUI application, and it doesn't lend itself to being debugged, profiled or leak-tested in an isolated fashion at all.

For example, for valgrind runs to be useful and straightforward, we would need to be able to minimize what's done inside fg_init.cxx so that we can disable things not needed, but also even run FlightGear in a "headless" mode without a visible GUI window, so that such tests could be run directly on the build server as part of a regression testing suite. Otherwise, even just running fgfs in valgrind for 10 minutes sim time may turn out to take hours due to the nature of valgrind, i.e. it being an emulator.

valgrind et al. are great if you know how to use them, but for end users these tools are too fine-grained. Ideally, we would support some per-subsystem or per-feature granularity for end users.

Related