Howto:Use the system monitor: Difference between revisions

From FlightGear wiki
Jump to navigation Jump to search
m (→‎The system monitor: + messy quotes that should be merged/rewritten into a single message probably)
(Disabled some "Caution" text that i think was distracting from the main content.)
 
(3 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Portability Navbar}}
{{Portability Navbar}}


== Profiling ==
<!--
For developers, and people building from source, FlightGear versions >= 2.9 provide real profiling support using Google PerfTools, for details please see [[Built-in Profiler]].
== Terms ==
{{FGCquote
|1= those systems are not "processes", but just "loops" - i.e. whatever the thing is called in the list, you should append "update-loop" to it - and some names are still misleading (events=Nasal).
|2= {{cite web
  | url    = {{forum url|p=264426}}
  | title  = <nowiki>Re: Su-15</nowiki>
  | author = <nowiki>Hooray</nowiki>
  | date  = Nov 13th, 2015
  }}
}}
{{FGCquote
|1= FG is basically using cooperative multi-tasking-imagine it like a team of people (15+), all of whom need to finish their work and hand it over to the next person, so that a single frame can be produced - with some people (subsystems) being allowed to redo/check their work several times in a row (fdm, autopilot), while others will only get a single opportunity to update their internal state.  
 
Finally, it is this "team of people" that will produce a single frame - so whenever "someone" is too slow, it will end up causing stuttering. No matter the reason, scenery/terrain, Nasal etc - at some point, another subsystem will have to wait, which means fewer frames created per second.
|2= {{cite web
  | url    = {{forum url|p=264313}}
  | title  = <nowiki>Re: Su-15</nowiki>
  | author = <nowiki>Hooray</nowiki>
  | date  = Nov 13th, 2015
  }}
}}
-->


== The system monitor ==
== The system monitor ==
Line 8: Line 29:
[[File:System-performance-monitor-fg260.png|400px|thumb|Screen shot showing the system performance monitor available in FG 2.6.0+]]
[[File:System-performance-monitor-fg260.png|400px|thumb|Screen shot showing the system performance monitor available in FG 2.6.0+]]


<!--
{{Caution|
{{Caution|
{{FGCquote
{{FGCquote
|1= the built-in performance monitor, and especially the GUI (it being used in PUI/PLIB)- is adding to the performance overhead you're seeing
|1= the built-in performance monitor, and especially the GUI (it being used in PUI/PLIB)- is adding to the performance overhead you're seeing
|2= {{cite web
|2= {{cite web
   | url    = http://forum.flightgear.org/viewtopic.php?p=243288#p243288
   | url    = {{forum url|p=243288}}
   | title  = <nowiki>Re: FlightGear permanently damages the performance of comput</nowiki>
   | title  = <nowiki>Re: FlightGear permanently damages the performance of comput</nowiki>
   | author = <nowiki>Hooray</nowiki>
   | author = <nowiki>Hooray</nowiki>
Line 21: Line 43:
|1= you will want to use the http/telnet interfaces to watch the performance stats, because the GUI dialog using PUI/Nasal is dead-slow, and even adding to the overhead you are seeing
|1= you will want to use the http/telnet interfaces to watch the performance stats, because the GUI dialog using PUI/Nasal is dead-slow, and even adding to the overhead you are seeing
|2= {{cite web
|2= {{cite web
   | url    = http://forum.flightgear.org/viewtopic.php?p=264313#p264313
   | url    = {{forum url|p=264313}}
   | title  = <nowiki>Re: Su-15</nowiki>
   | title  = <nowiki>Re: Su-15</nowiki>
   | author = <nowiki>Hooray</nowiki>
   | author = <nowiki>Hooray</nowiki>
Line 29: Line 51:
|1= "events" is the subsystem that triggers/runs Nasal timers, and any Nasal code (anywhere), may trigger the GC (garbage collector) - to see if that's the case, you can use the "GC tracker" patch in the sigar code we have.
|1= "events" is the subsystem that triggers/runs Nasal timers, and any Nasal code (anywhere), may trigger the GC (garbage collector) - to see if that's the case, you can use the "GC tracker" patch in the sigar code we have.
|2= {{cite web
|2= {{cite web
   | url    = http://forum.flightgear.org/viewtopic.php?p=258607#p258607
   | url    = {{forum url|p=258607}}
   | title  = <nowiki>Re: How to tell if you are CPU or GPU limited (split)</nowiki>
   | title  = <nowiki>Re: How to tell if you are CPU or GPU limited (split)</nowiki>
   | author = <nowiki>Hooray</nowiki>
   | author = <nowiki>Hooray</nowiki>
Line 46: Line 68:
Note that you can also use the http (browser) or telnet protocols to do this while extending the logging protocol
Note that you can also use the http (browser) or telnet protocols to do this while extending the logging protocol
|2= {{cite web
|2= {{cite web
   | url    = http://forum.flightgear.org/viewtopic.php?p=258579#p258579
   | url    = {{forum url|p=258579}}
   | title  = <nowiki>Re: How to tell if you are CPU or GPU limited (split)</nowiki>
   | title  = <nowiki>Re: How to tell if you are CPU or GPU limited (split)</nowiki>
   | author = <nowiki>Hooray</nowiki>
   | author = <nowiki>Hooray</nowiki>
Line 53: Line 75:
}}
}}
}}
}}
 
-->


In FlightGear >= 2.6.0, use the menu option '''Debug''' => '''Monitor System Performance''' to analyze stutters.
In FlightGear >= 2.6.0, use the menu option '''Debug''' => '''Monitor System Performance''' to analyze stutters.
Line 69: Line 91:


So, frame spacing is a much better property to judge visual quality than just watching fps.
So, frame spacing is a much better property to judge visual quality than just watching fps.
The new GUI dialog is available in the menu: "'''Debug'''" => "'''Monitor system performance'''".


* Only statistics on our "subsystems" are shown. That's the FG modules running in our single-threaded main loop. Everything outside the main loop isn't shown, i.e. there are no statistics on the GPU or OSG rendering threads (use the OSG statistics screen for these).
* Only statistics on our "subsystems" are shown. That's the FG modules running in our single-threaded main loop. Everything outside the main loop isn't shown, i.e. there are no statistics on the GPU or OSG rendering threads (use the OSG statistics screen for these).
Line 94: Line 112:


== Discussed Enhancements ==
== Discussed Enhancements ==
* decouple Nasal events from the "events" subsystem and use a new "events-nasal" instead [http://flightgear.org/forums/viewtopic.php?f=46&t=17069&p=164715#p164715]
* decouple Nasal events from the "events" subsystem and use a new "events-nasal" instead {{forum link|p=164715}}
* show average frame-spacing effect per subsystem as a percentage
* show average frame-spacing effect per subsystem as a percentage
* show memory consumption per subsystem (overload the new operator)
* show memory consumption per subsystem (overload the new operator)
Line 117: Line 135:


In short: if the framerate '''varies along with the Frame Delay''', then it's a graphics performance issue. If framerate is pretty stable, and Frame Delay '''varies more or less wildly''' I'd suspect some other subsystem being the culprit and causing more or less periodic delays as it needs to do it's stuff.
In short: if the framerate '''varies along with the Frame Delay''', then it's a graphics performance issue. If framerate is pretty stable, and Frame Delay '''varies more or less wildly''' I'd suspect some other subsystem being the culprit and causing more or less periodic delays as it needs to do it's stuff.
== Profiling ==
For developers, and people building from source, FlightGear versions >= 2.9 provide real profiling support using Google PerfTools, for details please see [[Built-in Profiler]].


= Interpreting the OSG on screen stats =
= Interpreting the OSG on screen stats =
[[File:Fgviewer-with-osg-stats-on-intel-gma-netbook.png|thumb|Screenshot showing fgviewer (late 2015) with osg on screen stats]]
{{Main article|OSG on screen stats}}
 
{{Caution|
{{FGCquote
  |FPS is not a good performance metric. If you have vsync turned on in your<br/>
driver, a very tiny change in the time of an operation can make a big fps<br/>
difference. Also, everyone has different hardware... It's more useful to<br/>
look at the OSG statistics (choose "Cycle On-Screen Statistics" twice) and<br/>
report the times reported for each traversal. Or include a screenshot of<br/>
them.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/26628520/
    |title=<nowiki>Re: [Flightgear-devel] Multithreading pattern and frame rate</nowiki>
    |author=<nowiki>Tim Moore</nowiki>
    |date=<nowiki>2010-11-20</nowiki>
  }}
}}
}}
 
[[Image:OSG-OnScreen-Stats.png|thumb|400px|Screenshot showing the OSG on screen stats]]
 
You can also try to play with osg's frame statistics, you can switch that on from the debug menu.
 
{{FGCquote
  |The stats in "camera" table are for objects that have survived culling and<br/>
are passed on to OpenGL. Vertices should be the sum of all the vertices<br/>
contained in the primitives. I believe that "triangles" refers to the total<br/>
number of triangles and not triangle sets.<br/>
<br/>
The table on the right contains stats for the whole scene graph that is<br/>
paged in. It is expensive to gather those statistics; you may notice a<br/>
slowdown when you enable them.
  |{{cite web |url=http://sourceforge.net/p/flightgear/mailman/message/33042445/
    |title=<nowiki>Re: [Flightgear-devel] Building of framerate doom</nowiki>
    |author=<nowiki>Tim Moore</nowiki>
    |date=<nowiki>2014-11-15</nowiki>
  }}
}}
 
Being CPU and draw limited means the yellow bar is long. And the blue one grows when cull happens to be a problem.
 
A simple help is to switch on the onscreen stats in the viewer or flightgear  and see if the yellow bar is longer than the orange one. A yellow bar longer than the orange one means that the cpu is not able to saturate the gpu.
 
Again, beware this does in no way mean that you should write complicated shaders to saturate the gpu! This rather means that the geometry setup/state change time - the yellow one - must decrease!
 
Once your orange bar is longer than the yellow one, you can start to care for the shaders and their execution. When thinking about shaders, keep in mind that different GPU's perform very different on the same shader.
 
How to achieve that is a very long chain of right decisions. Starting with the
modeler:
 
Avoid having leaf geometry with very small parts. Collapse as much triangles  as sensible for culling into the same leaf geometry. Sure if geometry needs a
different texture or anything that requires a different OpenGL state you cannot collapse this into the same leaf. May be it makes sense then to collapse the
textures into a common one. Then you might be able to collapse more geometry.
 
Avoid animations that are not really required. If you need a transform node in between some geometry, the geometry below the transform needs to be drawn
seperate which takes time on the cpu. Everything that needs geometry to be drawn seperately should be avoided to that level where it stops to make sense
because of culling.
 
May be introduce some level of detail. Not just avoid the whole model in some distance, but also provide a static model without animations, simple geometry
for the mid range. May be provide some more culling friendly and detaild with the animations you need for the short range.
 
Keep in mind that culling a model that you are close to should make you split the model into more parts that could possibly be culled away. But for culling
a far away model is probably very sufficient to cull it either as a whole or not.
 
Avoid state changes. Use as much common state as possible. Osg does a good job in sorting together draws using the same state, but if the overall scene
contains plenty of different state osg cannot change that.A new texture is a new state for example. A new shader is a new state. ....
 
Once your orange bar is longer than the yellow one, you can start to care for the shaders and their execution.
When thinking about shaders, keep in mind that different GPU's perform very different on the same shader.
 
Appart from OpenGL we spend a lot of time in scenegraph traversal. This is mostly due to plenty of structural and often needless nodes in the scenegraph.
The reason or this is that historically the xml files did some implicit grouping for *every* animation in the xml file. To make that reilably work Mathias
had to introduce a huge amount of group nodes in the xml file loader. These really hurt today as they introduce a whole lot of group nodes that just have
a single child which need to be traversed for update and for cull.
 
There's also the long-term plan to flatten the LOD quadtrees and transforms of the tiles. Each tile will get some top-level LOD groups for all objects (shared and static). I'm hoping in combination with the LOD-scale function in OSG, this will mean we can automatically drop random tress / building  and STG objects to keep frame-rate up. (as an optional mode of course!) - we hope to simply get less tiles and models present by a better lod structure.
 
The trick is to minimize the number of LOD nodes introduced into the scene graph while avoiding popping effects - flattening the transforms is good to do. But some of them are critical to stay like the are or at least similar. The first one that positions objects with respect to the earth centered coordinate system for precision reasons. Also these coordinate systems coudl for drawing reasons aligned in any direction. But as long as we do simulations using the same tree and geometry alignments that we do for draw this still interferes with the bounding volumes we have for ground intersections. And this axis aligned bounding box  implementation gains a lot by having the boxes horizontally aligned. Todays transforms are done so that huger things are aligned with the horizont which serves this purpose.
 
In [[FGViewer]] there's a PagedLOD whole world database tree running. This is similar to osg's marble dataset but carefully designed around our tile structure. Using this I think we can really replace a lot of our fine structured scenery tiles with something more croase that is used for tiles more far away.
Drawback with our current codebase: Our integration of the particle systems need to be rethought as this contains geometry with culling disabled which
makes a pagedlod just never expire. Switching the particle systems off works pretty good so far.
 
Another goal is to add more node bits (and a GUI dialog to control them) so various categories of objects can be disabled during the update pass. This will
mean the direct hit of, say, AI models vs particles vs random trees can be measured. Of course it won't account for resources (memory, textures) burned by
such things, but would still help different people identify slowness on their setups.
 
Profiling shows that Group::traverse is the most used function in flightgear. The lowest hanging fruit could be to optimize away the redundant nodes from
the top level model that gets loaded by a database pager request. We cannot do that recursively once a model part is loaded since the mentioned grouping
nodes might be referenced in any level in the model hierarchy above the  currently loaded model. So only the top level model could do this without
braking tons of models.
 
At the highest level:
 
* Update: cpu processing your application is performing
* Cull: cpu processing performed by osg in traversing the scene graph and packaging drawing state
* Draw: cpu processing for sending drawing state to gpu and gpu processing of the drawing state
 
If '''Update''' stage is large, you need to determine what part of your application is taking the time and optimize it.
 
If '''Cull''' stage is large the scene hierarchy is too complex and needs to be optimized - spatial optimization, flatten static hierarchies, ...
'''Cull''' times can be improved by splitting the Cull operations over concurrent threads. In the case of Flightgear you can do this by using the following command-line parameter:
 
<code>--prop:/sim/rendering/multithreading-mode=CullThreadPerCameraDrawThreadPerContext</code>
 
See: [[Multi core and Multi GPU support]]
 
 
If '''Draw''' stage is large then you need to determine if you are geometry (drawable or vertex) or pixel limited. If you grab the window corner and shrink/expand it's size
and the performance increases/decreases significantly, then you are pixel/fill limited (too much depth complexity/too much pixel processing).
If the performance doesn't change then the most likely cause is scene complexity at the object level, '''drawables''' in the osg stats table. This has a far greater impact on Draw (and as stated above on Cull) than the number of vertices in the scene. This impact is such that a scene with '''double the vertex count''', but '''half the drawable count''' of a '''less detailed scene''' renders '''faster'''. This is caused by the fact that for sending each '''drawable''' the OpenGL state machine needs to be reset. Vertex count starts having an impact only for extreme values (millions of vertices; this however doesn't mean you should ignore the possible performance impact caused by it)
 
''In short, if you want to improve the detail of a scene without causing a big performance drop, make sure you have as little detailed object-material(texture) pairs as needed. Ideally you would merge all meshes that are opaque, share the same material, and are not animated into a single mesh and if needed use a single texture, same for transparent mesh parts that are not animated (they should be split from opaque parts), leaving as individual meshes only those parts that will have animations attached''
 
It's actually much more complex than this, but it's how you can start diagnosing performance issues.
 
Run your scene in osgViewer to see how fast it renders. You might have to build the scene in your regular app and then save the scene root to an .ive
file, which you can load into osgViewer.
 
A really good tool to use is an OpenGL command logger. It logs all the actual OpenGL commands that OSG is issuing and can help identify problem areas.
 
== References ==
* http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg35202.html
* http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg35917.html
 
 
[[Category:Howto]]
[[Category:Howto]]
[[Category:Scenery enhancement]]
[[Category:Scenery enhancement]]

Latest revision as of 07:13, 16 October 2020


The system monitor

Screen shot showing the system performance monitor available in FG 2.6.0+


In FlightGear >= 2.6.0, use the menu option Debug => Monitor System Performance to analyze stutters.

In a perfect world, min/max/mean values should be almost identical for every subsystem - and standard deviation should be almost 0.

Larger differences / high standard deviations result in a sloppy simulation and stuttering movements, though they'll hardly influence the frames-per-second value at all.

Also, imagine a system producing 100 frames in 0.5 seconds, then blocking completely for another 0.5 seconds: The fps display would still show "100fps", which seems great. But the 0.5 second stutter cause the visual performance to be terrible.

When evaluating simulation performance, don't get fooled by the frame rate. What's really important to us is the "worst case frame latency". Even if the system is producing a huge average of 100 frames per second, it can still look absolutely crappy, if only a single frame took more than 20-30ms, since that's immediately visible to the human eye. So, we're building a real-time system here, and 20-30ms is our timing constraint.

That's why it is preferable to display "(worst-case) frame spacing" instead of fps (View => Display Options => Show frame spacing). The frame spacing for the previous example would show "500ms", while a system producing 100 frames with perfectly even spacing would show "10ms".

So, frame spacing is a much better property to judge visual quality than just watching fps.

  • Only statistics on our "subsystems" are shown. That's the FG modules running in our single-threaded main loop. Everything outside the main loop isn't shown, i.e. there are no statistics on the GPU or OSG rendering threads (use the OSG statistics screen for these).
  • Title bar shows average and worst case frame rate. In a perfect world, all frames should be spaced evenly, so both values should be almost identical. When frames are produced unevenly, then a larger difference between the worst and average frame rate is visible. Needless to say, we should try to get FG to deliver frames evenly, so movements look smooth.
  • Main table shows statistics on the individual subsystems. Helps to identify how much time certain systems currently consume, or which are responsible for jitters resulting in uneven frame rates.
  • A bit background on the FG subsystems may be necessary though to really judge what's going on:
    • For example, you'll see the "nasal" subsystem consuming almost no time at all, so it looks great. However, almost all the nasal code runs in timers, and timers are driven by the "events" subsystem. So, to judge Nasal performance, you'll mainly need to look at "events" (and yes, you'll see time being consumed and jitters being produced there, which is related due to Nasal's current MARK/SWEEP Garbage Collector implementation, i.e. see: Improving Nasal).
    • the props (property tree) system make invoke Nasal callbacks via listeners
    • The "tile-manager" an "ai-model" subsystems may also contain time consumed by the Nasal engine, since each scenery tile and each AI/MP aircraft may trigger a Nasal script ("load"/"unload" hooks).
    • The GUI system is another subsystem which may trigger Nasal hooks
    • While it is true that the current Nasal mark/sweep garbage collector has been shown to cause occasional stuttering under certain circumstances, the system monitor in its current form is not suitable to tell you the difference between a badly coded "rogue" script causing stutter or the stuttering being really caused by the Nasal GC. Also, it is worth noting that following good coding practices may actually help decrease the overall pressure on the GC, i.e. by keeping the number of references to be traversed low and pretty constant, but also by not calling unneeded code too frequently using timers or listeners.
  • At the moment, it is unfortunately not yet possible to get timing statistics for individual Nasal scripts or sub modules, but this is something that has been requested by a number of users and so we are looking into making this possible eventually.
  • Statistics data is only collected while the dialog is open (or you manually set /sim/performance-monitor/enabled).

Anyway, this alone isn't improving anything yet, but having a GUI to conveniently monitor performance will hopefully help us to see where exactly we're having issues - and help us to improve...

Discussed Enhancements

  • decouple Nasal events from the "events" subsystem and use a new "events-nasal" instead [2] This is a link to the FlightGear forum.
  • show average frame-spacing effect per subsystem as a percentage
  • show memory consumption per subsystem (overload the new operator)
  • plot graphs for the various metrics using the new Canvas system
  • track Nasal GC pressure (invocations) and total memory allocated [3]

Debugging stuttering

To put it simple, vertex shader performance is affected by the amount of detail visible in the geometry of scene (number of vertices visible), and of course shader complexity itself, while fragment performance is affected by screen resolution(or window size if you run it windowed) and shader complexity as well. They both affect framerate in the end, since the frame needs to be finished before being put up for display. Frame delay might be affected by a lot of other stuff, but it's a better measure of smoothness since it displays the "longest" time it had to wait for a frame in a certain amount of time. There might be some performance lost in high detail scenes, due to the culling (visibility check, and hiding of faces facing away from the camera, or being behind other faces, or too small to display (openscenegraph is smart about that one and it won't "display" faces/parts of objects that would end up too small)) that needs to be done on the scene before it goes through the shaders.

To test if it's a graphics performance issue, you'd do a couple of tests.:

First load up the ufo, over detailed scenery, remain static, don't move the view around. Check framerate and frame delay. If they're consistent with each other it might be a graphics performance issue, if not you can be sure there's some other subsystem that does background processing and needs to finish its stuff.

Move the view around, do you get stutters? If you do, then there's some pressure on your video card's RAM. As it needs to load the new geometry and textures that come into view, it might not have enough space left, so it might need to offload the "old" ones. Even like this framerate should stabilise once you stop moving the view around, or once the rate of movement is slow enough so that the video card can keep up.

Fly straight and level, at reasonable speeds, keeping the view fixed. Do you get stutters? Is framerate consistent with frame delay? If it is, and you get low fps, again it might be a graphics issue, as new random objects come into view and/or the trees shader creates new ones in the distance. If not I's still suspect another susbsytem being the culprit.

You can repeat these with some aircraft, see if and how things change.

In short: if the framerate varies along with the Frame Delay, then it's a graphics performance issue. If framerate is pretty stable, and Frame Delay varies more or less wildly I'd suspect some other subsystem being the culprit and causing more or less periodic delays as it needs to do it's stuff.

Profiling

For developers, and people building from source, FlightGear versions >= 2.9 provide real profiling support using Google PerfTools, for details please see Built-in Profiler.

Interpreting the OSG on screen stats

1rightarrow.png See OSG on screen stats for the main article about this subject.