Understanding Rembrandt

From FlightGear wiki
Revision as of 17:40, 10 March 2016 by Bugman (talk | contribs) (Switch to {{flightgear url}} to fix the broken Gitorious link.)

Background

Many of us are wondering why Rembrandt is so slow for us, despite having fairly powerful computers.


there’s plenty of other people with reasonable hardware who have problems with Rembrandt performance - me for example :) And while that may be related to settings, being a lay Mac user I don’t like solutions which require extensive configuration to give acceptable performance.
— James Turner (Nov 27th, 2013). Re: [Flightgear-devel] Rendering strategies.
with an i3770K and a GTX670, I get some hit from ALS (10-30%) but Rembrandt instantly drops me to 20fps, and < 10fps if I use an aircraft I actually want to fly (777 or Citation) and go to any major airport (EGKK, EHAM, EDDM, EDDF, EGLC, VHHH). This is at 2560x1600, but on the 670 I would be highly surprised if I'm fill-rate limited, given that AA is off, and the general suboptimal size of our primitive batches. Emilian has explained on IRC this might be due to the out-of-the-box / default config for Rembrandt being highly suboptimal, which I didn't yet evaluate; I would be delighted to have it more usable. I'm going to test further over the weekend.
— James Turner (Jun 20th, 2013). Re: [Flightgear-devel] reminder: entering feature freeze now.

Objective

Try to better understand, and document, Rembrandt internals, so that we can better troubleshoot performance issues.

OK, I just tried Rembrandt again (after spending 5 minutes reading the wiki article), and while my computer is much less powerful than yours, I am also getting roughly 15 fps at KSFO with the ufo. Looking at the OSG stats, there is obviously a lot going on: we have 10 different cameras, 3 of them extremely busy (0, 1 and 5).

The performance is actually decent, though ALS with all goodies maxed out gets 25 fps for the same scene under Linux (there's the grass shader needing lots of noise calls for instance). I don't get a huge green bar.

The big hit comes when I try to see Las Vegas (with the Urban shader) - that drives me down to 3 fps. Or when I try to activate filtering and switch it to 3 - then my framerate likewise dives down to 4.

So I remember how this went - the base performance of Rembrandt without shadows was actually pretty decent on my box under Windows, 30 fps or so. Switching shadows on cost me some of that but was still flyable, but the shadows were flickering so much that I got a headache after 5 minutes, so I needed to switch the filtering to max. to be able to look at it - and that killed the framerate for good.

If this is something that we really want to investigate more closely, I guess it would be a good idea to read the "deferred rendering" paper that Fred linked to in the article - at least those parts describing the 3 cameras that seem really busy (geometry/shadows/lighting)

Scene Complexity

Purely from a troubleshooting standpoint, I would like to know what kind of impact we can expect from discarding vertices/triangles and quads from all three cameras (having 10 fps even at night time seems very odd), i.e. whether discarding those translates into any proportional/tangible performance gains.

Actually, the base idea of deferred rendering is that it should be pretty insensitive to the amount of vertices you feed to it because it really has a minimal geometry shader (computationally cheaper than default even, it basically only notes where stuff is on the screen and stores the non-projected position in a buffer) and all the actual work of lighting etc. is done in the fragment pipelines. So I'd be very surprised if it responds at all to changes of the vertex load.

I sort of see this on my card - if I'm fragment-limited, it switches to synchronized frame rates: I get either 25 or 30 or 60 fps, but not 33 or 47. It's completely different when the vertex shader jams; then I get to see arbitrary numbers. That makes for a neat first-look diagnostic. Rembrandt is clearly fragment-dominated on my box.

The thing is, it only takes a few errors in the C++ code to massively inflate the number of primitives sent to effects/shaders. And Rembrandt is obviously not well understood now that Fred is not currently maintaining it. So there might be some low-hanging fruit there, but I am not going to spend hours going through the code unless I see tangible results.
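The "synchronized frame rate" heuristic above can be turned into a quick check. The following is only a sketch under stated assumptions: it assumes vsync is active, so that a frame which misses a refresh waits for the next one and the frame rate lands on the refresh rate divided by a small integer. The function name, the 60 Hz default and the 3% tolerance are illustrative and not part of FlightGear. Which steps actually appear depends on the display's refresh rate and the driver.

```cpp
#include <cassert>
#include <cmath>

// Returns true if `fps` sits close to refreshHz / n for some small integer n,
// i.e. the frame rate looks locked to the display refresh (the "synchronized"
// pattern described in the text). Defaults are illustrative assumptions.
bool looksVsyncQuantized(double fps, double refreshHz = 60.0, double tolerance = 0.03)
{
    assert(fps > 0.0 && refreshHz > 0.0);
    for (int n = 1; n <= 8; ++n) {
        const double step = refreshHz / n;          // 60, 30, 20, 15, ... at 60 Hz
        if (std::fabs(fps - step) <= tolerance * step)
            return true;
    }
    return false;                                   // "arbitrary" value such as 33 or 47
}
```

With the defaults, 30 and 60 fps are reported as quantized while 33 and 47 fps are not; a reading of 25 fps only fits the pattern for other refresh rates (for example 75 Hz divided by 3).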

As far as I understand the wiki article, most stages are XML configurable, so we can probably customize things a bit there, or even disable certain cameras/stages - which would make sense to see if each camera's stage looks sane.

For starters, I would probably start up at night time above sea - i.e. a "minimal startup profile" - and see what Rembrandt is doing then in each stage. The number of vertices etc. should be fairly minimal then, shouldn't it?

OK, when going to zero-scenery places, I am getting rock-solid 60 fps/25ms here (daytime), with Rembrandt running with aircraft shadows, even with maxed out settings. Can we work with that? What about you? I remember your "orbitview" (?) project where you placed a huge sphere into the scenery. Could this help us do some troubleshooting, i.e. using Nasal to place a few models (and possibly light sources) and see what has an impact?

The other thing I noticed is that CPU load doesn't seem to decrease even when AGL/ASL altitude is too high to realistically cast shadows. Could this be something we add to the effects/shaders to reduce the Rembrandt workload a bit?

Reducing Complexity?

What would be involved in editing effects/shaders to simply discard 50% of all vertices? I just want to see for myself whether that has an effect here or not.

In a vertex shader it's fairly difficult to do. To effectively discard a vertex, you need to evaluate some criterion based on its coordinates/attributes and then, if that criterion is true, move it out of the view frustum and return, so you run a 'minimal' set of operations.

The obvious thing to do if you want to test the response to vertex numbers is to set visibility lower, so that terrain simply does not reach the vertex shader at all, in a controllable way. A theoretically elegant way, if you can, is to set random numbers as vertex attributes and to move the vertex out of the view frustum if the random number is smaller than a threshold. But in passing attributes, you're of course changing the pipeline in a substantial way...
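The per-vertex test described above can be modelled on the CPU to see what it does. This is a sketch, not FlightGear or Rembrandt code: in a real vertex shader the random value would arrive as a per-vertex attribute and the result would be written to gl_Position. The names `cullOrKeep`, `randomAttr` and `threshold` are hypothetical.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;   // clip-space position (x, y, z, w)

// Model of the vertex-shader test: if the per-vertex random attribute is below
// the threshold, return a position outside the clip volume, otherwise keep it.
Vec4 cullOrKeep(const Vec4& clipPos, float randomAttr, float threshold)
{
    assert(threshold >= 0.0f && threshold <= 1.0f);
    if (randomAttr < threshold) {
        // x/w = 2 lies outside the [-1, 1] range, so this vertex is off-screen.
        return Vec4{2.0f, 2.0f, 2.0f, 1.0f};
    }
    return clipPos;
}

// OpenGL clip test: a point is inside when -w <= x, y, z <= w.
bool insideClipVolume(const Vec4& p)
{
    const float w = p[3];
    return -w <= p[0] && p[0] <= w
        && -w <= p[1] && p[1] <= w
        && -w <= p[2] && p[2] <= w;
}
```

One caveat with this approach: a triangle only disappears completely when all of its vertices are moved out together. If the random attribute differs between the vertices of one triangle, the triangle is stretched and clipped rather than removed, so the attribute would have to be constant per primitive for a clean test.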


If you want to test scaling of fragment shaders, that's much easier to do - evaluate another criterion in the first line which is true half of the time (say whether you're in the right half of the screen, you can test against gl_FragCoord which is the fragment position on the screen) and then insert a discard; if you want to dump the fragment without computing anything.
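To make the fragment-side test concrete, here is a small CPU model of it, kept in C++ so it can be run without a GL context. It is a sketch under assumptions: `viewportWidth` stands in for a uniform that would have to be passed to the shader, and both function names are hypothetical.

```cpp
#include <cassert>

// GLSL equivalent (sketch), placed as the first line of the fragment shader's main():
//     if (gl_FragCoord.x > 0.5 * viewportWidth) discard;
bool shouldDiscard(float fragCoordX, float viewportWidth)
{
    return fragCoordX > 0.5f * viewportWidth;
}

// Counts how many fragments of a full-screen pass would still be shaded.
long shadedFragments(int width, int height)
{
    assert(width > 0 && height > 0);
    long shaded = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // gl_FragCoord addresses pixel centres, hence the +0.5 offset.
            if (!shouldDiscard(static_cast<float>(x) + 0.5f, static_cast<float>(width)))
                ++shaded;
        }
    }
    return shaded;
}
```

For a 1920x1080 viewport this leaves exactly half of the fragments (960 per row). If the frame time drops by roughly half as well, the pass is fragment-bound; if it barely moves, the bottleneck is elsewhere.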

Profiling

I am going to check what the C++ runtime profile looks like in comparison to the classical renderer.

Okay, I left Rembrandt running for 10 minutes with the profiler enabled via fgcommand("profiler-start"). For some reason, the profile showed that "osgParticles" were eating up /some/ resources despite not being enabled. I tried to explicitly disable them, but that did not change anything, so I removed the corresponding subsystem from src/Main/fg_init.cxx, which gave me +8 fps:

diff --git a/src/Main/fg_init.cxx b/src/Main/fg_init.cxx
index 86494da..04d6f42 100644
--- a/src/Main/fg_init.cxx
+++ b/src/Main/fg_init.cxx
@@ -635,9 +635,11 @@ void fgCreateSubsystems(bool duringReset) {
     // Initialize the scenery management subsystem.
     ////////////////////////////////////////////////////////////////////
 
+#if 0
     globals->get_scenery()->get_scene_graph()
         ->addChild(simgear::Particles::getCommonRoot());
     simgear::GlobalParticleCallback::setSwitch(fgGetNode("/sim/rendering/particles", true));
+#endif
 
     ////////////////////////////////////////////////////////////////////
     // Initialize the flight model subsystem.
@@ -1077,8 +1079,10 @@ void fgStartNewReset()
     
     simgear::clearEffectCache();
     simgear::SGModelLib::resetPropertyRoot();
-        
+
+#if 0        
     simgear::GlobalParticleCallback::setSwitch(NULL);
+#endif
     
     globals->resetPropertyRoot();
     fgInitConfig(0, NULL, true);

Shadows at night time

Well, something seems a bit odd there, because Rembrandt is doing shadows, right? And when I switch to night time mode, I am still just getting 10-15 fps and seeing similar subsystem/osg activity.

As far as I know it's doing shadows at night just as well. If you drive your ufo to a Rembrandt-defined light source, I think it will cast a shadow also at night. I don't think the shadow part is off just because the sun is down.

Wrong, Rembrandt is not doing shadows at night time: 
https://sourceforge.net/p/flightgear/flightgear/ci/5ccc83566785c9b5b75e8d03579dbd1aa45d7237/tree/src/Viewer/renderer.cxx#l938

The conceptual beauty of shadows in Rembrandt is that they're not faked: there's an actual physics computation of where light in the scene reaches and where it doesn't. The downside is that unless there's a huge amount of filtering going on, that computation suffers so much from limited numerical accuracy that it flickers all over the place.

The thing about Rembrandt and light sources is true, I remember seeing screen shots - but I cannot imagine that the amount of computation for a handful of airport lights should equal that of having a "central" light source illuminating everything (aka the sun)?

I invite you to start FlightGear, enable the Rembrandt shadows and look at your framerate,
then disable Rembrandt shadows and look again at your framerate.
Conclusion: how many FPS do the Rembrandt shadows cost?
(here the answer is 3~5 FPS, so Rembrandt shadows are not the "FPS killer" thing)
It depends on your graphics card and a lot on the aircraft. A well-made aircraft that is not split into many submodels and objects shows a significantly smaller impact than other aircraft with shadows.
Also, the shadow distance defined with the cascades has a big influence. It is true that shadow rendering in Rembrandt is not the main reason for the comparably lower fps,
but it can still have a big impact depending on graphics card, aircraft models and scenery complexity.
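When comparing the "shadows on" and "shadows off" readings, it can help to convert the frame rates into frame times, because the same FPS difference means very different costs at different baselines. The helper names below are illustrative only:

```cpp
#include <cassert>

// Frame time in milliseconds for a given frame rate.
double frameTimeMs(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Extra time per frame attributable to a feature, given the frame rate
// measured without it and with it (e.g. Rembrandt shadows off vs. on).
double featureCostMs(double fpsWithout, double fpsWith)
{
    return frameTimeMs(fpsWith) - frameTimeMs(fpsWithout);
}
```

A drop of 5 fps from 60 to 55 costs about 1.5 ms per frame, whereas the same 5 fps drop from 15 to 10 costs about 33 ms per frame. The numbers are examples, not measurements from the discussion.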


Basically, it seems there's no "LOD for shadows" taking place, i.e. computations are heavy/complex despite there not being any light sources in your vicinity that could realistically have an effect. I would have expected the corresponding shaders/effects to look specifically for light sources so that the computations kick in only then, unless Rembrandt is even doing moonlight shadows?

The shadow cascades act like a "shadows LOD":
you can tweak this setting at runtime and see how the Rembrandt shadows are updated.
After that you will conclude that "LOD for shadows" is in fact implemented.
Also, those cascades have an influence on fps, since it is the number of objects, not vertices, that influences performance in Rembrandt.
The more objects you have, the lower your frame rate. With increasing cascade distance you get more objects to draw in scenery/aircraft, and hence fewer fps.

Zero Scenery Tests

Using --aircraft=ufo --enable-rembrandt --prop:/sim/rendering/shadows/map-size=8192 --prop:/sim/rendering/shadows/num-cascades=4 I can even get way beyond 60 fps when there's no scenery to be displayed. It would be interesting to check what Fred did there, i.e. whether there are heuristics in place to recognize this. We would probably want to add a few static models to the scenery and see what the performance impact is like.

I think that through roughly a dozen different test cases like these, one could incrementally understand Rembrandt and its stages. Obviously, one would now need to edit some of the XML files and maybe some effects/shaders to see how things are affected.


Internals

Rembrandt has GLSL components (all the *-gbuffer shaders are Rembrandt GLSL code), but Rembrandt needs some additional 'infrastructure' set up - the buffers it renders into being the most important, and a special definition of the sequence of rendering. Such buffers don't feature in a fixed pipeline rendering or in forward rendering. I believe they could theoretically all be set up at runtime as well. I don't understand this too well, but I think it's a bit akin to the reset problem in FlightGear - seems easy, but there's a lot of small print to consider. It's much easier to create the infrastructure at startup time.
— Thorsten (Sun Oct 19). Re: Orbital Makes the Sky Black.
Internally, a Rembrandt buffer is not much different from any other RTT context - Canvas is all about rendering to a dynamic texture and updating it dynamically by modifying a sub-tree in the property tree - but its primary primitives are 1) osgText, 2) shivaVG/OpenVG paths, 3) static raster images, 4) groups/maps - none of these would be particularly useful in this context. But Zan's newcamera work could be turned into a new "CanvasCamera" element to allow camera views to be rendered to a Canvas, including not just scenery views - but also individual rendering stages. Canvas itself maintains an FBO for each texture, which is also the mechanism in use by Rembrandt. Tim's CameraGroup code is designed such that it does expose a bunch of windowing-related attributes to the property tree - equally, our view manager is property-controlled.


For a WIP-intro, refer to: The_FlightGear_Rendering_Pipeline


— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.


it's all the init/setup code that is hard-coded, and which used to contain a few hard-coded shaders - those are basically different rendering buffers that are chained together to set up a deferred rendering pipeline - this isn't done in a "plug & play" fashion currently - exposing this to XML/property tree space would be a huge undertaking probably - Zan's "newcamera" work really is the best match here - and RTT/buffer management is exactly what Canvas is already doing under the hood.

Thus, each RTT buffer could simply be a Canvas texture internally - so that all the hard-coded Rembrandt logic could be maintained more easily at some point. It is indeed lack of consistency and integration that is the main challenge here - because all of these features were developed at a different point in time, and people were usually only interested in making one thing work, instead of unifying those solutions (effects + newcamera branch + Canvas). And it is indeed a lot of work to do this properly - a unified approach takes a lot of time and energy.


— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.
Rembrandt pre-dates the whole reset/re-init effort by several years, and while SGSubsystem does provide the corresponding interfaces to be implemented by each subsystem to handle simulator resets, our rendering system isn't a conventional SGSubsystem - equally, all the CameraGroup stuff has become fairly massive meanwhile.

It is definitely possible to implement dynamic reset/re-init even for the renderer, including all buffers and windows/views - Zan's work still is the most promising effort in this department. But that, too, predates the whole Canvas effort. Like wlbragg said: we don't necessarily need to use a lot of dedicated C++ support code to implement alternate rendering schemes like Rembrandt; the main hooks required to support arbitrary -and fully dynamic- schemes are already in place.


— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.
Internally, FG doesn't usually use the C/C++ APIs for GLSL directly, but uses the OSG abstraction layers instead (which are fairly well documented, even for people new to shaders).

The "obscure" parts of Rembrandt are not necessarily its effects or shaders, but the underlying C++ code which sets up all the buffers and sequencing. Once that is either documented or exposed, it is foreseeable that deferred rendering will be resurrected again, even if FredB should still not be around - such an effort would not need to involve Zan's work or Canvas, but it would be the most logical step for the time being, absent some other overlapping development effort.


— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.
It basically works like this:


  • FlightGear contains an effects subsystem that integrates shaders + properties and materials (=textures)
  • the effects subsystem is using the OSG abstraction layer for GLSL like you say
  • FlightGear itself has all the viewer/renderer logic in $FG_SRC/Viewer
  • Rembrandt needs buffers set up, initialized and updated beyond what's possible via fgdata-space additions, such as Nasal, effects, shaders
  • this is why FredB took the existing "fixed pipeline" code and adapted it to set up a bunch of buffers for all deferred rendering stages
  • and then, a few hard-coded shaders are/were associated with some buffers
  • incrementally, Fred then started to adopt the effects subsystem to move hard-coded shaders out of C++ into $FG_ROOT (fgdata)
  • thus, Rembrandt does indeed use effects and shaders primarily now - but all the setup/init and update logic still resides in C++
  • many of the performance implications associated with Rembrandt were linked to its C++ code
  • people primarily doing fgdata-level effects/shader development are thus in an exceptionally bad situation to help improve/maintain Rembrandt, because the whole integration is not sufficiently documented, and it can really only be changed via C++ additions.
  • we've seen other C++ patches rejected, despite the corresponding subsystems having active/involved maintainers
  • several core developers have repeatedly stated that they won't target/support Rembrandt due to its known performance implications
  • in its current form, the setup/init logic isn't designed to be dynamically adjusted, i.e. many attributes are not yet exposed to properties/listeners, or only read during startup
  • this is a common trait of C++ code that didn't quite progress beyond "prototyping" - it took Zakalawe almost 2 years to implement reset/re-init support in its current form

— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.
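The point in the list above about attributes that are "only read during startup" versus attributes exposed to properties/listeners can be illustrated with a toy example. This is deliberately not the SimGear property API: `IntProperty`, `StartupOnlyShadowMap` and `LiveShadowMap` are hypothetical names invented for this sketch, and the shadow map size is just a convenient stand-in for any renderer setting.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a property node that notifies listeners on change.
class IntProperty {
public:
    explicit IntProperty(int v) : value_(v) {}
    int get() const { return value_; }
    void set(int v)
    {
        value_ = v;
        for (auto& cb : listeners_) cb(v);   // notify everyone who subscribed
    }
    void addListener(std::function<void(int)> cb) { listeners_.push_back(std::move(cb)); }
private:
    int value_;
    std::vector<std::function<void(int)>> listeners_;
};

// Reads the setting once at construction: later changes are never seen.
struct StartupOnlyShadowMap {
    int size;
    explicit StartupOnlyShadowMap(const IntProperty& p) : size(p.get()) {}
};

// Subscribes to the setting: the stored size follows the property at runtime.
struct LiveShadowMap {
    int size;
    explicit LiveShadowMap(IntProperty& p) : size(p.get())
    {
        p.addListener([this](int v) { size = v; });
    }
    LiveShadowMap(const LiveShadowMap&) = delete;             // the listener captures `this`
    LiveShadowMap& operator=(const LiveShadowMap&) = delete;
};
```

Changing the property after construction updates only the listener-based object. In a real renderer the listener would also have to reallocate the affected buffers, which is where most of the actual work would be.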
Both Fred and Mathias seemed pretty eager to adopt Zan's work back then:

https://www.mail-archive.com/flightgear ... 36481.html
https://www.mail-archive.com/flightgear ... 36486.html

Meanwhile, we ended up with Canvas as an abstraction mechanism for FBO management via properties - so integrating Canvas would indeed be a logical choice, unrelated to any particular manifestation like ALS or Rembrandt - integrating these technologies would primarily mean that new features could be prototyped without necessarily having to customize the hard-coded renderer logic - including things like our hard-coded skydome for example, which could be implemented in fgdata space then - which would not just be relevant for efforts like Earthview (orbital flight), but also make other things possible that would currently require a fair amount of tinkering with the C++ code.


— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.
I am not talking about Rembrandt and/or ALS in particular here - I am just seeing the main challenge being the lack of accessibility when it comes to required structural changes to the C++ code - regardless of the concrete renderer - the lack of Rembrandt maintenance, and the slow response whenever ALS requires C++ level changes, is primarily because the corresponding renderer code is not being maintained actively - moving this into fgdata space via effects and shaders is a logical thing to do, and will allow people like Thorsten (or yourself) to make corresponding modifications without facing a core development bottleneck when it comes to Rembrandt/FGRenderer or any other $FG_SRC/Viewer modifications.


The CameraGroup.cxx file is basically begging to be refactored sooner or later. None of this needs to involve Canvas; it would just be a straightforward and generic approach to do so, but certainly not mandatory - Zan's original work was implemented directly using XML and the property tree - however, Canvas contains a few helpers to make this increasingly straightforward, requiring very little in terms of code (e.g. PropertyBasedElement as a container for subsystems implemented on top of the property tree).


— Hooray (Sun Oct 19). Re: Orbital Makes the Sky Black.