World Scenery 3.0 rendering

From FlightGear wiki

This page holds working notes on the ongoing World Scenery 3.0 (WS3) rendering implementation. It can be considered a scratchpad. If you would like to help with rendering, testing, scenery generation with VPB, or any other aspect of WS3, get in touch via the "fg-devel" mailing list.

Notes - short term features

Misc issues (near term):

  • Uniforms tile_width and tile_height sent from the CPU are the wrong way around. These could be packed into a vec2 called tile_size (tile_size.xy). They are currently copied into the global variable tile_size in landclass-search-functions.frag, and swapped manually.
  • terrain-overlay.eff is used in newer regional definitions instead of terrain-default. There's no way to tell FG to use the WS30 version - ws30-overlay.eff.
  • Roads: switching the materials ws30Freeway and ws30Road to road-*.eff causes roads not to render. Switching to lfeat.eff causes roads to render, even at ultra, but there are no moving cars etc. (lfeat.eff is derived from Runway.eff, which inherits from terrain and uses a runway shader at high settings). The terrain-default effect has the wrong texture and goes black in the distance. It is neither designed for roads nor the most efficient choice for them. The terrain fragment shaders have changed in recent versions, so at the ultra setting the effect is probably missing some uniform - e.g. xsize is used as a uniform (I can get roads to render by setting xsize to ysize, but there is an error in the console log about scaling the texture).
  • Likely no longer needed: the landclass texture dimensions in texels are useful - however textureSize() provides them. The question is whether some exotic compiler/GPU has issues with this - probably unlikely, given FG will eventually need OpenGL 4.x+ support. Querying textureSize() is likely fast, since the GPU must know the size to handle the texture, so the value should be readily available. So this is probably not needed.
  • There's some sort of landclass border artifact visible at long range on a 10 series GPU. It's mostly hidden by the haze, which is thicker than normal right now due to issues with tile model position. To see it clearly, remove haze and lighting via the control in the fragment shader.

Sending materials parameters to shaders as arrays of vec4s

v2, 19th Nov 2021.

Two ways of doing it: 1) each parameter as a separate array with a landclass index. 2) all material parameters per landclass in one block.

1) needs a function to look up non-vec4 parameters - it's possible to keep vec4 parameters as an array of vec4s, and pack single parameters in groups of 4 as a combined array. There needs to be a way to define, in an xml mapping file, which parameters are packed where in a combined vec4 array.

e.g. four materials parameters base_magnification_factor, use_overlay, overlay_magnification_factor, overlay_bias are packed into a suitably named combined array.

The alternative is to accept more occupancy and use an array with the relevant data type - e.g. floats.

2) has the advantage that accesses are in interleaved format, which is easier on caches, and that using vec4s reduces register pressure. 2) always needs a simple function or two to look values up.

e.g.

// Returns float number i from data packed four to a vec4.
float getFloatFromVec4ArrayData(int i)
{
    int n = i / 4;                     // which vec4 holds float i
    vec4 v4 = specularArray[n];
    int index_within_v4 = i - 4 * n;   // component 0..3 within that vec4
    return v4[index_within_v4];
}


// Returns floats i, i+1, i+2, i+3 as one vec4.
vec4 getVec4FromArrayData(int i)
{
    return vec4(getFloatFromVec4ArrayData(i), getFloatFromVec4ArrayData(i+1), getFloatFromVec4ArrayData(i+2), getFloatFromVec4ArrayData(i+3));
}

// Accessing parameters
vec4 specular = getVec4FromArrayData(size_of_parameter_block*landclass_index+17); // reads elements 17, 18, 19, 20 of the block
vec4 ambient = getVec4FromArrayData(size_of_parameter_block*landclass_index+21);
int use_overlay = int(getFloatFromVec4ArrayData(size_of_parameter_block*landclass_index+25));

2) needs a way to define a block of parameters per landclass in xml. e.g.

0: the number identifying the effect. There needs to be an XML file that assigns each effect a number, e.g. WS30.eff:0, agriculture:1, some future terrain effect: 3. This number will be used to turn different code blocks on/off.

1: base texture slot

2: base_texture_magnification parameter

3 to 6: specular rgba

[...]
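
The CPU side of option 2) could be sketched as follows. This is only an illustration in C++ under stated assumptions: the struct, the function names and the block size of 7 floats are hypothetical and model just the slots listed above, not the elided ones or the real block layout. The flat float buffer has the same memory layout as an array of vec4s once it is padded to a multiple of 4.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side mirror of one per-landclass parameter block.
// Only the slots listed above are modelled; the real block has more.
struct LandclassParams {
    float effectId;                   // slot 0
    float baseTextureSlot;            // slot 1
    float baseTextureMagnification;   // slot 2
    std::array<float, 4> specular;    // slots 3 to 6 (rgba)
};

// Floats per block in this sketch (slots 0 to 6).
constexpr std::size_t kBlockSize = 7;

// Flattens all blocks into one float buffer, padded with zeros to a multiple
// of 4 so that it can be uploaded as an array of vec4s.
std::vector<float> packParameterBlocks(const std::vector<LandclassParams>& landclasses)
{
    std::vector<float> data;
    data.reserve(landclasses.size() * kBlockSize + 3);
    for (const LandclassParams& p : landclasses) {
        data.push_back(p.effectId);
        data.push_back(p.baseTextureSlot);
        data.push_back(p.baseTextureMagnification);
        data.insert(data.end(), p.specular.begin(), p.specular.end());
    }
    while (data.size() % 4 != 0) data.push_back(0.0f);
    return data;
}

// Same addressing as the shader lookup: block start plus slot offset.
float getParameter(const std::vector<float>& data, std::size_t landclassIndex, std::size_t slot)
{
    const std::size_t i = kBlockSize * landclassIndex + slot;
    assert(slot < kBlockSize && i < data.size());
    return data[i];
}
```

With two landclasses the buffer holds 14 floats plus 2 floats of padding, i.e. four vec4s.
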
Sending of texture set data to shaders

v1, 13th Nov 2021.

I think the maximum number of texture sets used by regional definitions is 4 - e.g. DryCropPastureCover/DryCrop in regions global summer. So the maximum number of texture sets could be limited to save space. 4 sets, or a few more? Possible to extend the number later. e.g. base_tex1, base_tex2, base_tex3, base_tex4, overlay_tex1, overlay_tex2, overlay_tex3 ... For 6 textures that's 24 texture parameters per landclass. It may be possible to reduce this a bit, as things like dot textures probably don't feature in texture sets currently.

Roads

v2, 13th Nov 2021.

Eventually road shaders should have access to the underlying landclass - this could come directly from the tile's landclass texture if road meshes are separate per tile. Landclass info could also be embedded as a vertex attribute. Roads can look up the ground texture array using this info to create dirt concentrated towards road edges on country roads or less well maintained roads - for example dirt, mud, or sand depending on the surrounding terrain. It's also possible to just create some suitable dirt colours. The maintenance status of roads could come from an OSM2City heuristic - e.g. distance from cities/towns, lighting status, surrounding terrain. It's also possible for the shaders to guess maintenance status - e.g. well maintained roads when the surrounding countryside is urban or parkland, less well maintained roads when unlit, and dirt or mud when there is irrigated or dry crop agriculture. Examples from wikimedia commons: [1],[2][3][4][5]

Transitions

v1, 13th Nov 2021.

There are two ways of dealing with transitions:

1. Searching the landclass texture in a special pass either by faking a compute shader, or actually using a compute shader once FG moves to Vulkan (in the LTS after the next LTS which may be 2023 or 2024).

2. Searching the landclass texture for every pixel in the fragment shader to find any neighboring landclasses. This is implemented already.

1) Special pass to search the landclass texture

  • Faking a compute shader by rendering to a texture so the right values get written to the right place
  • Done once, done as each tile is loaded
  • Combined landclass and transitions texture. Minimum: 1 neighbor and 3 channels. With 5 channels: 1: landclass, 2: closest neighbor, 3: second closest neighbor, 4: mix factor between landclass and the two neighbors, 5: split between neighbor 1 and 2. May be possible to combine 1 and 2 in one channel by using 4 bits (16 levels each).
  • Drawback: landclass and transitions texture needs 3x-5x more VRAM. 3 more channels could be used to include some advanced transitions parameters.
  • Performance on older GPUs: Advantage of this depends on how fast 2) is on older systems once shaders have been fully ported - unknown currently. Tradeoff = more VRAM usage (older GPUs will have less max VRAM) versus less searching - however the caching of texture means lookups in the immediate vicinity are fast so searching isn't too bad.
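
Sharing one 8 bit channel between two 4 bit values (16 levels each), as mentioned in the list above, could look roughly like this. It is a minimal C++ sketch; the function names are made up, and which two quantities would share the channel is left open here.

```cpp
#include <cassert>
#include <cstdint>

// Packs two values in the range 0..15 into one 8 bit channel:
// 'high' goes into the upper 4 bits, 'low' into the lower 4 bits.
std::uint8_t packNibbles(unsigned high, unsigned low)
{
    assert(high < 16 && low < 16);
    return static_cast<std::uint8_t>((high << 4) | low);
}

unsigned unpackHigh(std::uint8_t packed) { return packed >> 4; }
unsigned unpackLow(std::uint8_t packed) { return packed & 0x0Fu; }

// Quantises a factor in [0, 1] to 16 levels (0..15), rounding to nearest.
unsigned quantise16(float factor)
{
    assert(factor >= 0.0f && factor <= 1.0f);
    return static_cast<unsigned>(factor * 15.0f + 0.5f);
}
```

The price is precision: each value only has 16 levels, so a mix factor stored this way changes in steps of 1/15.
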

2) Searching the landclass texture every pixel (fragment)

This is currently implemented.

There are small scale transitions to depixelate the landclass texture. There are large scale transitions between large/coarse landclass blobs.

Scenery generation

Higher landclass texture resolution means less pixelation.

Vector landclass data -> landclass raster conversion: smaller landclass blobs look better with transitions enabled, but small and contrasting blobs may highlight the pixelation without transitions.

Object placement based on masks

v1, 21 Nov 2021.

WS2 scheme

  • In WS2, objects like trees or farm buildings were placed based on object placement masks (textures). Different texture channels were used for different trees or scenery objects, and also controlled the rotation of objects.
  • Each landclass blob had its own set of ground texture coordinates that started from 0. These coordinates restarted after each landclass blob, so they didn't extend too far and cause precision issues.
  • C++ placement code looked at the ground texture coordinates, and the object maps, and then placed objects. This allowed things like trees to line up exactly with the agriculture field boundaries.
  • There are texture variant sets - C++ will randomly select different texture sets and their object placement masks.

WS30

  • The texture coordinates are created in a shader. Texture coordinates have to restart often to avoid precision issues.
  • The C++ placement code currently doesn't know the texture coordinates, or any detiling algorithms used for things like agriculture.
  • Texture variant selection is done on the GPU. Currently there isn't a mechanism to do it on the CPU and tell the GPU the result, e.g. by encoding it as a channel in the landclass texture, at the cost of increasing occupancy and putting pressure on VRAM (this is the type of thing that is suited to a fake compute shader pass - see the transitions section).

Solution

  • Reproduce the texture coordinates code in C++. Reproduce the texture variant selection code in C++.
  • Reproduce the texture detiling code in C++. Detile the masks with it. The detiling code's noise functions must be such that they produce same results on all CPUs and all GPUs - this may need using texture lookup noise for detiling and texture variant selection.
  • It's easiest to use exactly the same code as glsl for maintenance reasons. C++ headers like glm can recreate glsl structures, operator overloads, and built-ins so glsl code will compile in C++ : https://github.com/g-truc/glm
  • The texture lookup and detiling glsl code can be separated into functions, and put in a shader include file. This file needs to be copied manually every time it's changed - otherwise building simgear or flightgear becomes dependent on FGData. The file can be included in C++, and the functions called to look up the object masks correctly - the function interface can stay the same.
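
The idea of texture lookup noise that gives the same result everywhere could be sketched like this in C++. Everything here is an assumption for illustration: the 4x4 table stands in for a small noise texture, its values are arbitrary placeholders, and the function names are hypothetical. A shader would need to use the same unfiltered lookup (for example texelFetch) on the same texels to match.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a small noise texture: 4x4 texels with values 0..255.
// The values are arbitrary placeholders, not taken from FlightGear.
constexpr int kNoiseSize = 4;
constexpr std::array<std::uint8_t, kNoiseSize * kNoiseSize> kNoiseTexels = {
    13, 200, 77, 154,
    91, 34, 240, 5,
    180, 66, 121, 219,
    48, 143, 9, 102};

// Nearest-texel lookup with repeat wrapping. Only integer arithmetic is used,
// so the result does not depend on floating point behaviour.
std::uint8_t noiseLookup(int x, int y)
{
    const int wx = ((x % kNoiseSize) + kNoiseSize) % kNoiseSize;
    const int wy = ((y % kNoiseSize) + kNoiseSize) % kNoiseSize;
    return kNoiseTexels[static_cast<std::size_t>(wy * kNoiseSize + wx)];
}

// Picks a texture variant for a ground cell from the noise value.
int selectTextureVariant(int cellX, int cellY, int variantCount)
{
    assert(variantCount > 0);
    return noiseLookup(cellX, cellY) % variantCount;
}
```

Because the lookup works on integer cell coordinates, the placement code and the shader only have to agree on how a position is turned into a cell index.
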

Scenery size on server and client

v1, 13th Nov 2021.

It's possible to use different filenames to signify different formats for elevation and landclass texture data.

Terrain elevation data in the source DEMs are rounded off to 1m. So having more bits than that is overkill.

Terrain elevation option: 8 bit terrain elevation + offset stored in the file name - when the tile elevation difference is less than 255m. If this is done, these tiles should also be 8 bit in memory with an offset provided by a uniform - the vertex shader can translate different formats trivially. For tiles with elevation differences greater than 255m, splitting the elevation data into 2x2, 3x3 or 4x4 subtiles may work - the file names could indicate the subtile format and offsets. Sometimes terrain has a simple underlying slope with small deviations - it's also possible to specify two gradients - elevation (z) = tile offset + dx*x + dy*y + z_8bit(x,y). Offset, dx, and dy can be specified in the filename and the vertex shader can trivially apply this.
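
The offset-plus-gradients scheme can be sketched as below. This is a minimal C++ illustration, not a proposed file format: the struct and function names are hypothetical, the grid is assumed to be row-major, and dx and dy are assumed to be in m per grid step.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical header values that the text suggests storing in the file name.
struct TilePlane {
    float offset;  // tile offset in m
    float dx;      // gradient along x in m per grid step
    float dy;      // gradient along y in m per grid step
};

// Encodes a row-major height grid (m) as 8 bit residuals above the plane.
// Returns false if any residual falls outside 0..255 m, in which case the
// tile would need another format (for example subtiles).
bool encodeTile(const std::vector<float>& heights, int width, const TilePlane& plane,
                std::vector<std::uint8_t>& residuals)
{
    assert(width > 0 && heights.size() % static_cast<std::size_t>(width) == 0);
    residuals.clear();
    for (std::size_t i = 0; i < heights.size(); ++i) {
        const int x = static_cast<int>(i) % width;
        const int y = static_cast<int>(i) / width;
        const float r = std::round(heights[i] - (plane.offset + plane.dx * x + plane.dy * y));
        if (r < 0.0f || r > 255.0f) return false;
        residuals.push_back(static_cast<std::uint8_t>(r));
    }
    return true;
}

// elevation (z) = tile offset + dx*x + dy*y + z_8bit(x,y)
float decodeElevation(const std::vector<std::uint8_t>& residuals, int width,
                      const TilePlane& plane, int x, int y)
{
    return plane.offset + plane.dx * x + plane.dy * y
           + static_cast<float>(residuals[static_cast<std::size_t>(y * width + x)]);
}
```

The decode step is a handful of multiply-adds, which is why it would be cheap to apply in a vertex shader.
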

Client: a higher uncompressed scenery size means people run out of disk space and flush their TerraSync cache more often - this means higher TerraSync bandwidth. With SSDs now common, people's drives are smaller than they would be if non-SSD technology were dominant. Hybrid SSDs are a solution, but are uncommon.

Client: Non-SSDs can suffer from file fragmentation. IIRC FG can now read Tar files. So it may be simpler to leave the tar files in place for all scenery instead of extracting them.

VRAM occupancy

v2, 9th Dec 2021.

As of early Nov, WS3 takes up more VRAM than WS2.

Vertex data: 16 bit + offset if possible. Rounded off to 1m, this allows a 65km tile size. A 1x1 degree tile at the equator is ~111km, so this is an option for smaller tiles. The mesh points need to be rounded off to 1m, or to a fraction of 1m for tiles that are 32km, 16km, or 8km. That may need a VPB change. It may also be possible to round off vertices to 1m at runtime - but it may take slightly more CPU time, unless there's already an algorithm that processes a lot of vertices (e.g. tree or object placement) - in which case it could be added to that. Rounding vertices should probably not create holes, as the connectivity should stay intact.
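
The arithmetic behind the 16 bit + offset idea can be sketched as follows. This is an illustration in C++ with hypothetical function names, treating one coordinate axis at a time; the step is 1m for the largest tiles and a fraction of 1m for smaller ones.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Largest tile extent (m) that 16 bit values can span at a given step (m).
double maxExtentForStep(double stepM)
{
    return 65535.0 * stepM;
}

// Stores a position as a 16 bit count of steps from the tile origin.
std::uint16_t quantise(double positionM, double tileOriginM, double stepM)
{
    const double q = std::round((positionM - tileOriginM) / stepM);
    assert(q >= 0.0 && q <= 65535.0);
    return static_cast<std::uint16_t>(q);
}

// Recovers the rounded position; the origin plays the role of the offset.
double dequantise(std::uint16_t q, double tileOriginM, double stepM)
{
    return tileOriginM + q * stepM;
}
```

A 1m step spans about 65.5km and a 0.5m step about 32.8km, which matches the tile sizes mentioned above.
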

Landclass data: Should be reduced to greyscale. Currently the actual landclass is stored in the g channel. The r channel contains the original CORINE values. Update, Dec 2021: Fixed.

Tile LoD scheme - fine tuning

v1, 9th Dec 2021.

World Scenery 2

  • Scenery is split into roughly square tiles. Each tile is of the same size and the same detail.
  • Rendering draws tiles in a square around the camera influenced by LoD: bare. Visible tiles are prioritised.
  • At high altitude a large number of tiles are drawn, and in the scenegraph. These tiles are at full detail.

World Scenery 3.0:

  • Scenery is split into roughly square tiles. There are currently 7 LoD levels (0 to 6) with increasing detail. Tiles of each LoD level cover the entire world.
  • Rendering draws tiles from a mixture of LoD levels. The further away from the camera, the less detail the drawn tiles are (lower LoD level).
  • For tiles of differing LoD levels to fit next to each other without gaps, each tile contains a boundary with a special mesh, like an interface layer - the tile skirt.
  • The lower LoD tiles are also bigger and cover a larger area compared to the higher LoD tiles - it takes fewer of them to cover the globe. This reduces the number of separate tiles visible at high altitude, and helps prevent the scenegraph from growing huge.
  • As the camera comes closer, 1 tile of a given LoD level is replaced by 4 tiles of the next higher LoD level, e.g. one LoD 5 tile is replaced by four LoD 6 tiles. As the camera moves away, four LoD 6 tiles are replaced by one LoD 5 tile - and four LoD 5 tiles are replaced by one LoD 4 tile.

Scenery pop-in

  • The lower tile detail levels mean fewer vertices are visible when the camera is far away. It also means the shape of the terrain is more approximate - but since the terrain is further away it should be hard to notice - with some fine tuning. How noticeable it gets depends on how much of the camera's field of view the scenery takes up - and how much of the person's field of view the monitor takes up. As the camera gets closer, lower detail tiles are replaced by higher detail ones - this is called scenery pop-in.
  • The scenery pop-in is not visible if the colour of the space on the monitor taken up by tiles doesn't change. If the tile mesh is silhouetted against the sky - like a mountain ridge - then scenery pop-in is more noticeable as the colour of the background contrasts more. Similarly, if a ridge or a mountain close to the camera is of a different colour to the background created by a different landclass, water, or a city, then the pop-in is more noticeable.
  • If the colour of texturing changes slightly between different LoDs then it will also be noticeable - even if the terrain is perfectly flat. Pop-in will be more noticeable in regions where the colour naturally changes over a short space - less in deserts.
  • There's some colour change due to the landclass texture being stretched by different amounts for each tile LoD level. The size of the texture remains the same, but the tile sizes change. It's technically possible to look up landclasses and textures of a neighbouring LoD level and blend, but this needs 2 extra texture lookups with maybe the landclass transition also being done twice - and is likely not worth it.
  • The stress test for scenery pop-in would be in a region with tall, complex, sharp mountains and ridges - like the Himalayas - so each tile LoD level contains large differences in elevation. If there is also colour variation over short ranges, it will help.

Tile loading

  • Currently (Nov 2021) tiles are loaded in a square around the camera influenced by LoD bare. The VPB tile manager now loads the tiles.
  • New tile LoDs take a while to load. Currently tile loading is easily seen as colours change for new tiles due to placeholder texture coordinates. To see the effect - try reloading scenery and watch as LoD levels are loaded from the lowest to the highest detail. It's also possible to rapidly go from high altitude to a distant spot and stop, with the UFO - and then watch tile loading. Colours change due to landclass textures between different tiles not matching as they are stretched over different sized tiles.
  • If tile loading is slow, it is possible to load the lowest detail tiles first and then worry about the rest - which is what the VPB tile manager seems to be doing (??). I'm not sure loading tiles around the craft is that slow even on worst-case systems (?), but loading the lowest detail and every other level at startup might be.

Tile loading prioritisation

  • Ideally tiles in front of the camera should be prioritised. Ideally the first tile LoD levels to be loaded should be the ones that would be selected based on distance from the camera at the time of loading. The 2nd pass of tiles to be loaded should be based on the direction the craft is moving. The 2nd LoD to be loaded should be a higher detail in the direction ahead of the craft, and a lower detail behind the craft. And so on for subsequent passes.
  • Tile loading priority depends on the altitude, how fast the craft is moving, and whether it's likely to change direction - based on speed, flight history, or craft type (e.g. airliner compared to acrobatic craft). How soon new LoDs are needed depends on how large the tiles are. At high altitude, huge, low-detail tiles like LoDs 0/1/2 don't need to be loaded quickly even when flying fast. For fast moving craft at low altitude, small and detailed tiles like LoDs 6/5/4 need to be loaded fast, e.g. for a supersonic jet doing a low pass. If the craft is changing direction rapidly at high speeds, like when doing acrobatics, it's harder to predict. Loading time for acrobatics at slower speeds doesn't matter much, as there will be plenty of time to load terrain.
  • For a lot of craft and flights it's safe to never load LoDs that are too different from the current requirements. Only tile LoDs close to the current altitude will be needed on a typical flight. This could reduce scenegraph nodes, and VRAM occupancy. For example, a glider is unlikely to ever need 1x1 degree LoD 0 tiles - and a single prop craft is unlikely to need LoD levels that are too different. By contrast, a larger margin is needed for the Space Shuttle/SRBs, which can quickly climb to orbit.
  • Starting on the ground is probably a special case. Only LoDs that are in view are needed for a while. It would benefit development/testing if sim startup is fast, as well as being more pleasant to use. Maybe a dev mode that starts up after only loading the tile under the craft would be useful - often people are iterating on non-scenery things, or on scenery effects/materials on scenery tiles which can be placed close to the camera.
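
One possible way to express "tiles ahead of the craft first" is a cost that scales distance by the direction of travel. The C++ sketch below is purely illustrative and not how the VPB tile manager works: the types, the flat local coordinate frame and the aheadBias tuning parameter are all made up, and altitude, speed and craft type are not modelled.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec2 { double x, y; };

// Hypothetical load request: tile centre in a local flat frame (m).
struct TileRequest {
    int id;
    Vec2 centre;
};

// Smaller value = load sooner. Tiles ahead of the direction of travel get
// their distance scaled down, tiles behind get it scaled up.
// 'aheadBias' in [0, 1) is a made-up tuning parameter.
double loadCost(const TileRequest& tile, Vec2 camera, Vec2 heading, double aheadBias)
{
    const double dx = tile.centre.x - camera.x;
    const double dy = tile.centre.y - camera.y;
    const double dist = std::sqrt(dx * dx + dy * dy);
    const double headingLen = std::sqrt(heading.x * heading.x + heading.y * heading.y);
    if (dist == 0.0 || headingLen == 0.0) return dist;  // no direction information
    const double cosAngle = (dx * heading.x + dy * heading.y) / (dist * headingLen);
    return dist * (1.0 - aheadBias * cosAngle);
}

// Orders the requests so that the cheapest (most urgent) tile comes first.
void sortByLoadPriority(std::vector<TileRequest>& tiles, Vec2 camera, Vec2 heading,
                        double aheadBias)
{
    std::stable_sort(tiles.begin(), tiles.end(),
        [&](const TileRequest& a, const TileRequest& b) {
            return loadCost(a, camera, heading, aheadBias)
                 < loadCost(b, camera, heading, aheadBias);
        });
}
```

With a bias of 0.5, a tile 2km ahead is loaded before a tile 1km behind. A stationary craft (zero heading vector) falls back to plain distance ordering.
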

Tile LoD algorithm

Profiling

The worst case systems are likely to be old laptops, and tile LoD schemes need to be verified on those systems.

Reducing number of tiles

Reducing the number of tiles reduces the CPU bottleneck from the OSG scenegraph traversal. Even if only a small number of tiles are being drawn, the presence of tiles of other LoDs which are loaded but not needed could cause cull traversal to be very slow. RAM/VRAM occupancy is also reduced - see the section on VRAM occupancy. See the mailing list for a way of skipping OSG by using a couple of large vertex buffers for each tile LoD, and using an alternative landclass storage in memory.

Size of most detailed tiles: doubling the size of the LoD 6 tile reduces tile count by just over 4x. It means the LoD 6 tiles will be the same size as the current LoD 5 tiles, so one LoD level is no longer needed.
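
The "just over 4x" figure can be checked with a short count. The C++ sketch below assumes, as described in the LoD scheme above, that each level splits every tile of the previous level into four, and counts tiles relative to a single LoD 0 tile; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of tiles at one LoD level, relative to a single LoD 0 tile,
// assuming every level splits each tile of the previous level into four.
std::uint64_t tilesAtLevel(int level)
{
    assert(level >= 0 && level < 32);
    return std::uint64_t{1} << (2 * level);  // 4^level
}

// Total number of tiles over levels 0..maxLevel.
std::uint64_t tilesUpToLevel(int maxLevel)
{
    std::uint64_t total = 0;
    for (int level = 0; level <= maxLevel; ++level) total += tilesAtLevel(level);
    return total;
}
```

Levels 0 to 6 give 5461 tiles per LoD 0 tile and levels 0 to 5 give 1365, a ratio of about 4.0007.
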

Increasing the size of the most detailed tiles means increasing the size of the landclass texture. It's likely the small size of the landclass textures means all of the texture gets cached, which allows doing lots of landclass lookups faster. However, caching is designed so that local searching is fast. The landclass occupancy is also small - 1 channel 256x128 texture = 4 channel 64x32 texture. So it may not affect performance much. It's possible that increasing the landclass texture size makes landclass searches slower - this should be verified on older systems. It's possible to double the size of LoD 6, and skip LoD 5 - so LoD 4 to LoD 0 use the same size of texture. However, the scenery pop-in from LoD 6 to 4 will be big and probably noticeable.

Tile LoD selection scheme

To do



Shader info

v1, 13th Nov 2021.

Uniforms

The following Uniforms are generated by VPBTechnique itself, and do not need to be defined in ws30.eff:

  • uniform int tile_level; The tile LoD Level
  • uniform float tile_width; The tile width (E-W) in m
  • uniform float tile_height; The tile height (N-S) in m
  • uniform bool photoScenery; Whether a photo scenery is enabled and available for this tile
  • uniform vec4 dimensionsArray[128]; The dimensions of a given atlas texture index
  • uniform vec4 ambientArray[128]; The ambient color of a given atlas texture index
  • uniform vec4 diffuseArray[128]; The diffuse color of a given atlas texture index
  • uniform vec4 specularArray[128]; The specular color of a given atlas texture index
  • uniform mat4 zUpTransform; The matrix to rotate a given vertex in model space into a z-up frame
  • uniform vec3 modelOffset; The origin of this tile in earth centered coordinates


Possible things old/exotic compilers/GPUs may throw errors or have rendering issues with

It's possible there are some old compilers out there that may have issues with WS3. Maybe there are compilers that barely support glsl 1.3. Maybe compilers for mobile or newer GPU vendors might have the odd issue: Intel integrated, Intel discrete (Iris), Apple GPUs (M1 and newer), AMD APUs. Maybe some Linux opensource drivers will throw issues.

These are some possible future culprits that weren't present in WS2:

- Addressing vectors by index instead of member e.g. v[2] instead of v.z

- textureGrad() - possible to use textureLod() and manually calculate the lod - GLSL 1.30. dFdx()/dFdy() - GLSL 1.20. textureSize() - GLSL 1.30
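
Manually calculating the lod for textureLod() usually follows the common approximation below, shown here in C++ so the arithmetic can be checked on the CPU. It is a sketch under assumptions: the names are hypothetical, anisotropic filtering and lod bias are ignored, and real hardware may compute the level slightly differently. In a shader the two derivative inputs would come from dFdx()/dFdy() of the texture coordinate.

```cpp
#include <algorithm>
#include <cmath>

struct Vec2f { float x, y; };

// Mip level from the screen-space derivatives of a texture coordinate:
// scale the derivatives to texels, take the longer of the two footprint
// axes, and convert its length to a level with log2.
float manualLod(Vec2f dUVdx, Vec2f dUVdy, Vec2f textureSizeTexels)
{
    const float dxu = dUVdx.x * textureSizeTexels.x;
    const float dxv = dUVdx.y * textureSizeTexels.y;
    const float dyu = dUVdy.x * textureSizeTexels.x;
    const float dyv = dUVdy.y * textureSizeTexels.y;
    const float maxLenSq = std::max(dxu * dxu + dxv * dxv, dyu * dyu + dyv * dyv);
    // 0.5 * log2(len^2) == log2(len); clamp so magnification maps to level 0.
    return std::max(0.0f, 0.5f * std::log2(maxLenSq));
}
```

A footprint of 1 texel per pixel gives level 0, and 4 texels per pixel gives level 2.
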