Howto:Canvas Path Benchmarking

From FlightGear wiki

Revision as of 16:22, 26 January 2018

This article is a stub. You can help the wiki by expanding it.
Screenshot showing a Canvas GUI dialog with 3000 random OpenVG paths (line segments) drawn (for benchmarking purposes) [1]

Background

One present-day property-tree write performance bottleneck is complex canvas instruments: take any of the Shuttle displays, which show roughly a hundred sensor readings each, and there are nine of those. In the worst case that means fetching roughly 900 properties from the FDM, converting them to whatever units and strings we want to display, and writing them again to the tree for canvas. Combine that with the ADI ball, which renders a 2D projection of a 3D shape, adding another number of the same order of magnitude to write. If we did a naive 'all in one frame' update, you'd see the bottleneck; the only reason you don't is that we fetch and update staggered. There's a handful of other cases in AW where property I/O is a bottleneck and Thorsten had to work around it. [2]
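
The staggered fetch-and-update idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Shuttle's actual code: the sources and targets vectors, the chunk size and the number format are all hypothetical.

```nasal
# Hypothetical sketch: refresh only a slice of the readings per frame.
var sources = [];   # property paths to read (assumed to be filled elsewhere)
var targets = [];   # matching canvas text elements (assumed to be filled elsewhere)
var chunk   = 100;  # readings refreshed per frame (arbitrary choice)
var cursor  = 0;    # index of the next reading to refresh

var update_chunk = func {
    var n = size(sources);
    if (n == 0) { return; }
    var last = cursor + chunk;
    if (last > n) { last = n; }
    for (var i = cursor; i < last; i += 1) {
        var v = getprop(sources[i]);
        if (v == nil) { v = 0; }            # missing property: show 0
        targets[i].setText(sprintf("%.1f", v));
    }
    # wrap around once every reading has been refreshed
    cursor = (last >= n) ? 0 : last;
}

# an interval of 0 runs the callback once per frame
var stagger_timer = maketimer(0, update_chunk);
stagger_timer.start();
```

With 900 readings and a chunk size of 100, each reading is refreshed every ninth frame, which trades display latency for a lower per-frame property I/O cost.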


Assembling a property path by string manipulation may in theory be less appealing, but in practice it is 3 to 10 times faster than using the props module (implemented in scripting space). Thorsten has made several benchmark tests, all leading to the same result. Large-scale property manipulation from Nasal is performance-hungry and should be avoided where possible by using Nasal-internal variables instead; if it needs to be done, getprop()/setprop() offer significantly better performance. If you dig a bit in the mailing list archive, there should be a post with the actual benchmark test results.[3]
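
A comparison of this kind could be set up along the following lines. This is only a sketch: the scratch branch /sim/test/bench and the iteration count are assumptions, and the timings will depend on the machine and the FlightGear version, so no results are claimed here.

```nasal
var base = "/sim/test/bench";   # hypothetical scratch branch
var n = 1000;                   # number of writes per run (arbitrary)

# Variant 1: build the full path as a string and use setprop()
var via_setprop = func {
    for (var i = 0; i < n; i += 1) {
        setprop(base ~ "/value[" ~ i ~ "]", i);
    }
}

# Variant 2: go through the props.Node wrappers from props.nas
var via_props = func {
    var root = props.globals.getNode(base, 1);
    for (var i = 0; i < n; i += 1) {
        root.getChild("value", i, 1).setValue(i);
    }
}

debug.benchmark("setprop() with string paths", via_setprop);
debug.benchmark("props.Node API", via_props);
```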

setprop is nearly always the fastest, but not always; sometimes getValue (with a relative path) can be slower. But setprop was at most ~2 times faster than the other methods. We need to investigate this further to see where the time is lost, probably somewhere while creating Nasal objects. Maybe directly using a ghost for property nodes will make it faster. Also, the methods with relative paths and set/getprop use exactly the same code on the C++ side, so the performance is lost somewhere between the props.nas module and the C++/Nasal bindings.[4]

In the second case the property tree is walked through by the C++ code, while in the first case it's done in Nasal. You just discovered that Nasal is 10x slower than C++ code![5]

It's definitely slower, because there's more work to do. Evaluating "node.getValue()" requires:

  • Pushing the symbol "node" onto the stack
  • Executing OP_LOCAL to look it up
  • Pushing the symbol "getValue" onto the stack
  • Executing OP_MEMBER to look it up in the object
  • Executing OP_CALL to call it as a function
  • Finally (!) calling the C++ property node function
  • Turning the output node into a Nasal object and leaving it on the stack.[6]


A bigger issue might be that both getChild() and getNode() create and return a hash object that is only used once here before becoming garbage. Since this was a loop with many iterations, a good deal of garbage was created, so possibly a significant amount of the time was spent in the garbage collector. OTOH, the string handling also ought to generate garbage, but the short strings are probably smaller than the hash objects.[7]
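
One way to probe the garbage-creation point is to compare a loop that looks the node up on every iteration with one that looks it up once and reuses the wrapper. The branch name and iteration count below are assumptions made for illustration.

```nasal
var n = 1000;                                               # arbitrary iteration count
var parent = props.globals.getNode("/sim/test/bench", 1);   # hypothetical scratch branch

# Each getNode() call returns a fresh wrapper object that is used once
# and then left for the garbage collector.
var lookup_each_time = func {
    for (var i = 0; i < n; i += 1) {
        parent.getNode("counter", 1).setValue(i);
    }
}

# The wrapper is created once and reused for every write.
var lookup_once = func {
    var node = parent.getNode("counter", 1);
    for (var i = 0; i < n; i += 1) {
        node.setValue(i);
    }
}

debug.benchmark("getNode() inside the loop", lookup_each_time);
debug.benchmark("getNode() hoisted out of the loop", lookup_once);
```

Both variants still pay the method-call overhead listed above for setValue(); only the per-iteration lookup and the object it creates differ.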

Motivation

See Shuttle ADI ball for the main article about this subject.

Screenshot showing a Canvas GUI dialog next to the built-in Property browser, illustrating how there is a 1:1 mapping between the number of line segments/paths added, and the internal representation in the global FlightGear property tree.

We suggest avoiding the 'remove all children' logic, which is obviously unfriendly to the Canvas[8][9]. The whole removeAllChildren() method is unfriendly to the canvas and should eventually be replaced with something more efficient.[10]

This first showed up with the taxiways at KNUQ: Moffett Federal AFLD (KNUQ) seemed to take longer to display than it did before. It always took a long time to display, during which FG hangs.[11]

This was caused by massive property I/O, i.e. thousands of properties added to the global property tree to render complex 2D geometry.
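
To get a feeling for how many properties a canvas actually occupies, the nodes below /canvas can be counted recursively. The helper name countProps is made up for this sketch; it only uses getNode() and getChildren() from props.nas.

```nasal
# Count a node plus all of its descendants.
var countProps = func(node) {
    var total = 1;
    foreach (var child; node.getChildren()) {
        total += countProps(child);
    }
    return total;
}

# Canvas textures and their elements live below /canvas in the global tree.
var canvasRoot = props.globals.getNode("/canvas", 1);
print("properties below /canvas: ", countProps(canvasRoot));
```

Running this before and after opening a dialog such as the one in the Code section shows how the count grows with the number of path segments.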

Assume you have written one array (a Nasal vector) into a path, and now you want to replace the path element with one corresponding to a second array.[12]
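
A minimal sketch of that scenario, reusing one path element and calling setData() a second time instead of removing and recreating children. The helper toPathData, the sample values and the flat coordinate layout (x0, y0, x1, y1, ...) are assumptions of this example.

```nasal
var window = canvas.Window.new([320, 240], "dialog").set("title", "setData() swap");
var root   = window.createCanvas().createGroup();
var trace  = root.createChild("path", "trace")
                 .setColor(0, 1, 0)
                 .setStrokeLineWidth(1);

# Hypothetical helper: turn a vector of y-values into path commands and
# a flat coordinate vector, spacing the points dx pixels apart.
var toPathData = func(values, dx = 10) {
    var cmds = [];
    var coords = [];
    forindex (var i; values) {
        append(cmds, i == 0 ? canvas.Path.VG_MOVE_TO : canvas.Path.VG_LINE_TO);
        append(coords, i * dx, values[i]);
    }
    return {cmds: cmds, coords: coords};
}

var first  = toPathData([20, 60, 40, 80]);
var second = toPathData([90, 30, 70, 10]);

trace.setData(first.cmds, first.coords);    # draw the first array
trace.setData(second.cmds, second.coords);  # later: replace it in place
```

The second call still rewrites every cmd and coord property, so it avoids recreating the element but not the property I/O itself.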

Code

Google perftools profile showing the overhead of the property I/O when running the benchmark.
###
# Based on, and adapted from: 
# http://wiki.flightgear.org/Canvas_Snippets#Adding_OpenVG_Paths

var (width,height) = (640,480);
var title = 'Canvas.Path.setData() test';

var window = canvas.Window.new([width,height],"dialog").set('title',title);

var myCanvas = window.createCanvas().set("background", canvas.style.getColor("bg_color"));

var root = myCanvas.createGroup();

var graph = root.createChild("group");

var x_axis = graph.createChild("path", "x-axis")
.moveTo(10, height/2)
.lineTo(width-10, height/2)
.setColor(1,0,0)
.setStrokeLineWidth(3);

var y_axis = graph.createChild("path", "y-axis")
.moveTo(10, 10)
.lineTo(10, height-10)
.setColor(0,0,1)
.setStrokeLineWidth(2);


var plot = graph.createChild("path", "plot")
.moveTo(10, height/2)
.setColor(0,1,0)
.setStrokeLineWidth(0.5);

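var randomPoints = func(total=3000) {
    total = int(total);  # the caller may pass a fractional count
    window.set('title',title~" points="~total);

    var cmds = [];
    var points = [];

    # emit exactly `total` line segments
    for(var i=0;i<total;i+=1) {
        append(cmds, canvas.Path.VG_LINE_TO);
        var x = 10+rand() * (width-10) + rand()*5;
        var y = 10+rand() * (height-10) + rand()*8;
        # flat coordinate vector (x0, y0, x1, y1, ...), as in the Canvas Snippets examples
        append(points, x, y);
        # print("x/y: ", x, y);
    }
    return {cmds:cmds, points:points};
}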


var setData_test = func(data) {
plot.setData( data.cmds, data.points);
}



var updateCallback = func(max_points=5500) {
    # pick a whole number of points so the label and the data agree
    var totalPoints = int(rand() * max_points);
    var myPoints = randomPoints(totalPoints);
    debug.benchmark("setData() test, points:"~totalPoints, func setData_test(myPoints) );
}
# http://wiki.flightgear.org/Built-in_Profiler#Nasal


canvas.InputDialog.getText("Canvas.Path Benchmark", "Max number of points ?", func(btn,value) {
    if (value) {
        fgcommand("profiler-start");
        var updateTimer = maketimer(0.5, func updateCallback(max_points:int(value)) );
        updateTimer.start();
        # stop the benchmark automatically after 30 seconds
        settimer(func updateTimer.stop(), 30);

        # stop the timer and the profiler when the window is closed
        window.del = func()
        {
            print("Cleaning up window:",title,"\n");
            updateTimer.stop();
            call(canvas.Window.del, [], me);
            fgcommand("profiler-stop");
        };
    }
});

Ideas

How and why the removeChildren() idiom is unfriendly to the Canvas

It's really the number of property accesses that contributes to the performance issue.[13]


api.nas:

 # Set the path data (commands and coordinates)
  setData: func(cmds, coords)
  {
    me.reset();
    me._node.setValues({cmd: cmds, coord: coords});
    me._last_cmd = size(cmds) - 1;
    me._last_coord = size(coords) - 1;
    return me;
  }
Note  This should be really straightforward to reimplement in C++ as part of nasal-props.cxx (~50 LOC) using the APIs documented at Howto:Extend_Nasal.

props.nas:

##                                                                             
# Useful utility.  Sets a whole property tree from a Nasal hash                
# object, such that scalars become leafs in the property tree, hashes          
# become named subnodes, and vectors become indexed subnodes.  This            
# works recursively, so you can define whole property trees with               
# syntax like:                                                                 
#                                                                              
# dialog = {                                                                   
#   name : "exit", width : 180, height : 100, modal : 0,                       
#   text : { x : 10, y : 70, label : "Hello World!" } };                       
#                                                                              
Node.setValues = func(val) {                                                   
    foreach(var k; keys(val)) { me._setChildren(k, val[k]); }                  
}                                                                              
                                                                               
##                                                                             
# Private function to do the work of setValues().                              
# The first argument is a child name, the second a nasal scalar,               
# vector, or hash.                                                             
#                                                                              
Node._setChildren = func(name, val) {                                          
    var subnode = me.getNode(name, 1);                                         
    if(typeof(val) == "scalar") { subnode.setValue(val); }                     
    elsif(typeof(val) == "hash") { subnode.setValues(val); }                   
    elsif(typeof(val) == "vector") {                                           
        for(var i=0; i<size(val); i+=1) {                                      
            var iname = name ~ "[" ~ i ~ "]";                                  
            me._setChildren(iname, val[i]);                                    
        }                                                                      
    }                                                                          
}

props.Node is pretty bad, and passing separate arguments to getprop() is as good as (if not better than) passing one argument (without concatenating it, i.e. a scalar constant).[14]
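
The two call styles mentioned in the quote look roughly like this. The sketch assumes that getprop() accepts several path components, as the quote implies; the branch and property names are made up.

```nasal
var branch = "/sim/test/bench";            # hypothetical scratch branch
setprop(branch ~ "/altitude-ft", 1234);

# one pre-concatenated path string
var a = getprop(branch ~ "/altitude-ft");

# separate path components, no string concatenation in Nasal space
var b = getprop(branch, "altitude-ft");

print("concatenated: ", a, "  separate arguments: ", b);
```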

The point was just that there's TONS of data in the property tree in cases like KNUQ (1200+ paths).

And I used props.copy() as a test case to see how long it takes in Nasal space to write so many properties to the tree, because that's exactly what the code is doing too: setting 1200+ paths in the canvas.

So I wanted to see how long this operation takes in and of itself.[15]
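
A test of that kind might look like the following sketch. The branch names, the single coord child per path and the use of systime() for timing are assumptions of this example; real canvas paths carry many more properties each.

```nasal
var src = props.globals.getNode("/sim/test/copy-src", 1);   # hypothetical branches
var dst = props.globals.getNode("/sim/test/copy-dst", 1);

# build 1200 dummy "path" nodes, each with one child property
for (var i = 0; i < 1200; i += 1) {
    src.getChild("path", i, 1).getNode("coord", 1).setValue(i);
}

# time the copy in Nasal space
var start = systime();
props.copy(src, dst);
printf("props.copy() of 1200 paths took %.3f s", systime() - start);
```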


Extend nasal-props.cxx to implement props.setValues() and props.setChildren() in C++ (or directly use Nasal/CppBind).

cppbind used for Nasal/C++ bindings, but probably it'll be better to port the whole file at once). One thing I don't like is the signature of getValue and similar methods. I'd keep it similar to the core SGPropertyNode and not allow creating new nodes with e.g. getValue, but instead return a default value (which can be passed as a default argument). Otherwise, having relative paths is definitely a useful addition.[16]

getprop/setprop are known not just for being "quick & dirty", but also for outperforming the props.nas magic performance-wise, including the Nasal ghosts, which is plausible because of the API and stack machine overhead. Andy himself mentioned that several times in the past, e.g. see: http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg12222.html Also, Thorsten has previously run benchmarks to compare props.nas vs. setprop/getprop.

Like Tom mentioned previously, it would probably make sense to eventually port the whole props.nas stuff to use cppbind and then proceed from there; that should leave little Nasal-space overhead, and once everything is in C++ space, it will be straightforward to see how to optimize things there using the google perftools fgcommands. Performance-wise, many fgcommands and Nasal extension functions show up, even without necessarily invoking the GC. And the constant string building is a known issue, but there are certainly workarounds or special-case optimizations possible, even if it's just a simple space/time tradeoff in the form of a cached property tree. [17]

We should not provide the wrappers in props.nas (which are naCodes) and instead directly port the C++ extensions (naCCodes) to live directly in the object? That could be done using the naRef me and a hash lookup for "_g" (which should be interned and/or cached just like arg and parents are). I'm not sure I understand your last paragraph though...[18]

I think we've been toying with the idea of using Tom's cppbind framework to update/modernize the props.nas/props.cxx stuff, and it would obviously also help with Canvas-related stuff where lots of property tree I/O is a factor (i.e. the taxiways layer). Given that 3.0 is now out, this could be a good opportunity to re-implement props.nas on top of cppbind. Indeed, it would be interesting to benchmark a test case first: in theory, once implemented in native code, a cached property.setValue() call should be faster than plain setprop("foo", value). [19]


Related

References

  1. https://forum.flightgear.org/viewtopic.php?f=71&t=30864&hilit=removeAllChildren
  2. Thorsten Renk (Nov 28th, 2016). Re: [Flightgear-devel] Explicit recursive listeners. https://sourceforge.net/p/flightgear/mailman/message/35517837/
  3. Renk Thorsten (Apr 15th, 2013). Re: [Flightgear-devel] Nasal props API relative path support.
  4. Thomas Geymayer (Apr 15th, 2013). Re: [Flightgear-devel] Nasal props API relative path support.
  5. Erik Hofman (Sep 15th, 2010). Re: [Flightgear-devel] Musings on optimizing Nasal code.
  6. Andy Ross (Aug 15th, 2007). Re: [Flightgear-devel] nasal variables.
  7. Anders Gidenstam (Sep 16th, 2010). Re: [Flightgear-devel] Musings on optimizing Nasal code.
  8. zakalawe (Nov 4th, 2013). Re: How to display Airport Chart?
  9. Hooray (Oct 4th, 2015). Re: Space Shuttle.
  10. Hooray (Nov 6th, 2013). Re: How to display Airport Chart?
  11. stuart (Sep 27th, 2012). Re: Using a .
  12. Thorsten (Oct 30th, 2016). Re: Canvas ADI ball (shuttle) / circular clipping.
  13. Hooray (Sep 20th, 2012). Re: Using a canvas map in the GUI.
  14. Philosopher (Feb 20th, 2014). getprop() and .
  15. Hooray (Sep 20th, 2012). Re: .
  16. TheTom (Apr 10th, 2013). Re: Relative paths for Nasal.
  17. Hooray (Apr 15th, 2013). Re: Relative paths for Nasal.
  18. Philosopher (Apr 17th, 2013). Re: Relative paths for Nasal.
  19. Hooray (Feb 20th, 2014). Re: getprop() and .