Improving Nasal: Difference between revisions

Switch to {{gitorious url}} to fix the broken Gitorious link.
 
(17 intermediate revisions by 4 users not shown)
So, rather than having Nasal flame wars and talking about "alternatives" like Perl, Python, Javascript or Lua, the idea is to document known Nasal issues so that they can hopefully be addressed eventually.


If you are aware of any major Nasal issues that are not yet covered here, please feel free to add them. It is also a good idea to report such issues in the FlightGear bug tracker: {{create ticket}}


= Consider Opcode reordering =
</syntaxhighlight>
</syntaxhighlight>


This basically means that we only need to worry about a single place when it comes to extending opcodes (and checking in run() that these invalid opcodes aren't used), which also translates into fewer assembly instructions actually being run (2 CMP vs. ~12 per insn). The bytecode interpreter routine itself could be simplified that way, too. In addition, it would make sense to augment the list of opcode enums with an OP_VERSION field that is incremented whenever opcodes are added or removed (which would also be a prerequisite for any caching/serialization schemes):


<syntaxhighlight lang="c">
<syntaxhighlight lang="c">
Besides making a full IDE (which would be ''really'' cool), there are several things that can be done by editing the source code of Nasal to enhance debugging support and speed up development:


* Add build-time/runtime sanity checks for Nasal core internals, especially naRef/GC stuff like Andy's pointer hacks, which caused problems in the past, especially with respect to aggressive compiler optimizations and naHash - see {{Issue|1240}} and Philosopher's comments. In the meantime, consider making naRef stuff '''volatile''' and using gcc attributes to disable any/all optimizations here [http://gcc.gnu.org/wiki/FunctionSpecificOpt] [http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html] {{Not done}}
* Being able to dump the global namespace (see [http://forum.flightgear.org/viewtopic.php?f=30&t=19049&p=182930&#p182930 this topic] for a possible solution) or at least dump things prettily (an unreleased version of the file discussed in [[Nasal Meta-Programming]] has good support for nice formatting). This should probably be a lazy API that can dump an arbitrary namespace recursively; using the canvas, we could then map that to a TreeView.
* Register a callback for handling errors using call() (for parser errors it will need the AST, for runtime errors it would need bytecode access)
* work on abstracting the GC interface (Hooray) {{Progressbar|50}}
* Register a callback for OP_FCALL et al. to be able to time function calls {{Progressbar|80}}. Example [{{gitorious url|nasal-standalone|nasal-experiments|branch=extended-f_call|path=test.nas}}].
* Set breakpoints: register callbacks for values of <tt>(struct Frame*)->ip</tt>.
** Typically supported conditional break point types are [http://www.ofb.net/gnu/gdb/gdb_28.html][http://www.delorie.com/gnu/docs/gdb/gdb_29.html][http://winappdbg.sourceforge.net/HowBreakpointsWork.html]:
* Time other parts of Nasal (not just VM) with a compile-time flag? (could be stored in the Context struct, so that sub contexts would have their own flags, i.e. recursive scripts would not affect each other)
* Also, add some form of Context-based debug/log-level flag for different verbosity levels and phases (parse,codegen,vm,gc) - and maybe don't write it directly to the console, but allow a container/callback to be specified - for better integration/processing by the host app (fgfs)
* Better error messages {{Progressbar|30}}.
** '''Parsing:''' Say something other than "parse error", like "null pointer".
** '''VM:''' Indicate type of variable if wrong type.
** Optimize: not started (it only makes sense to look at optimizations after we're able to instrument/profile a running FG session to come up with hot spots that are either executed frequently or responsible for significant runtime overhead, e.g. due to GC pressure or other issues)
** Working with it: provide Bytecode class in Nasal: not started<!-- (the exposeOpcode() API already exists, most other machinery could be built in scripting space on top of it?)-->
* Inspect Context: {{Progressbar|80}} (passed as argument to callbacks).
* Expose Tokens to Nasal: implemented by Hooray as argument to compile(), should be extended to cover output from lex.c and after blocking in addition to the current after-prec-ing (and before freeing!) support. {{Progressbar|70}}
** Since compile() is used by call(), it should be straightforward to also map a call() hash.callback to do the same thing, so that there's no disparity here.
** Option 2: recognize assignment in the VM and if there is a bindToContext event, set the name of the function based upon either the last LOCAL/MEMBER/HINSERT or the combination of them (i.e. complex lvalues like local.fn). This presents some obvious issues, however:
*** The right-hand side of an assignment is done before the left-hand side, thus one would have to look ahead to see the assignment, which is clearly illegal for the VM to do.
*** Or one could look behind to see a naCode constant being pushed, and give some indication to its naFunc that it now has a name. This is still somewhat illegal, but not dangerous and thus could be done.
*** How is this supposed to deal with multiple symbols aliasing the same function?
** Option 3: abandon <tt>var foo = func(){}</tt> for ECMAScript-like function declaration syntax <tt>function foo() {}</tt>. This would not affect the use of anonymous func expressions but would instead be applicable in cases where we want to say "this function is static (i.e. permanent) and should have a name" (as opposed to the case of temporary storage variables for functions). Regardless of the method used, a name member will have to be added to naFunc, and the VM and error handling procedures will have to be changed accordingly.
** Regarding the last comment: Providing an API to "lock" a symbol/naRef to become immutable would be generally useful, not just for functions, but also for constants (math.pi, FT2M, etc.) and other stuff that may otherwise break consistency. ThorstenR mentioned a couple of times how he's intentionally replicating standard constants in LW/AW just to be on the safe side, because there's no such thing as a "constant" in Nasal. Providing a library function to make naRefs read-only should be straightforward, and could be easily implemented by hooking into the VM to register a callback that yields naRuntimeError() (aka die()) for any such attempts. The method would be scalable to also implement optional typing or min/max/stepping (value ranges), too.
** This should be doable, since I do a naRuntimeError(ctx,naGetError(subcontext)) which ''should'' pass errors from a callback (untested).
** To implement support for immutable symbols (constants, private/protected encapsulation), one would really just need to either lock WRITE access or restrict visibility, which could work analogously to parents, i.e. as embedded protected/private hashes that are honored by the codegen.
* Getting stats on Nasal performance while running: use the callbacks and systime to time things. Hooks into the GC are needed as well. Statistics worth tracking (also look at similar tools like [http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html google perftools]):
** Per function (also handles timers/listeners and other typical FG callbacks):
*** ncalls per frame/cumulative
*** time per call (kept as list) {{Done}} – display min/max/avg/cumulative.
*** number of GC invocations & avg/min/max time
*** number of naNew() calls
*** number & list of names of once-use variables {{Done}}; n that are numbers (i.e. non-GC-managed).
*** try to come up with heuristics to track *identical* temporaries per callback invocation: [[User:Philosopher/Howto:Write Optimized Nasal Code]]
*** maybe some holistic "GC pressure" percentage over time (5, 30, 60, 300 seconds?)
*** GC pressure can be computed by looking not just at new allocations, but also at realloc() events and the mark/reap phases
*** for GC stats, we can also easily access 1) size of all allocated naType pools and 2) percentage that's in use and 3) newBlock() allocations
** Per context/global:
*** time spent
*** locking overhead
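
The per-function statistics listed above (number of calls, min/max/avg/cumulative time) could be collected in one small record per function. The following is only a minimal sketch under stated assumptions, not code from the Nasal sources: <tt>FuncStats</tt>, <tt>stats_init()</tt>, <tt>stats_record()</tt> and <tt>stats_avg()</tt> are hypothetical names, and the elapsed time is assumed to be measured by the caller around the function-call callbacks.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-function statistics record; not part of Nasal. */
typedef struct {
    unsigned long ncalls; /* number of recorded calls */
    double total;         /* cumulative time in seconds */
    double min;           /* shortest call seen so far */
    double max;           /* longest call seen so far */
} FuncStats;

/* Reset a record before the first call is recorded. */
void stats_init(FuncStats* s)
{
    assert(s != NULL);
    s->ncalls = 0;
    s->total = 0.0;
    s->min = 0.0;
    s->max = 0.0;
}

/* Record one call. 'elapsed' is assumed to be measured by the caller,
   e.g. by reading a clock in the callbacks before and after the call. */
void stats_record(FuncStats* s, double elapsed)
{
    assert(s != NULL && elapsed >= 0.0);
    if (s->ncalls == 0 || elapsed < s->min) s->min = elapsed;
    if (s->ncalls == 0 || elapsed > s->max) s->max = elapsed;
    s->total += elapsed;
    s->ncalls++;
}

/* Average time per call; 0 if nothing has been recorded yet. */
double stats_avg(const FuncStats* s)
{
    assert(s != NULL);
    return s->ncalls ? s->total / (double)s->ncalls : 0.0;
}
</syntaxhighlight>

This sketch keeps running aggregates instead of the per-call list mentioned above, so memory use stays constant per function; a list would be needed if individual samples should be displayed later.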

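The idea of locking WRITE access to make symbols immutable can be illustrated with a toy symbol table. This is a hedged sketch only: <tt>Symbol</tt>, <tt>SymTable</tt>, <tt>sym_find()</tt>, <tt>sym_set()</tt> and <tt>sym_lock()</tt> are hypothetical names that do not exist in Nasal, and a real implementation would hook the VM's assignment paths and raise naRuntimeError() where this sketch returns -1.

<syntaxhighlight lang="c">
#include <assert.h>
#include <string.h>

#define MAX_SYMS 16

/* Hypothetical symbol slot; stands in for a namespace entry. */
typedef struct {
    const char* name;
    double value;
    int locked; /* non-zero: further writes are rejected */
} Symbol;

typedef struct {
    Symbol syms[MAX_SYMS];
    int count;
} SymTable;

/* Returns the slot for 'name', or NULL if it is not defined. */
Symbol* sym_find(SymTable* t, const char* name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->syms[i].name, name) == 0)
            return &t->syms[i];
    return NULL;
}

/* Creates or updates a symbol. Returns 0 on success, -1 if the symbol
   is locked or the table is full. */
int sym_set(SymTable* t, const char* name, double value)
{
    assert(t != NULL && name != NULL);
    Symbol* s = sym_find(t, name);
    if (s == NULL) {
        if (t->count >= MAX_SYMS) return -1;
        s = &t->syms[t->count++];
        s->name = name;
        s->locked = 0;
    } else if (s->locked) {
        return -1; /* write to a read-only symbol */
    }
    s->value = value;
    return 0;
}

/* Marks an existing symbol as read-only. Returns -1 if it is unknown. */
int sym_lock(SymTable* t, const char* name)
{
    Symbol* s = sym_find(t, name);
    if (s == NULL) return -1;
    s->locked = 1;
    return 0;
}
</syntaxhighlight>

The same check point could later carry type or value-range constraints instead of a plain flag, as suggested above.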

= Performance / Optimizations =
* A number of performance issues were reported [http://www.mail-archive.com/flightgear-devel@lists.sourceforge.net/msg36668.html] to be related to:
** accidentally registering listeners/callbacks twice (or even more often) without noticing
** never freeing timers/listeners; e.g. we could issue a warning if a listener's ghost is GC'ed while the listener is still active, because there is then no way to clean up the listener
** aircraft/addon scripts setting up listeners and timers without registering a /sim/signals/reset handler that handles cleanup
** always letting timers run at frame rate
* At least overload the settimer/setlistener API to support a singleton/ONCE param that issues a warning once the VM determines that multiple instances were registered?
* Hooray: look into adapting the existing GC scheme to support multiple generations, which is a straightforward optimization even without being a GC expert. It basically boils down to having a single <tt>typedef enum {GC_GEN0, GC_GEN1} GENERATION;</tt> in code.h and then changing all places in $SG_NASAL to use the GC_GEN0 pool by default; i.e. instead of <tt>&globals->pools[type];</tt> we would have <tt>&globals->pools[GC_GEN0][type];</tt> for starters. The GC manager would then by default only mark/reap GEN0 (nursery, young generation), promote any objects that survived the GC phase to GEN1, and only ever mark/reap GEN1 if GEN0 has to be resized (start off with reasonably sized generations, based on real stats, e.g. GEN0 16 MB and GEN1 32 MB). {{Not done}}
** http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Generational_GC_.28ephemeral_GC.29
** http://c2.com/cgi/wiki?GenerationalGarbageCollection
** http://blogs.msdn.com/b/abhinaba/archive/2009/03/02/back-to-basics-generational-garbage-collection.aspx
* Other dynamic languages like Lua or Python have GC hooks to customize the GC and to call it on demand: [http://lua-users.org/wiki/GarbageCollectionTutorial], [http://luatut.com/collectgarbage.html]
* Another straightforward optimization would be exposing an API to allocate new objects in a certain generation (GEN0/GEN1), to directly tell the interpreter about the object's lifetime (until reset/reinit, until aircraft change, timer/frame based).
* marking/reaping could be parallelized using several threads, for each pool - by using write barriers to sync access to naRefs
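
The generational scheme described above (allocate in GEN0, reap unmarked objects there, promote survivors to GEN1) can be sketched as follows. This is a simplified illustration under stated assumptions, not the Nasal GC: <tt>Obj</tt>, <tt>Pool</tt>, <tt>Heap</tt>, <tt>heap_alloc()</tt>, <tt>heap_minor_gc()</tt> and the <tt>GC_NUM_GENS</tt> enumerator are hypothetical, the per-type pool dimension is omitted, and the mark phase is assumed to have run already.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stddef.h>

/* GC_GEN0/GC_GEN1 follow the proposal; GC_NUM_GENS is added here. */
typedef enum { GC_GEN0, GC_GEN1, GC_NUM_GENS } GENERATION;

#define POOL_CAP 8

/* Hypothetical object header: 'marked' is set by an earlier mark phase. */
typedef struct { int id; int marked; } Obj;

typedef struct { Obj* objs[POOL_CAP]; int count; } Pool;

/* One pool per generation; the real layout would be pools[gen][type]. */
typedef struct { Pool pools[GC_NUM_GENS]; } Heap;

/* New objects always start in the young generation. Returns -1 if full. */
int heap_alloc(Heap* h, Obj* o)
{
    Pool* young = &h->pools[GC_GEN0];
    assert(o != NULL);
    if (young->count >= POOL_CAP) return -1;
    young->objs[young->count++] = o;
    return 0;
}

/* Minor collection: only GEN0 is scanned. Unmarked objects are reaped,
   marked ones are promoted to GEN1 (or stay young if GEN1 is full).
   Returns the number of reaped objects. */
int heap_minor_gc(Heap* h)
{
    Pool* young = &h->pools[GC_GEN0];
    Pool* old = &h->pools[GC_GEN1];
    int reaped = 0, kept = 0;
    for (int i = 0; i < young->count; i++) {
        Obj* o = young->objs[i];
        if (!o->marked) {
            reaped++; /* a real GC would free the object here */
            continue;
        }
        o->marked = 0; /* clear the mark for the next cycle */
        if (old->count < POOL_CAP)
            old->objs[old->count++] = o;
        else
            young->objs[kept++] = o;
    }
    young->count = kept;
    return reaped;
}
</syntaxhighlight>

A full collection that also scans GEN1 would only be triggered when GEN0 has to be resized, as proposed above.
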
== Expose additional threading primitives ==
Consider using pthreads; Nasal's threading support is extremely basic [http://plausible.org/nasal/lib.html].


[[Category:Core developer documentation]]
[[Category:Nasal]]
[[Category:Developer Plans]]