User talk:Philosopher/Nasal introspection: Difference between revisions

m (Cut some unneeded stuff (that's what history is for - lucky us), add headings)
Line 14: Line 14:
—[[User:Hooray|Hooray]]


: This stuff is only printed out to the log file ($FG_HOME/nasal_test_data.log) on simulator exit, which means that you have to either use the timer, the menu, or escape to exit (i.e. use the fgcommand directly/indirectly). Currently, and for reasons I don't understand, that seems to result in an infinite loop, so no you won't see anything. Frame rates are terrible here too, especially once one starts moving… [[User:Philosopher|—Philosopher]] ([[User talk:Philosopher|talk]]) 18:31, 14 August 2013 (UTC)
: This stuff is only printed out to the log file ($FG_HOME/nasal_test_data.log) on simulator exit, which means that you have to either use the timer, the menu, or escape to exit (i.e. use the fgcommand directly/indirectly). Frame rates are terrible here too, especially once one starts moving… [[User:Philosopher|—Philosopher]] ([[User talk:Philosopher|talk]]) 18:31, 14 August 2013 (UTC)


<hr/>
<hr/>


<hr/>
<hr/>
<code>Nasal garbage collection statistics: objects: 9915841, references: 19306781</code>
Regarding your changes in gc.c @ 230+ – I think we can use the same method to count active/reachable references per func, just by modifying mark() to increment a counter once it is invoked on a func, which would give us a rough idea how heavy some functions are – we could also distinguish between references to the active/lower scope and outer scope references (globals etc) —[[User:Hooray|Hooray]]
 
I just hope that this is not because of my patch that you took from the wiki – I don’t think I ever really tested it, it wasn’t intended to be used in “production code” – did it look sane to you ?
I need to check again … in other words, if the GC stats look better without my patch, it’s definitely my fault :–/
 
—[[User:Hooray|Hooray]]
 
: No, this is not due to any patch, this is definitely my fault :( —[[User:Philosopher|Philosopher]]
 
:: Regarding your changes in gc.c @ 230+ – I think we can use the same method to count active/reachable references per func, just by modifying mark() to increment a counter once it is invoked on a func, which would give us a rough idea how heavy some functions are – we could also distinguish between references to the active/lower scope and outer scope references (globals etc)
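A rough illustration of the counting idea, in Python rather than the C of gc.c: walk everything reachable from one function, mark-phase style, and count the objects marked and the references followed. The object graph, the <tt>children</tt> mapping and the name <tt>count_reachable</tt> are all made up for the example and do not correspond to anything in the Nasal sources.

```python
def count_reachable(root, children):
    """Return (objects, references) reachable from root.

    `children` is a hypothetical adjacency mapping: object -> objects it refers to.
    """
    marked = set()      # objects already visited, like the GC mark bit
    references = 0      # every edge followed, including edges to marked objects
    stack = [root]
    while stack:
        node = stack.pop()
        if node in marked:
            continue
        marked.add(node)
        for child in children.get(node, ()):
            references += 1
            stack.append(child)
    return len(marked), references


# Toy graph: func "f" refers to "a" and "b", "a" also refers to "b".
graph = {"f": ["a", "b"], "a": ["b"], "b": [], "g": ["b"]}
print(count_reachable("f", graph))
print(count_reachable("g", graph))
```

Running the same walk once per func would give a per-function weight. Splitting the count into local versus outer-scope references would need an extra tag on each edge, which this sketch leaves out.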
<hr/>
 
<hr/>
BTW: If I am not mistaken, all of your calls to <tt>do_worker_thread</tt> now create one thread per call – which ends up adding possibly hundreds of threads, depending on the number of calls – in general, it is more efficient to have a pool of threads, say a handful – or ideally, one for each core, and then keep those busy by filling their work queues.
 
Having many more threads than cores is usually problematic, unless those threads are in some way I/O-bound, i.e. they have to “wait”, so that they can be multiplexed across physical cores while waiting
 
—[[User:Hooray|Hooray]]
 
: Okay, for some reason (I just checked) I only saw ~6 threads in a test run (according to the Mac activity monitor), but the code currently committed is too much of a late-night hack :P —[[User:Philosopher|Philosopher]]
:: I think I have a solution now, will try and test it. Create them the same way, but limit the number of threads, and a thread only exits if there are no more jobs to be done, else it acquires a lock, picks up a new job / replaces the global queue with a new one, drops the lock, and starts work. Will see how it goes [[User:Philosopher|—Philosopher]] ([[User talk:Philosopher|talk]]) 18:21, 14 August 2013 (UTC)
:: Yeah, you can use semaphores for that.
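A minimal sketch of the bounded-worker idea described above, written in Python only for illustration (the real code would be Nasal/C). It assumes a single shared queue guarded by one lock. A worker takes the lock only to pick up a job and exits once the queue is empty. <tt>run_jobs</tt> and <tt>max_workers</tt> are hypothetical names, not part of any existing API.

```python
import os
import threading
from collections import deque


def run_jobs(jobs, max_workers=None):
    """Run callables on at most max_workers threads; return results in job order."""
    if max_workers is None:
        max_workers = os.cpu_count() or 1   # roughly one thread per core
    pending = deque(enumerate(jobs))        # shared work queue
    results = [None] * len(pending)
    lock = threading.Lock()

    def worker():
        while True:
            with lock:                      # hold the lock only while picking a job
                if not pending:
                    return                  # no more jobs: the thread exits
                index, job = pending.popleft()
            results[index] = job()          # the actual work runs without the lock

    count = min(max_workers, len(pending))
    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


squares = run_jobs([lambda i=i: i * i for i in range(10)], max_workers=3)
print(squares)
```

The number of threads stays fixed no matter how many jobs are queued, which is the point of the pool. A semaphore or condition variable would only be needed if workers should sleep and wait for jobs that arrive later.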
<hr/>
<hr/>


Line 54: Line 33:
::: I just pushed another threading approach, it's something in between what you just described and what I had before. I'm still getting huge amounts of GC-findable objects, and I have a feeling that's coming from my script, but where?? The queues should be thrown out sometime, as once the worker thread is done with it it should be garbage, and I can't think of anything else that is more than a "temp". Anyways, it sorta works now, frame latencies of 100-200 when I don't garbage collect, but that happens every other frame so really it is in the thousands and going up :P. [[User:Philosopher|—Philosopher]] ([[User talk:Philosopher|talk]]) 20:29, 14 August 2013 (UTC)


::: lol, pretty massive over here:
<pre>
**** Nasal garbage collection statistics: objects: 4933721, references: 10290187
**** Nasal garbage collection statistics: objects: 10130315, references: 20934202
**** Nasal garbage collection statistics: objects: 14722874, references: 30285505
</pre>
 
=== Dealing with collected data ===
As an aside, I'd use another thread to sort the data prior to writing it to a file. I am getting a bunch of stuff at the beginning of the file that simply doesn't seem very relevant in comparison to the rest.


Line 69: Line 42:
::: Disregard that idea: looking at the log file and the type of info written there, it would actually be simpler to directly use Andy's SQLite bindings and dump each profiling run to a real DB that supports key/value lookups and searches. It would take less space too.
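To make the DB idea concrete, here is a small sketch using Python's built-in <tt>sqlite3</tt> module instead of the Nasal bindings. The table layout (<tt>run_id</tt>, <tt>func</tt>, <tt>calls</tt>) and the helper names are assumptions for illustration only; the actual log format may carry different fields.

```python
import sqlite3


def store_run(conn, run_id, samples):
    """Store one profiling run; `samples` maps function name -> call count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "run_id TEXT, func TEXT, calls INTEGER, PRIMARY KEY (run_id, func))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO samples VALUES (?, ?, ?)",
        [(run_id, func, calls) for func, calls in samples.items()],
    )
    conn.commit()


def heaviest(conn, run_id, limit=3):
    """Return the most-called functions of a run, sorted by the database."""
    rows = conn.execute(
        "SELECT func, calls FROM samples WHERE run_id = ? "
        "ORDER BY calls DESC, func LIMIT ?",
        (run_id, limit),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")   # a file path would persist the runs instead
store_run(conn, "run1", {"update": 5, "draw": 12, "init": 1})
print(heaviest(conn, "run1", 2))
```

With the data in a table, sorting and filtering happen in the query, so a separate sorting thread is not needed before writing.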


=== Whole session getting profiled ?? ===
It's confirmed: as previously suggested, your setupDebugExtras() call results in the entire sim session being profiled (see my earlier comments). Ideally, just passing extras to call() should MERELY profile the specified naFunc, nothing else. According to the profile, the entire sim session is profiled during the run, not just the test func, so that's where all the data and refs are coming from.
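For comparison, this is what "profile only the specified function" looks like in Python, using <tt>sys.setprofile</tt> as a stand-in for the Nasal debug hooks. It is only an analogy: the hook is installed right before the call and restored right after, so nothing outside the call is recorded. <tt>profile_call</tt> is a hypothetical helper, not something from the Nasal code.

```python
import sys


def profile_call(func, *args):
    """Call func(*args) and count Python-level calls made during that call only."""
    counts = {}

    def hook(frame, event, arg):
        if event == "call":                   # ignore return and C-function events
            name = frame.f_code.co_name
            counts[name] = counts.get(name, 0) + 1

    previous = sys.getprofile()
    sys.setprofile(hook)                      # start profiling just before the call
    try:
        result = func(*args)
    finally:
        sys.setprofile(previous)              # always restore the previous hook
    return result, counts


def helper():
    return 1


def target():
    return helper() + helper()


def unrelated():
    return 0


result, counts = profile_call(target)
unrelated()                                   # runs after the hook was removed
print(result, counts)
```

The try/finally matters: if the hook stayed installed after the call returned, everything that ran afterwards would end up in the profile too, which matches the symptom described above.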


Line 161: Line 135:
== GC Stats ==
At the very least, we should also dump stats for each naPool, so that we know how objects are distributed (scalars, vectors, hashes, ghosts). Afterwards, we should look into exposing a function that dumps stats per namespace, so that we can use an extension function that dumps stats gathered via mark(), i.e. to tell how heavy a certain namespace is.
== Deadlock ==
I am seeing a deadlock during startup here with your latest code:
<pre>
Add_func try lock
Add_func holds lock
Add_func release lock
Add_func try lock
Add_func holds lock
Add_func try lock
</pre>
(it locks and then tries to lock again)
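The pattern in that log (hold, then try again without releasing) can be reproduced in a few lines. This Python sketch is not the actual Add_func code, and <tt>second_acquire_succeeds</tt> is a hypothetical helper. It uses a timeout so the demo returns instead of hanging, and shows that a plain lock blocks on the second acquire from the same thread while a re-entrant lock does not.

```python
import threading


def second_acquire_succeeds(lock, timeout=0.2):
    """Acquire `lock`, then try to acquire it again from the same thread."""
    lock.acquire()                                # "holds lock"
    try:
        got_it = lock.acquire(timeout=timeout)    # "try lock" while still holding it
        if got_it:
            lock.release()                        # undo the nested acquire
        return got_it
    finally:
        lock.release()                            # "release lock"


print(second_acquire_succeeds(threading.Lock()))    # plain lock: second acquire times out
print(second_acquire_succeeds(threading.RLock()))   # re-entrant lock: second acquire works
```

Two usual ways out: release the lock before calling back into code that takes it again, or switch to a re-entrant (recursive) lock if nested acquisition from the same thread is intended.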