FlightGear wiki talk:Instant-Refs: Difference between revisions

* runtime of the expression


The search space, and therefore the runtime, can be reduced significantly by comparing all examples and extracting a common substring that contains their identical components (e.g. the <code>From:</code> part in the author regex), then using that substring to seed the initial generations.
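One way to derive such a seed is a longest-common-substring search over the example lines. The following is only a minimal sketch: the function names <code>longestCommonSubstring</code> and <code>escapeRegExp</code>, and the sample lines (built from the author names used in the tests), are assumptions for illustration and not part of the existing script.

```javascript
// Return the longest substring shared by every string in `samples`.
function longestCommonSubstring(samples) {
  if (samples.length === 0) return '';
  // Only substrings of the shortest sample can occur in all of them.
  const shortest = samples.reduce((a, b) => (b.length < a.length ? b : a));
  for (let len = shortest.length; len > 0; len--) {
    for (let start = 0; start + len <= shortest.length; start++) {
      const candidate = shortest.slice(start, start + len);
      if (samples.every((s) => s.includes(candidate))) return candidate;
    }
  }
  return '';
}

// Escape regex metacharacters so the literal seed can be embedded in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical sample lines; the real input would come from the downloaded markup.
const samples = ['From: Erik Hofman', 'From: Ludovic Brenta', 'From: Tim Moore'];
const seed = longestCommonSubstring(samples);
// Seed expression: the shared literal prefix followed by a capture group.
const seedRegex = new RegExp(escapeRegExp(seed) + '(.+)');
```

The brute-force search is quadratic in the length of the shortest sample, which should be acceptable for short header lines; a suffix-based approach would be preferable for whole documents.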
 
Ultimately, this would allow the script to self-update its regex/xpath expressions whenever the underlying website (or its theme) changes. It would also make it possible to add support for new websites without ever manually writing the required xpath/regex expressions: all that is needed is a sufficiently large number of example datasets, each providing the expected author, date and title, plus a URL from which the script can download the HTML markup of the posting in question:
 
<syntaxhighlight lang="javascript"> // vector with tests to be executed for sanity checks (unit testing)
    tests: [
      {
        url: 'https://sourceforge.net/p/flightgear/mailman/message/35059454/',
        author: 'Erik Hofman',
        date: 'May 3rd, 2016', // NOTE: using the transformed date here
        title: 'Re: [Flightgear-devel] Auto altimeter setting at startup (?)'
      },
      {
        url: 'https://sourceforge.net/p/flightgear/mailman/message/35059961/',
        author: 'Ludovic Brenta',
        date: 'May 3rd, 2016',
        title: 'Re: [Flightgear-devel] dual-control-tools and the limit on packet size'
      },
      {
        url: 'https://sourceforge.net/p/flightgear/mailman/message/20014126/',
        author: 'Tim Moore',
        date: 'Aug 4th, 2008',
        title: 'Re: [Flightgear-devel] Cockpit displays (rendering, modelling)'
      },
      {
        url: 'https://sourceforge.net/p/flightgear/mailman/message/23518343/',
        author: 'Tim Moore',
        date: 'Sep 10th, 2009',
        title: '[Flightgear-devel] Atmosphere patch from John Denker'
      } // add other tests below
 
    ], // end of vector with self-tests
</syntaxhighlight>
 
Note how this no longer contains any hard-coded xpath/regex expressions - instead, the script can look up the website-specific defaults and try those first. If they fail, it can use them to seed new generations and evolve those procedurally until all tests succeed.
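The "defaults first, evolve on failure" flow could be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: <code>fitness</code>, <code>findExpression</code> and the <code>evolve</code> callback are hypothetical names, and the examples are matched against in-memory strings rather than markup downloaded from the test URLs.

```javascript
// Fraction of examples for which the regex's first capture group
// equals the expected value (1 means every test passes).
function fitness(regex, examples) {
  if (examples.length === 0) return 0;
  const hits = examples.filter(({ text, expected }) => {
    const match = regex.exec(text);
    return match !== null && match[1] === expected;
  });
  return hits.length / examples.length;
}

// Try the site-specific defaults first; only fall back to evolving
// new candidates (seeded with those defaults) when none passes all tests.
// `evolve` is a hypothetical callback standing in for the genetic search.
function findExpression(defaults, examples, evolve) {
  for (const candidate of defaults) {
    if (fitness(candidate, examples) === 1) return candidate;
  }
  return evolve(defaults, examples, fitness);
}

// Hypothetical in-memory examples; the real script would fetch each test URL.
const examples = [
  { text: 'From: Erik Hofman', expected: 'Erik Hofman' },
  { text: 'From: Tim Moore', expected: 'Tim Moore' },
];
const defaults = [/^Author: (.+)$/, /^From: (.+)$/];
const found = findExpression(defaults, examples, () => null);
```

Here the first default fails every example and the second passes all of them, so the evolution step is never invoked. Passing <code>fitness</code> to <code>evolve</code> lets the search reuse the same scoring that decides whether a default is still valid.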




* http://regex.inginf.units.it/how.html
