=== Scraping ===
{{Note|This will download roughly 4 GB of data in ~17k files for each AIRAC cycle!}}
* The scraper should support caching
* It should also support interrupting and resuming a scrape
* Alternatively, use a media pipeline <ref>http://sergeis.com/web-scraping/downloading-files-scrapy-mediapipeline/</ref>
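The caching and resume requirements listed above could probably be covered by Scrapy's built-in settings rather than custom code. The following is only a minimal sketch, assuming the scraper is a Scrapy spider; the helper name <code>build_settings</code>, the directory names, and the cycle identifier are hypothetical and not part of the existing scraper.

<syntaxhighlight lang="python">
# Hypothetical helper: returns a settings dict that could be used as a
# spider's custom_settings or merged into a project's settings.py.
def build_settings(airac_cycle):
    return {
        # Store every HTTP response on disk so that repeated runs do not
        # download the same files again.
        "HTTPCACHE_ENABLED": True,
        "HTTPCACHE_DIR": "httpcache",        # assumed directory name
        "HTTPCACHE_EXPIRATION_SECS": 0,      # 0 means cached responses never expire
        # Persist the scheduler queue and the set of seen requests so that
        # an interrupted crawl can be resumed later. One directory per
        # AIRAC cycle keeps the crawl states separate (assumed layout).
        "JOBDIR": "crawls/airac-" + str(airac_cycle),
    }


if __name__ == "__main__":
    print(build_settings("2301"))
</syntaxhighlight>

With a <code>JOBDIR</code> set, a crawl stopped with a single Ctrl-C should pick up where it left off when it is started again with the same directory. Whether the HTTP cache also covers files fetched through a media pipeline has not been checked here.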
<syntaxhighlight lang="python">
import os