Howto:Processing d-tpp using Python
Revision as of 17:16, 28 November 2017
This article is a stub. You can help the wiki by expanding it.
Motivation
The goal is to build Python tooling that automatically downloads aviation charts and classifies them for further processing and parsing (data extraction). The charts are published at: http://155.178.201.160/d-tpp/
We will download two different AIRAC cycles; at the time of writing, these are 1712 and 1713:
- http://155.178.201.160/d-tpp/1712/
- http://155.178.201.160/d-tpp/1713/
Data sources
Chart Classification
- STARs - Standard Terminal Arrivals
- IAPs - Instrument Approach Procedures
- DPs - Departure Procedures
Modules
XML Processing
http://155.178.201.160/d-tpp/1712/xml_data/d-TPP_Metafile.xml
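A minimal sketch of how the metafile could be read with the standard library, grouping chart records by type so that only STARs, IAPs and DPs are kept. The element and attribute names (airport_name, apt_ident, record, chart_code, chart_name, pdf_name) are assumptions about the metafile layout and should be checked against the real file. The embedded sample, including its airport identifier and PDF names, is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up excerpt; the tag and attribute names are assumptions about the metafile.
SAMPLE = """<digital_tpp cycle="1712">
  <state_code ID="XX">
    <city_name ID="EXAMPLE CITY">
      <airport_name ID="EXAMPLE AIRPORT" apt_ident="XYZ">
        <record>
          <chart_code>IAP</chart_code>
          <chart_name>EXAMPLE APPROACH</chart_name>
          <pdf_name>EXAMPLE_IAP.PDF</pdf_name>
        </record>
        <record>
          <chart_code>STAR</chart_code>
          <chart_name>EXAMPLE ARRIVAL</chart_name>
          <pdf_name>EXAMPLE_STAR.PDF</pdf_name>
        </record>
        <record>
          <chart_code>APD</chart_code>
          <chart_name>AIRPORT DIAGRAM</chart_name>
          <pdf_name>EXAMPLE_APD.PDF</pdf_name>
        </record>
      </airport_name>
    </city_name>
  </state_code>
</digital_tpp>"""

# Chart types listed under "Chart Classification".
WANTED_CODES = frozenset({"STAR", "IAP", "DP"})


def charts_by_code(xml_text, wanted=WANTED_CODES):
    """Return {chart_code: [chart dicts]} for the wanted chart types only."""
    root = ET.fromstring(xml_text)
    charts = defaultdict(list)
    for airport in root.iter("airport_name"):
        ident = airport.get("apt_ident")
        for record in airport.iter("record"):
            code = record.findtext("chart_code", default="").strip()
            if code not in wanted:
                continue  # skip chart types we do not process
            charts[code].append({
                "airport": ident,
                "name": record.findtext("chart_name", default="").strip(),
                "pdf": record.findtext("pdf_name", default="").strip(),
            })
    return dict(charts)


if __name__ == "__main__":
    for code, entries in charts_by_code(SAMPLE).items():
        print(code, [entry["pdf"] for entry in entries])
```

For the real metafile, the downloaded XML text would be passed to charts_by_code() in place of SAMPLE.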
Scraping
Downloading
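One way the download step might look, using requests from the prerequisites. This is a sketch under assumptions: chart_url() and download_chart() are hypothetical helper names, and the code assumes that each chart PDF sits directly under its cycle directory (for example .../d-tpp/1712/SOME_CHART.PDF), which should be verified against the server.

```python
from pathlib import Path
from urllib.parse import urljoin

BASE_URL = "http://155.178.201.160/d-tpp/"


def chart_url(cycle, pdf_name, base_url=BASE_URL):
    """Build the URL of one chart PDF (assumes PDFs live directly in the cycle directory)."""
    return urljoin(base_url, f"{cycle}/{pdf_name}")


def download_chart(cycle, pdf_name, dest_dir="charts", timeout=30):
    """Download one chart PDF to dest_dir/<cycle>/ and return the local path."""
    import requests  # third-party; imported here so chart_url() works without it

    target = Path(dest_dir) / str(cycle) / pdf_name
    if target.exists():
        return target  # already downloaded, do not fetch again
    target.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary name first so an interrupted download leaves no partial PDF.
    partial = target.with_name(target.name + ".part")
    with requests.get(chart_url(cycle, pdf_name), stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(partial, "wb") as handle:
            for chunk in response.iter_content(chunk_size=65536):
                handle.write(chunk)
    partial.replace(target)
    return target
```

Calling download_chart("1712", name) and download_chart("1713", name) for each PDF name taken from the metafile would fetch both cycles into separate directories.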
Converting to images
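A possible conversion step using pdf2image from the prerequisites, rendering every page of a downloaded chart to a PNG file. The helper names png_name() and pdf_to_pngs(), the output naming scheme and the 150 dpi default are assumptions made for this sketch. pdf2image relies on the poppler utilities being installed on the system.

```python
from pathlib import Path


def png_name(pdf_path, page_number):
    """Derive an output file name such as CHART_p1.png (naming scheme is an assumption)."""
    return f"{Path(pdf_path).stem}_p{page_number}.png"


def pdf_to_pngs(pdf_path, out_dir="images", dpi=150):
    """Render each page of a PDF to a PNG file and return the written paths."""
    from pdf2image import convert_from_path  # third-party; requires poppler

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    # convert_from_path returns one PIL image per page.
    pages = convert_from_path(str(pdf_path), dpi=dpi)
    for number, page in enumerate(pages, start=1):
        target = out / png_name(pdf_path, number)
        page.save(target, "PNG")
        written.append(target)
    return written
```

A higher dpi value gives sharper images for later classification or OCR at the cost of larger files.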
Uploading to the GPU
Classification
OCR
Prerequisites
Install the following packages with pip install --user:
- requests
- pdf2image
Code
See also
- https://github.com/euske/pdfminer
- https://dzone.com/articles/pdf-reading
- https://automatetheboringstuff.com/chapter13/
- https://www.binpress.com/tutorial/manipulating-pdfs-with-python/167
- https://github.com/pmaupin/pdfrw