Howto:Processing d-tpp using Python: Difference between revisions

264 bytes added, 28 November 2017

@@ Line 3: / Line 3: @@
 == Idea ==
+[[File:KSFO-28RILS.png|thumb|Screenshot showing the scraped, converted and transformed approach chart for KSFO 28R (aspect ratio doesn't matter for machine learning purposes, i.e. we can use random scaling/ratios here to come up with artificial training data).]]
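The random scaling mentioned in the caption above could be sketched roughly as follows. This is only a minimal illustration that picks randomly distorted target sizes for a chart image; the function name <code>random_sizes</code>, the scale range and the seed parameter are assumptions made for this example and are not part of any existing script. The actual resizing would be left to an image library (for instance OpenCV's <code>cv2.resize</code>).

```python
import random


def random_sizes(width, height, count, scale_range=(0.5, 1.5), seed=None):
    """Return `count` (new_width, new_height) pairs for one source image.

    Each axis is scaled independently, so the aspect ratio is deliberately
    not preserved. The scale range is a hypothetical default.
    """
    rng = random.Random(seed)  # local generator keeps the result reproducible
    low, high = scale_range
    sizes = []
    for _ in range(count):
        scale_x = rng.uniform(low, high)
        scale_y = rng.uniform(low, high)
        # never return a zero-sized image
        sizes.append((max(1, round(width * scale_x)),
                      max(1, round(height * scale_y))))
    return sizes
```

Each returned pair could then be used as the target size for one artificial training sample generated from the same chart.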
 If processing actual PDFs to procedurally "retrieve" such navigational data is ever supposed to "fly", I think it would have to be done using OpenCV running in a background thread (actually a bunch of threads in a separate process) and using machine learning: feeding it a bunch of manually annotated PDFs, segmenting each PDF into sub-areas (horizontal/vertical profile, frequencies, identifier, etc.) and running neural networks on them.
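To make the segmentation step a bit more concrete, here is a minimal sketch of how a rendered chart page might be split into named sub-areas on a pool of background threads. Everything in it is an assumption for illustration: the region names merely follow the examples given above, the fractional boxes are made-up placeholders rather than real chart layout, and the functions <code>segment_page</code> and <code>segment_pages_in_background</code> are hypothetical. The real image work (cropping, feature detection, feeding a network) would be done by OpenCV on the resulting rectangles.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: region name -> (left, top, right, bottom),
# each given as a fraction of the page size. Placeholder values only.
CHART_REGIONS = {
    "identifier": (0.0, 0.00, 1.0, 0.05),
    "frequencies": (0.0, 0.05, 1.0, 0.15),
    "horizontal_profile": (0.0, 0.15, 1.0, 0.70),
    "vertical_profile": (0.0, 0.70, 1.0, 0.90),
}


def segment_page(width, height, regions=CHART_REGIONS):
    """Convert fractional region boxes into pixel rectangles for one page."""
    boxes = {}
    for name, (left, top, right, bottom) in regions.items():
        boxes[name] = (round(left * width), round(top * height),
                       round(right * width), round(bottom * height))
    return boxes


def segment_pages_in_background(page_sizes, max_workers=4):
    """Segment several pages on worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda size: segment_page(*size), page_sizes))
```

For example, <code>segment_pages_in_background([(1000, 2000)])</code> returns one dictionary of pixel rectangles per page. In a trained system the fixed boxes would presumably be replaced by regions learned from the manually annotated PDFs.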
@@ Line 8: / Line 9: @@
 It is an interesting problem, and it would address a number of legal issues too, just as downloading such data from the web works for a reason. Still, I believe it would be a rather complex piece of software, and we would want to get people involved who have experience with machine learning and computer vision (OpenCV). It is essentially a superset of doing OCR on approach charts: not just looking for a character set, but recognizing the actual document structure and the "iconography" for airports, navaids, route markers and so on.
 == Motivation ==
 [[File:Chart-scraping.png|thumb|Screenshot showing scrapy scraping d-TPPs]]