restructure data handling for non-danish use cases #18
Re technical vs. communication layer: TBD what to do if the user provides the communication layer directly.

looks good!

For the last comment (what to do if the user provides the comm layer directly): the script for creating the communication layer could just check if a comm layer already exists/is provided? And if not, it should use the tech layer to create a comm layer.

Re not hardcoding categories: very good point. Then what about just having the subfolders

Re comm script: yes, cool, so what about this:

Re checking specifications: I think since we are asking for a lot of hand-crafted data sets, we could have a script to be run in the very beginning which checks whether all data sets are there and whether they are in the right format, and if not, tells you what is missing / what is wrong?

Yes, that makes sense - but maybe also provide some list of names/labels in a config? I was thinking something like providing a list of polygon types ['nature','forest','bad','culture'] and then asking that the gpkg files are named accordingly (i.e. nature.gpkg, forest.gpkg, etc.)

Could we also just drop the whole name-label in the config and use the filename as the name of the category? Like you as a user provide, in the

Yes, absolutely (and then just state that the file names are used as category names, i.e. rename your files if you want plots with nice and clean labels).

change
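A minimal sketch of how the two ideas above (filename-derived category names plus an up-front input check) could combine. The folder names follow the layout discussed in this issue; the function names, messages, and the exact set of expected subfolders are assumptions, not anything agreed here:

```python
from pathlib import Path

# Folders the evaluation expects under data/raw; which ones are truly
# required is still open in this thread, so this set is illustrative.
EXPECTED_SUBFOLDERS = ["studyarea", "polygon", "point", "elevation"]

def categories(subfolder: Path) -> list[str]:
    """Use the .gpkg file names themselves as category names (no
    name/label list in the config): polygon/nature.gpkg -> 'nature'."""
    return sorted(p.stem for p in subfolder.glob("*.gpkg"))

def check_inputs(raw: str = "data/raw") -> list[str]:
    """Run once at startup: report what is missing / wrong before any
    processing starts. An empty list means all inputs look OK."""
    problems = []
    for name in EXPECTED_SUBFOLDERS:
        folder = Path(raw) / name
        if not folder.is_dir():
            problems.append(f"missing folder: {folder}")
        elif name in ("polygon", "point") and not categories(folder):
            problems.append(f"no .gpkg files found in {folder}")
    return problems
```

With this approach, renaming `nature.gpkg` is all it takes to rename the category in plots, exactly as suggested above.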
@anerv what do you think of this?
general setup:
- `data/raw` subfolders: this is where the user provides input.
- `data/processed` mirroring the `data/raw` structure; contains all the gpkg output of the evaluation (currently saved to `results`) and all files that are currently saved to `data/processed/workflow_steps` (distributed across subfolders, e.g. all the nodes.. and edges.. files from `data/processed/workflow_steps` will instead go into `data/processed/network`)
- `results` to contain only plots and stats subfolders (the gpkg outputs are saved into the corresponding `data/processed` subfolders)

Repo will look like so:
Required user input:
- `data/raw/studyarea`: study area polygon
- `data/raw/polygon`: polygon layers to evaluate (hardcoded options: `nature`, `culture`, `agriculture`, `tourism`, `verify`); each layer is optional
- `data/raw/point`: point layers to evaluate (hardcoded options: `facility`, `service`, `poi`)
- `data/raw/linestring`: potentially feature-match (like BikeDNA) to another network provided in linestring format? But let's rather leave that as a future feature request.
- `data/raw/elevation`: elevation data for the study area

User configurations
In config file:
Denmark use case:
We say that in general the user needs to generate the input themselves. However, in the case of DK, we provide (in a separate repo) all the code and data necessary to generate the inputs to `data/raw` for a user-provided region (defined by municipality codes). Output: after running the single script ("merge study layers") for the DK municipalities indicated by the user in the config file here, the `output` folder will contain exactly the folders and data that are needed as user input for the "general" repo above.

DK-repo will look like so:
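The hand-off described above — the DK repo's `output` folder becomes the general repo's `data/raw` — could be sketched like this. The folder names come from the text above; the function name and the copy-everything behavior are assumptions:

```python
import shutil
from pathlib import Path

def install_dk_output(dk_output: str, general_repo: str) -> None:
    """Copy the DK repo's generated `output` folder into the general
    repo's `data/raw`, where the evaluation expects its user input."""
    dest = Path(general_repo, "data", "raw")
    dest.mkdir(parents=True, exist_ok=True)
    for item in Path(dk_output).iterdir():
        target = dest / item.name
        if item.is_dir():
            # Merge into any existing subfolder rather than failing.
            shutil.copytree(item, target, dirs_exist_ok=True)
        else:
            shutil.copy2(item, target)
```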