We organize the HOBO measurements in a folder and file structure. The folder, the file name and the MIME type therefore carry meaning and already encode valuable metadata. The folder location is: /hobo/<year>/<type>/<hobo_id>.(csv|txt).
<year> is the year the data lecture took place
<type> is the interesting part here. It encodes the type of data and can be /raw/ or /hourly/.
The file names contain the identifier of the measuring device, which can be related to the metadata for the corresponding year.
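To illustrate how the path itself carries metadata, here is a minimal sketch that splits a file location into its parts. The names `HOBO_PATH`, `HoboFile` and `parse_hobo_path` are hypothetical, and the pattern assumes a four-digit year and a device id without slashes; neither is stated in the issue.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern for /hobo/<year>/<type>/<hobo_id>.(csv|txt):
# four-digit year, type restricted to raw or hourly, id without slashes.
HOBO_PATH = re.compile(
    r"^/?hobo/(?P<year>\d{4})/(?P<type>raw|hourly)/(?P<hobo_id>[^/]+)\.(?P<ext>csv|txt)$"
)


class HoboFile(NamedTuple):
    """Metadata recovered from a file location (hypothetical container)."""
    year: int
    data_type: str
    hobo_id: str
    extension: str


def parse_hobo_path(path: str) -> Optional[HoboFile]:
    """Return the metadata encoded in the path, or None if it does not match."""
    match = HOBO_PATH.match(path)
    if match is None:
        return None
    return HoboFile(
        year=int(match.group("year")),
        data_type=match.group("type"),
        hobo_id=match.group("hobo_id"),
        extension=match.group("ext"),
    )
```

A call such as `parse_hobo_path("/hobo/2021/raw/10347.csv")` would yield the year, the data type, the device id and the extension as separate fields; the example path is made up for illustration.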
The raw HOBO measurements are uploaded by the students each year, and quality controls are worked out and implemented. This step could be automated by a GitHub Action. This would include the following steps:
identify quality checks that work for all raw data in the repository
add a new folder called /scripts with a qpclib.(R|py) file that defines the checks and transforms
add one script per file type and/or year (as necessary) that uses the functions provided by qpclib.(R|py)
implement a GitHub Action that runs the scripts whenever new HOBO data is added
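A rough sketch of what a qpclib.py could look like follows. The issue does not name the concrete checks, so the range check, the persistence check, their thresholds, and all function names here are assumptions chosen only for illustration.

```python
from datetime import datetime
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Reading = Tuple[datetime, float]  # (timestamp, measured value)


def check_range(values: Sequence[float], low: float = -20.0, high: float = 70.0) -> List[bool]:
    """Pass values inside [low, high]; the bounds are placeholder assumptions."""
    return [low <= v <= high for v in values]


def check_persistence(values: Sequence[float], window: int = 6) -> List[bool]:
    """Fail a value when it completes a run of `window` identical readings."""
    passed = [True] * len(values)
    for i in range(window - 1, len(values)):
        run = values[i - window + 1 : i + 1]
        if all(v == run[0] for v in run):
            passed[i] = False
    return passed


def to_hourly(readings: Sequence[Reading], passed: Sequence[bool]) -> Dict[datetime, float]:
    """Average the passing readings per hour; hours without passing data are omitted."""
    buckets: Dict[datetime, List[float]] = {}
    for (timestamp, value), ok in zip(readings, passed):
        if ok:
            hour = timestamp.replace(minute=0, second=0, microsecond=0)
            buckets.setdefault(hour, []).append(value)
    return {hour: mean(vals) for hour, vals in buckets.items()}
```

A per-year script would then import these functions, combine the flags from each check, and write the hourly result to disk. The GitHub Action only needs to call that script when files under the raw folders change.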
Finally, the quality checks changed slightly from year to year, and in many cases individual students made adaptations to their implementation. The results should therefore be persisted in a separate folder so that they can be compared to the provided hourly data.
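The comparison against the provided hourly data could be sketched as follows. The function name `compare_hourly` and the tolerance value are hypothetical, and both inputs are assumed to be mappings from an hour key to a value.

```python
from typing import Dict, Hashable, Optional, Tuple


def compare_hourly(
    computed: Dict[Hashable, float],
    provided: Dict[Hashable, float],
    tolerance: float = 0.05,  # assumed threshold, not taken from the issue
) -> Dict[Hashable, Tuple[Optional[float], Optional[float]]]:
    """Return the hours where the two series disagree or one side is missing.

    Each entry maps an hour key to (computed value, provided value);
    a missing value is reported as None.
    """
    differences = {}
    for key in sorted(set(computed) | set(provided)):
        ours = computed.get(key)
        theirs = provided.get(key)
        if ours is None or theirs is None or abs(ours - theirs) > tolerance:
            differences[key] = (ours, theirs)
    return differences
```

An empty result would indicate that the automated checks reproduce the student-provided hourly file within the chosen tolerance. Any remaining entries point to hours where the implementations diverge.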
Using Python over R is generally preferred for this task, as the integration in automated workflows can be quite a hassle with R.
This issue is part of a DataChallenge.