[DataChallenge] Check and Improve the Data availability. #7

Open
mmaelicke opened this issue Jan 14, 2021 · 2 comments
Labels
Data Challenge (this issue is Data Challenge eligible)

Comments

@mmaelicke
Member

mmaelicke commented Jan 14, 2021

This is part of a Data Challenge

Description

While uploading HOBO data into the database, you will run into error messages like this:

Uploading Metdata
Some entries are missing metadata:
----------------------------------
             name   hobo_id data_available    region  latitude  longitude exposition  altitude influence description
27  S*******, Leon  10350064             no  Freiburg       NaN        NaN        NaN       NaN       NaN         NaN
Done.                                                                                               
Downloading data repository. This may take a minute.
Done.
Start uploading. You can grab a coffee...                                                           
File ./data-master/hobo/2021/raw/10350071.csv references HOBO ID=10350071, which is not found.      
Parsing file './data-master/hobo/2021/hourly/10801132_Th.csv' was not successfull.                  
Do not edit the files by hand!
                                                                                                    
Parsing file './data-master/hobo/2021/hourly/10347394_Th.csv' was not successfull.                  
Do not edit the files by hand!
                                                                                                    
Parsing file './data-master/hobo/2021/hourly/10350068_Th.csv' was not successfull.                  
Do not edit the files by hand!
                                                                                                    
Done.                                                                                               
100% (36 of 36) |#############################################| Elapsed Time: 0:00:34 ETA:  00:00:00

Several things can go wrong when you load data from a remote location into a database with a script. Luckily, the hydenv CLI caught most of the errors and gave you (hopefully) expressive error messages. For this data challenge, you need to run the CLI over all HOBO data and identify all caught (and maybe uncaught) errors. Open an issue for each kind of error to describe what is going wrong and discuss possible solutions.
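One way to get an overview before opening issues is to group the CLI output by kind of error. The following is only a minimal sketch: it assumes the two message formats seen in the log excerpt above (including the CLI's own spelling of "successfull"), and the function name `classify_log` and the kind labels are made up for illustration. Any other message the CLI may print is not covered and would have to be added by hand.

```python
import re
from collections import defaultdict

# Patterns mirror the wording of the two error messages in the log excerpt.
# The kind labels (dictionary keys) are hypothetical names, not hydenv terms.
PATTERNS = {
    "unknown_hobo_id": re.compile(
        r"^File (?P<path>\S+) references HOBO ID=(?P<hobo_id>\d+), which is not found\."
    ),
    "parse_failure": re.compile(
        r"^Parsing file '(?P<path>[^']+)' was not successfull\."
    ),
}


def classify_log(text):
    """Group the file paths named in error lines by kind of error."""
    found = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()  # the CLI pads lines with trailing spaces
        for kind, pattern in PATTERNS.items():
            match = pattern.match(line)
            if match:
                found[kind].append(match.group("path"))
                break
    return dict(found)


# Sample lines copied from the log excerpt in this issue.
sample = """\
File ./data-master/hobo/2021/raw/10350071.csv references HOBO ID=10350071, which is not found.
Parsing file './data-master/hobo/2021/hourly/10801132_Th.csv' was not successfull.
Do not edit the files by hand!
Parsing file './data-master/hobo/2021/hourly/10347394_Th.csv' was not successfull.
"""

for kind, paths in classify_log(sample).items():
    print(kind, len(paths), paths)
```

Saving the CLI output of each year to a text file and passing its content to `classify_log` would give one list of affected files per kind of error, which maps directly onto the "one issue per kind of error" rule.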

Assignment

Before you take action, invite @mmaelicke and @modche into the discussion. The data challenge is solved when all problems in the HOBO data files from all past years have been discussed and solved wherever possible. That means all issues opened in the context of this data challenge need to be verified and closed by an instructor.

mmaelicke added the Data Challenge label Jan 14, 2021
@modche
Contributor

modche commented Jan 14, 2021

Leon has cancelled the course, but I can ask him to give us his metadata.

@mmaelicke
Member Author

Leon has cancelled the course, but I can ask him to give us his metadata.

That would be great anyway, so I can add the details if nobody takes up this data challenge. We'll see...
