Tests failing #31

Open
glederrey opened this issue Mar 28, 2022 · 1 comment

Comments

@glederrey

Dear authors,

It's me again. =)

After updating the UKCensusAPI to 1.1.6, I tried to run the tests given in the setup.py file. However, both tests are failing. See log below:

(microsynth) C:\Users\glede\Documents\EPFL\PhD\household_microsynth>python setup.py test
running test
WARNING: Testing via this command is deprecated and will be removed in a future version. Users looking for a generic test entry point independent of test runner are encouraged to use tox.
running egg_info
writing household_microsynth.egg-info\PKG-INFO
writing dependency_links to household_microsynth.egg-info\dependency_links.txt
writing requirements to household_microsynth.egg-info\requires.txt
writing top-level names to household_microsynth.egg-info\top_level.txt
reading manifest file 'household_microsynth.egg-info\SOURCES.txt'
writing manifest file 'household_microsynth.egg-info\SOURCES.txt'
running build_ext
test_hh1 (test_all.Test) ... ERROR
test_sc1 (test_all.Test) ... ERROR

======================================================================
ERROR: test_hh1 (test_all.Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\glede\Documents\EPFL\PhD\household_microsynth\tests\test_all.py", line 15, in test_hh1
    microsynth = hh_msynth.Household(region, resolution, cache)
  File "C:\Users\glede\Documents\EPFL\PhD\household_microsynth\household_microsynth\household.py", line 21, in __init__
    self.api_sc = Api_sc.NRScotland(cache_dir)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\ukcensusapi\NRScotland.py", line 112, in __init__
    self.area_lookup = pd.read_csv(str(self.cache_dir / "sc_lookup.csv"))
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 460, in _read
    data = parser.read(nrows)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 1198, in read
    ret = self._engine.read(nrows)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 2157, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 918, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 905, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 2042, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 11, saw 4

-------------------- >> begin captured logging << --------------------
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.nomisweb.co.uk:443
urllib3.connectionpool: DEBUG: https://www.nomisweb.co.uk:443 "GET / HTTP/1.1" 200 38765
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.scotlandscensus.gov.uk:443
urllib3.connectionpool: DEBUG: https://www.scotlandscensus.gov.uk:443 "GET / HTTP/1.1" 200 16081
--------------------- >> end captured logging << ---------------------

======================================================================
ERROR: test_sc1 (test_all.Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\glede\Documents\EPFL\PhD\household_microsynth\tests\test_all.py", line 32, in test_sc1
    microsynth = hh_msynth.Household(region, resolution, cache)
  File "C:\Users\glede\Documents\EPFL\PhD\household_microsynth\household_microsynth\household.py", line 21, in __init__
    self.api_sc = Api_sc.NRScotland(cache_dir)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\ukcensusapi\NRScotland.py", line 112, in __init__
    self.area_lookup = pd.read_csv(str(self.cache_dir / "sc_lookup.csv"))
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 460, in _read
    data = parser.read(nrows)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 1198, in read
    ret = self._engine.read(nrows)
  File "C:\Users\glede\AppData\Roaming\Python\Python36\site-packages\pandas\io\parsers.py", line 2157, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 918, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 905, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 2042, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 11, saw 4

-------------------- >> begin captured logging << --------------------
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.nomisweb.co.uk:443
urllib3.connectionpool: DEBUG: https://www.nomisweb.co.uk:443 "GET / HTTP/1.1" 200 38765
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.scotlandscensus.gov.uk:443
urllib3.connectionpool: DEBUG: https://www.scotlandscensus.gov.uk:443 "GET / HTTP/1.1" 200 16081
--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 2 tests in 1.429s

FAILED (errors=2)
Test failed: <unittest.runner.TextTestResult run=2 errors=2 failures=0>
error: Test failed: <unittest.runner.TextTestResult run=2 errors=2 failures=0>

The problem still seems to come from the parser, after the files have been fetched from the Scotland API. I checked the file sc_lookup.csv in the cache folder: it is not a CSV at all but an HTML page reporting a 404 error, which explains why pandas expects 1 field and then sees 4 on line 11.
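
A cached file that is really an HTML error page can be caught before it reaches pd.read_csv. The following is a minimal sketch only: `looks_like_html` is a hypothetical helper, not part of UKCensusAPI, and the check (the file starts with an HTML doctype or tag) is an assumed heuristic.

```python
from pathlib import Path


def looks_like_html(path, probe_bytes=512):
    """Return True if the file starts like an HTML document rather than a CSV.

    Hypothetical helper: only the first `probe_bytes` bytes are inspected.
    """
    head = Path(path).read_bytes()[:probe_bytes].lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))
```

If it returns True for cache/sc_lookup.csv, deleting that file and downloading it again is probably more useful than retrying the parse.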

Would it be possible for you to run the tests with the latest version of the UKCensusAPI library? That would show whether the problem is on my side (I have many virtual environments, so that is possible) or whether UKCensusAPI still has issues with the Scotland API.
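
With several environments installed, one quick check is to ask Python which file a package would actually be imported from. In the Windows traceback above, ukcensusapi is loaded from AppData\Roaming\Python\Python36\site-packages, so it may not be the copy installed in the active environment. A small sketch, where `module_origin` is a hypothetical helper name:

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Running `print(module_origin("ukcensusapi"))` inside the activated environment shows which installation is picked up.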

Thanks in advance!

@glederrey

Dear authors,

The Scotland API seems to work. =D At least it worked on my Linux machine; I still need to investigate why it doesn't work on my Windows machine.

However, the tests are still not passing. Here's the log:

WARNING: Testing via this command is deprecated and will be removed in a future version. Users looking for a generic test entry point independent of test runner are encouraged to use tox.
/home/gael/Applications/anaconda/lib/python3.8/site-packages/pandas/_testing.py:24: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  import pandas._libs.testing as _testing
test_hh1 (test_all.Test) ... /home/gael/Applications/anaconda/lib/python3.8/site-packages/pandas/core/indexes/base.py:395: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  elif issubclass(data.dtype.type, np.bool) or is_bool_dtype(data):
/home/gael/Applications/anaconda/lib/python3.8/site-packages/pandas/core/dtypes/cast.py:214: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  elif not isinstance(r[0], (np.integer, np.floating, np.bool, int, float, bool)):
ERROR
test_sc1 (test_all.Test) ... ERROR

======================================================================
ERROR: test_hh1 (test_all.Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/tests/test_all.py", line 23, in test_hh1
    microsynth.run()
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/household_microsynth/household.py", line 59, in run
    constraints = seed.get_survey_TROBH() #[1,2,3,4,5,6,7]
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/household_microsynth/seed.py", line 31, in get_survey_TROBH
    a[tuple(pivot.index.labels)] = pivot.values.flat
AttributeError: 'MultiIndex' object has no attribute 'labels'
-------------------- >> begin captured logging << --------------------
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.nomisweb.co.uk:443
urllib3.connectionpool: DEBUG: https://www.nomisweb.co.uk:443 "GET / HTTP/1.1" 200 38769
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.scotlandscensus.gov.uk:443
urllib3.connectionpool: DEBUG: https://www.scotlandscensus.gov.uk:443 "GET / HTTP/1.1" 200 16081
--------------------- >> end captured logging << ---------------------

======================================================================
ERROR: test_sc1 (test_all.Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/tests/test_all.py", line 32, in test_sc1
    microsynth = hh_msynth.Household(region, resolution, cache)
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/household_microsynth/household.py", line 34, in __init__
    self.__get_census_data()
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/household_microsynth/household.py", line 317, in __get_census_data
    return self.__get_census_data_sc()
  File "/home/gael/Documents/EPFL/PhD/Research/household_microsynth/household_microsynth/household.py", line 325, in __get_census_data_sc
    self.lc4402 = self.api_sc.get_data("LC4402SC", self.region, self.resolution,
  File "/home/gael/Documents/EPFL/PhD/Research/UKCensusAPI/ukcensusapi/NRScotland.py", line 200, in get_data
    meta, raw_data = self.__get_rawdata(table, resolution)
  File "/home/gael/Documents/EPFL/PhD/Research/UKCensusAPI/ukcensusapi/NRScotland.py", line 154, in __get_rawdata
    exit(1)
  File "/home/gael/Applications/anaconda/lib/python3.8/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
SystemExit: 1
-------------------- >> begin captured stdout << ---------------------
Running in 'scotland mode'
Problem: The census data uses a proprietary compression algorithm (probably deflate64) and cannot be extracted by the python zip package.
Solution: manually extract this archive using a non-python extraction tool: cache/Output_Area_blk.zip
e.g. use 7zip, or (on linux):

$ unzip cache/Output_Area_blk.zip

or, if you only need a specfic table:

$ unzip cache/Output_Area_blk.zip -d cache LC4402SC.csv

Please also consider politely asking NRScotland to change the compression algorithm!


--------------------- >> end captured stdout << ----------------------
-------------------- >> begin captured logging << --------------------
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.nomisweb.co.uk:443
urllib3.connectionpool: DEBUG: https://www.nomisweb.co.uk:443 "GET / HTTP/1.1" 200 38769
urllib3.connectionpool: DEBUG: Starting new HTTPS connection (1): www.scotlandscensus.gov.uk:443
urllib3.connectionpool: DEBUG: https://www.scotlandscensus.gov.uk:443 "GET / HTTP/1.1" 200 16081
--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 2 tests in 1.754s

FAILED (errors=2)
Test failed: <unittest.runner.TextTestResult run=2 errors=2 failures=0>
error: Test failed: <unittest.runner.TextTestResult run=2 errors=2 failures=0>

I understand that the second error comes from the Scotland archive having to be unzipped manually (which I did before launching a second test run). Would it be possible to bypass this error when the unzipped files are already present in the cache?
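
One way such a bypass could look is sketched below. It is an assumption about a possible fix, not the UKCensusAPI implementation: `ensure_table_csv` is a hypothetical helper, and the layout (one `<table>.csv` per table next to Output_Area_blk.zip in the cache folder) is taken from the captured stdout above.

```python
import zipfile
from pathlib import Path


def ensure_table_csv(cache_dir, table, archive_name="Output_Area_blk.zip"):
    """Return the path to <table>.csv, extracting it only if it is missing."""
    cache_dir = Path(cache_dir)
    csv_path = cache_dir / f"{table}.csv"
    if csv_path.is_file():
        # Already extracted (e.g. manually with unzip or 7zip): skip the archive.
        return csv_path
    try:
        with zipfile.ZipFile(cache_dir / archive_name) as zf:
            zf.extract(f"{table}.csv", path=cache_dir)
    except NotImplementedError as exc:
        # zipfile raises this for compression methods it cannot decode.
        raise RuntimeError(
            f"Cannot extract {table}.csv with Python's zipfile; "
            f"extract {archive_name} manually into {cache_dir}"
        ) from exc
    return csv_path
```

Raising an exception instead of calling exit(1) would also let the test runner report a normal error rather than a SystemExit.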

As for the first error (the AttributeError on 'MultiIndex' in seed.py), I don't know where it comes from. Could you have a look at it?
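
For context, this AttributeError matches a pandas API change: `MultiIndex.labels` was renamed to `MultiIndex.codes` in pandas 0.24, and the old name was removed in pandas 1.0. The sketch below shows a version-tolerant accessor on toy data; `multiindex_codes` and the example frame are illustrative assumptions, not code from household_microsynth.

```python
import numpy as np
import pandas as pd


def multiindex_codes(index):
    """Return the integer level codes of a MultiIndex across pandas versions."""
    # `codes` exists since pandas 0.24; older releases only provide `labels`.
    return index.codes if hasattr(index, "codes") else index.labels


# Toy data standing in for the survey pivot table.
df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1], "n": [10, 20, 30, 40]})
pivot = df.pivot_table(index=["a", "b"], values="n", aggfunc="sum")

# Scatter the pivot values into a dense array, one axis per index level.
arr = np.zeros((2, 2))
arr[tuple(multiindex_codes(pivot.index))] = pivot.values.ravel()
```

Applied to line 31 of seed.py, the equivalent change would be to read `pivot.index.codes` instead of `pivot.index.labels`.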

Thanks a lot!
