Another example in the wild #53

Open
juliohm opened this issue Apr 23, 2021 · 7 comments

Member

juliohm commented Apr 23, 2021

Similar to #51. I have updated the Drive folder with the second example that failed: https://drive.google.com/drive/folders/1EerBvkuS8h3SX20nbqjxq__FhiUaL3yO?usp=sharing

I have also updated the environment to the latest version of Shapefile.jl v0.7.1 and can confirm that the first file is now loading correctly. What do you think is happening to this one? It was failing before as well, so you don't need to worry about the latest changes merged into master.

Member Author

juliohm commented Apr 27, 2021

Is there an easy method using ArchGDAL.jl to "fix" the shape file by reading and writing back to disk? I could write a helper function here to fix these shape files in the wild.

Member

meggart commented Apr 27, 2021

I just tried reading the file you linked with the latest master of Shapefile.jl and did not see any error. Is this OS-dependent? Are you sure you are using the latest patch?

Member Author

juliohm commented Apr 27, 2021

Oh maybe it is OS-dependent then. I am on the latest patch, and this is what I get:

ERROR: EOFError: read end of file
Stacktrace:
  [1] unsafe_read(s::IOStream, p::Ptr{UInt8}, nb::UInt64)
    @ Base ./iostream.jl:425
  [2] unsafe_read
    @ ./io.jl:722 [inlined]
  [3] read!
    @ ./io.jl:740 [inlined]
  [4] read(io::IOStream, #unused#::Type{Shapefile.PolygonZ})
    @ Shapefile ~/.julia/packages/Shapefile/OsF2W/src/Shapefile.jl:260
  [5] read(io::IOStream, ::Type{Shapefile.Handle}, index::Shapefile.IndexHandle)
    @ Shapefile ~/.julia/packages/Shapefile/OsF2W/src/Shapefile.jl:355
  [6] #1
    @ ~/.julia/packages/Shapefile/OsF2W/src/Shapefile.jl:139 [inlined]
  [7] open(f::Shapefile.var"#1#2"{Shapefile.IndexHandle}, args::String; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./io.jl:330
  [8] open
    @ ./io.jl:328 [inlined]
  [9] Handle
    @ ~/.julia/packages/Shapefile/OsF2W/src/Shapefile.jl:138 [inlined]
 [10] Handle
    @ ~/.julia/packages/Shapefile/OsF2W/src/Shapefile.jl:376 [inlined]
 [11] Shapefile.Table(path::String)
    @ Shapefile ~/.julia/packages/Shapefile/OsF2W/src/table.jl:38

My version info:

Julia Version 1.6.0
Commit f9720dc2eb (2021-03-24 12:55 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 8

Member

meggart commented Apr 27, 2021

Oh, sorry, I was looking at the wrong Shapefile. Yes, I can confirm the bug now. I will try to find out what is happening.

Member

meggart commented Apr 27, 2021

I think your file is just truncated. The branch in #54 makes it possible to read it, but there is simply some data missing in the very last entry of the Shapefile. When you read the data using GDAL, does the last record (number 3024) have data and if yes, is it complete?
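
One way to test the truncation hypothesis independently of any reader is to compare the file length declared in the `.shp` header with the size on disk. The sketch below relies only on the shapefile format itself: the main file header stores the total file length at byte offset 24 as a big-endian 32-bit integer counted in 16-bit words. The helper name `shp_length_check` is hypothetical and is not part of Shapefile.jl.

```julia
# Hypothetical helper, not part of Shapefile.jl: compare the file length
# declared in the .shp header with the actual size of the file on disk.
function shp_length_check(path::AbstractString)
    declared = open(path) do io
        seek(io, 24)                    # file length field starts at byte 24
        2 * Int(ntoh(read(io, Int32)))  # big-endian count of 16-bit words -> bytes
    end
    actual = filesize(path)
    return (declared = declared, actual = actual, truncated = actual < declared)
end
```

Calling `shp_length_check("path/to/file.shp")` returns a named tuple. If `truncated` is `true`, the file on disk is shorter than its header claims. A mismatch only shows that the header and the file disagree; it does not show which record is affected or how another reader such as GDAL handles the missing bytes.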

Member

visr commented Apr 27, 2021

> Is there an easy method using ArchGDAL.jl to "fix" the shape file by reading and writing back to disk?

The simplest option is probably to call ogr2ogr, which ships with GDAL_jll.

import GDAL_jll

old_shp = "has-issue.shp"
new_shp = "from-gdal.shp"

GDAL_jll.ogr2ogr_path() do ogr2ogr
    run(`$ogr2ogr -f "ESRI Shapefile" $new_shp $old_shp`)
end

Member Author

juliohm commented Apr 29, 2021

> I think your file is just truncated. The branch in #54 makes it possible to read it, but there is simply some data missing in the very last entry of the Shapefile. When you read the data using GDAL, does the last record (number 3024) have data and if yes, is it complete?

Sorry for the delay. Yes, it seems complete with GDAL. In the same script I shared in the Drive folder, you can inspect the result of GeoDataFrames.jl which reads using GDAL:

julia> df[3023,:]
DataFrameRow
  Row │ Estado  Mes    Class_Name    Mes_Ano  Ano    Sensor      Area     geom                    
      │ String  Int32  String        String   Int32  String      Float64  IGeometr               
──────┼───────────────────────────────────────────────────────────────────────────────────────────
 3023 │ PA         11  desmatamento  11_2018   2018  Sentinel-1   0.2428  Geometry: wkbPolygon25D

julia> df[3024,:]
DataFrameRow
  Row │ Estado  Mes    Class_Name    Mes_Ano  Ano    Sensor      Area     geom                    
      │ String  Int32  String        String   Int32  String      Float64  IGeometr               
──────┼───────────────────────────────────────────────────────────────────────────────────────────
 3024 │ PA         11  desmatamento  11_2018   2018  Sentinel-1   0.2051  Geometry: wkbPolygon25D

As you can see, the last row of the data is complete.
