Reading the data using C++ #164

Open
jakobvinkas opened this issue Apr 12, 2024 · 1 comment

Comments

@jakobvinkas

I have a Parquet file containing transportation information. In Python I can parse it and extract the data I need as follows:

import pandas as pd
import shapely.wkb
import shapely.geometry
import matplotlib.pyplot as plt
import utm

df = pd.read_parquet("lanes_small.parquet", engine="fastparquet")

plt.figure()
for index, row in df.iterrows():
    # Only rows whose id contains "segment" are plotted.
    if "segment" not in row["id"]:
        continue
    # The geometry column holds WKB-encoded linestrings.
    linestring = shapely.wkb.loads(row["geometry"])
    coordinates = shapely.geometry.mapping(linestring)["coordinates"]
    xy_coordinates = [utm.from_latlon(lat, lon) for lon, lat in coordinates]
    plt.plot(
        [xy_coord[1] for xy_coord in xy_coordinates],
        [xy_coord[0] for xy_coord in xy_coordinates],
    )

plt.show()

I have not been able to do the same in C++. I have tried many different approaches and none of them work. The closest I have gotten is this:

arrow::MemoryPool* pool = arrow::default_memory_pool();
std::shared_ptr<arrow::io::RandomAccessFile> input;
// Dereferencing the Result aborts the program if the file cannot be opened.
input = *arrow::io::ReadableFile::Open(path);

std::unique_ptr<parquet::arrow::FileReader> arrow_reader;
arrow::Status open_status = parquet::arrow::OpenFile(input, pool, &arrow_reader);
if(!open_status.ok()){
    std::cout << "Failed open: " << open_status.ToString();
}

std::shared_ptr<arrow::Table> table;
arrow::Status read_table_status = arrow_reader->ReadTable(&table);
if(!read_table_status.ok()){
    std::cout << "Failed read: " << read_table_status.ToString() << std::endl;
}

This fails with: Failed read: NotImplemented: Support for codec 'snappy' not built

I have already added snappy to the Conan project:

[requires]
snappy/1.2.0
arrow/15.0.0

[options]
arrow/*:parquet=True
arrow/*:with_boost=True
arrow/*:with_thrift=True
arrow/*:snappy=True

Even if the read succeeded, I would not know how to get the id and geometry values out of the table, and the documentation has been of little help.
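For reference, the Python snippet treats each geometry value as WKB bytes and reads the coordinates of a linestring from them. Below is a minimal sketch of that decoding step in plain C++, assuming the bytes of one geometry value have already been copied out of the table into a std::vector<std::uint8_t>, and assuming every geometry is a 2D LineString (WKB type 2). The function names read_uint, read_double and decode_wkb_linestring are hypothetical helpers written for this sketch; they are not part of Arrow or of any geometry library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <utility>
#include <vector>

static_assert(sizeof(double) == 8, "sketch assumes 64-bit IEEE-754 doubles");

// Hypothetical helper: read `width` bytes at `offset` as an unsigned integer,
// honoring the byte order announced in the WKB header.
inline std::uint64_t read_uint(const std::vector<std::uint8_t>& data,
                               std::size_t offset, std::size_t width,
                               bool little_endian) {
    if (width > data.size() || offset > data.size() - width) {
        throw std::runtime_error("WKB buffer too short");
    }
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        // Always consume the most significant byte first.
        const std::size_t idx = little_endian ? (width - 1 - i) : i;
        value = (value << 8) | data[offset + idx];
    }
    return value;
}

// Hypothetical helper: read one 8-byte double at `offset`.
inline double read_double(const std::vector<std::uint8_t>& data,
                          std::size_t offset, bool little_endian) {
    const std::uint64_t bits = read_uint(data, offset, 8, little_endian);
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Decode a 2D WKB LineString into (x, y) pairs.
// Layout: 1 byte order flag, uint32 geometry type, uint32 point count,
// then `count` pairs of doubles.
inline std::vector<std::pair<double, double>> decode_wkb_linestring(
        const std::vector<std::uint8_t>& wkb) {
    if (wkb.empty() || wkb[0] > 1) {
        throw std::runtime_error("invalid WKB byte order flag");
    }
    const bool little_endian = (wkb[0] == 1);
    std::size_t pos = 1;

    const auto type = static_cast<std::uint32_t>(read_uint(wkb, pos, 4, little_endian));
    pos += 4;
    if (type != 2) {
        throw std::runtime_error("not a 2D LineString");
    }

    const auto count = static_cast<std::uint32_t>(read_uint(wkb, pos, 4, little_endian));
    pos += 4;
    // Reject a declared point count that exceeds the bytes actually present.
    if (count > (wkb.size() - pos) / 16) {
        throw std::runtime_error("WKB point count exceeds buffer size");
    }

    std::vector<std::pair<double, double>> points;
    points.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i) {
        const double x = read_double(wkb, pos, little_endian);
        pos += 8;
        const double y = read_double(wkb, pos, little_endian);
        pos += 8;
        points.emplace_back(x, y);
    }
    return points;
}
```

Each returned pair is (x, y), which matches the (lon, lat) order unpacked in the Python loop. The sketch does not cover getting the bytes out of the table, filtering on the id column, or the UTM projection.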

Any input on what I should do?

@jakobvinkas jakobvinkas changed the title Reading the data in CPP Reading the data using C++ Apr 12, 2024
@jakobvinkas
Author

Could anyone point me in the right direction here?
