libgeoda: GeoDa Cpp Library Project

Xun Li edited this page Jun 13, 2019 · 7 revisions

libgeoda: GeoDa C++ library

by Xun Li

The GeoDa library (libgeoda) is a C++ library that provides the core spatial data analysis and modeling functionality of GeoDa, a desktop application that focuses more on the user interface. Its main purpose is to wrap GeoDa's C++ code into a library and expose GeoDa's functionality to C++ applications and to other programming languages via SWIG/Rcpp. Other research projects can then integrate GeoDa's latest spatial data analysis algorithms, which are fast thanks to the C++ implementation, no matter what programming language they use.

Architecture

The architecture is shown in the following figure. The major modules, such as spatial regression, spatial analysis, and clustering, will first be separated from the user interface code. The functionality of these modules will be exposed through public APIs. These APIs will finally be wrapped with SWIG for interfacing with other languages, such as Python or Java. (Note: we may need to use Rcpp to wrap libgeoda for R.)

I/O module

The I/O module is the entry point when calling libgeoda functions. We design this module to be flexible: it can use a statically linked GDAL library to read and write spatial datasets in different formats (see the discussion below), and it can also interoperate with existing geospatial libraries, such as GeoPandas/PySAL in Python, sf/rgdal in R, and GeoTools in Java.

1. Data I/O using libgdal

GeoDa uses GDAL/OGR to read and write vector data, which means GeoDa's public APIs take OGR geometries as input values. libgdal should therefore be embedded inside the GeoDa C++ library as well, so that libgeoda can both read and write the spatial datasets that libgdal supports and interface with GDAL/OGR objects from other C++ applications.

When building the GeoDa library, libgdal should be statically linked so that users do not need to install libgdal manually if it is not already present. The problem is that libgdal has many dependencies (e.g. for different drivers: sqlite, mysql, libgeos, libproj, etc.), so the GeoDa library needs a "minimal" libgdal with no dependencies or, if there are any, with the dependencies statically linked into libgdal to avoid installation or compatibility issues.

libgeoda
  |
  |_____ libgdal (static lib)
  |        |
  |        |_____ libgeos (static lib)
  |        |_____ libproj (static lib)
  |
  |______ boost (static lib)
  |______ ANN (static lib)
  |______ wxWidgets (non-gui, static lib)
  |______ CLAPACK/BLAS (static lib)

By default, the GeoDa library supports several popular vector file formats through its internal use of libgdal. These file formats include:

 ESRI Shapefile -vector- (rw+v): ESRI Shapefile
 MapInfo File -vector- (rw+v): MapInfo File
 CSV -vector- (rw+v): Comma Separated Value (.csv)
 GML -vector- (rw+v): Geography Markup Language (GML)
 GPX -vector- (rw+v): GPX
 KML -vector- (rw+v): Keyhole Markup Language (KML)
 GeoJSON -vector- (rw+v): GeoJSON
 TopoJSON -vector- (rov): TopoJSON
 OpenFileGDB -vector- (rov): ESRI FileGDB
 GFT -vector- (rw+): Google Fusion Tables
 CouchDB -vector- (rw+): CouchDB / GeoCouch
 Carto -vector- (rw+): Carto
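
The codes in parentheses follow the convention of GDAL's --formats listing: ro means read-only, rw means read and write, rw+ adds update support, and a trailing v marks support for virtual I/O. As a minimal sketch of how such a line can be taken apart (the function name parse_driver_line is hypothetical and not part of libgeoda or GDAL):

```python
import re

# Pattern for one line of the listing above, e.g.
# "GeoJSON -vector- (rw+v): GeoJSON"
_LINE = re.compile(
    r"^\s*(?P<name>.+?) -(?P<kind>\w+)- \((?P<caps>[a-z+]+)\): (?P<desc>.+)$"
)


def parse_driver_line(line):
    """Split one driver line into name, kind, description and capability flags."""
    m = _LINE.match(line)
    if m is None:
        raise ValueError("unrecognized driver line: %r" % line)
    caps = m.group("caps")
    return {
        "name": m.group("name"),
        "kind": m.group("kind"),
        "description": m.group("desc").strip(),
        "read": caps.startswith("r"),
        "write": caps.startswith("rw"),
        "update": caps.startswith("rw+"),
        "virtual_io": "v" in caps,
    }


info = parse_driver_line(" TopoJSON -vector- (rov): TopoJSON")
print(info["name"], info["write"], info["virtual_io"])
```

A check like this could be used to decide, before calling a write function, whether the target format is writable at all.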

The filename or URL (e.g. for Carto) of a data source can be passed directly as an input parameter to the functions of the GeoDa library. For example, when using the GeoDa library in Python:

import geoda

gda = geoda.read_file('/path/to/natregimes.shp')
w = gda.create_queen_weights(poly_id="fipsno")

2. Interoperation using WKB

libgeoda's I/O module is also designed to interface with existing libraries in different programming environments, such as GeoPandas/PySAL in Python, sf/rgdal in R, and GeoTools in Java.

Well-known text (WKT) is a text markup language for representing vector geometry objects on a map. Its binary equivalent, well-known binary (WKB), is used to transfer and store the same information in many popular databases, such as PostgreSQL with the PostGIS extension and SQLite with SpatiaLite.

WKB is also supported by all the geospatial libraries mentioned above. For example, sf in R uses WKB serialisations written in C++/Rcpp for fast I/O with GDAL and GEOS. Therefore, we use WKB to exchange geometry objects between libgeoda and other geospatial libraries.
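
To make the exchange format concrete, here is a minimal sketch of the WKB layout for a 2D point: one byte for the byte order (1 means little-endian), a 4-byte geometry type (1 means Point), and two 8-byte doubles for x and y. It uses only the Python standard library; the function names are hypothetical, and in practice the geometry libraries listed in this page produce and consume WKB themselves.

```python
import binascii
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB: byte order, geometry type, x, y."""
    # 1 = little-endian byte order; geometry type 1 = Point
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(data):
    """Decode a 2D WKB point, honoring the byte-order flag in the first byte."""
    endian = "<" if data[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", data[1:21])
    if geom_type != 1:
        raise ValueError("not a WKB Point (type %d)" % geom_type)
    return x, y


wkb = point_to_wkb(1.0, 2.0)
# Databases such as PostGIS often hand WKB over as a hex string
hex_wkb = binascii.hexlify(wkb).decode("ascii")
print(hex_wkb)
print(wkb_to_point(binascii.unhexlify(hex_wkb)))
```

Because the layout is fixed and language-neutral, the same bytes can be produced in Python, R, or Java and parsed on the C++ side without sharing any in-memory object model.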

The attribute (table) data are exchanged directly in memory between libgeoda and other programming languages. libgeoda uses STL vectors to store numeric or string values, and SWIG supports the conversion of data between C++ (libgeoda) and other programming languages.
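
On the calling side, this implies that each attribute field is handed over as one homogeneous list, which a wrapper can map to a vector of doubles or a vector of strings. The following is only a sketch of that preparation step under this assumption: to_typed_columns is a hypothetical helper, the sample rows are made-up illustration values, and libgeoda's actual SWIG typemaps are not shown on this page.

```python
def to_typed_columns(rows, field_names):
    """Turn row records into per-field lists that are homogeneous in type.

    A field becomes a list of floats if every value is numeric,
    otherwise a list of strings.
    """
    columns = {}
    for i, name in enumerate(field_names):
        values = [row[i] for row in rows]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numeric:
            columns[name] = [float(v) for v in values]
        else:
            columns[name] = [str(v) for v in values]
    return columns


# Made-up sample records: (fipsno, name, rate)
rows = [(1001, "Autauga", 4.2), (1003, "Baldwin", 3.1)]
cols = to_typed_columns(rows, ["fipsno", "name", "rate"])
print(cols["fipsno"])
print(cols["name"])
```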

Here is a list of geo-spatial libraries that libgeoda is designed to interface with:

Library name    Programming language
GeoPandas       Python
Shapely         Python
PySAL           Python
RGDAL           R
SF              R
GeoTools        Java
(py)GDAL        Python

Interfacing with GeoPandas

GeoPandas uses Shapely for its geometry column.

import geopandas
import geoda

# nat is a GeoDataFrame object
nat = geopandas.read_file('/path/to/natregimes.shp')
gda = geoda.read_geopandas(nat)

w = gda.create_queen_weights(poly_id="fipsno")

Interfacing with PySAL

import pysal
import geoda

shp = pysal.open('/path/to/natregimes.shp')
dbf = pysal.open('/path/to/natregimes.dbf')

gda = geoda.read_pysal(shp, dbf)

w = gda.create_queen_weights(poly_id="fipsno")

Interfacing with RGDAL in R

library(rgdal)
library(rgeoda)
nat <- readOGR('/path/to/natregimes.shp')

gda <- rgeoda(nat)

Interfacing with GeoTools in Java

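import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;
// GeoTools package names vary by release; adjust these imports to your version
import org.geotools.data.DataStore;
import org.geotools.data.DataStoreFinder;
import org.geotools.data.FeatureSource;
import org.geotools.feature.FeatureCollection;
import org.geotools.feature.FeatureIterator;
import org.opengis.feature.Feature;
import org.opengis.feature.GeometryAttribute;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.io.WKBReader;
import org.locationtech.jts.io.WKBWriter;
import io.github.GeoDa;

File file = new File("mayshapefile.shp");
Map<String, String> connect = new HashMap<>();
connect.put("url", file.toURI().toString());

DataStore dataStore = DataStoreFinder.getDataStore(connect);
String[] typeNames = dataStore.getTypeNames();
String typeName = typeNames[0];

FeatureSource featureSource = dataStore.getFeatureSource(typeName);
FeatureCollection collection = featureSource.getFeatures();
FeatureIterator iterator = collection.features();

// Hex-encoded WKB string of each feature's geometry
Vector<String> wkbStrings = new Vector<String>();
WKBWriter wkbWriter = new WKBWriter();
try {
  while (iterator.hasNext()) {
    Feature feature = iterator.next();
    GeometryAttribute attr = feature.getDefaultGeometryProperty();
    Geometry geom = (Geometry) attr.getValue();

    // Geometry to WKB string
    String wkb = WKBWriter.bytesToHex(wkbWriter.write(geom));
    // WKB string to Geometry
    //WKBReader wkbReader = new WKBReader();
    //Geometry geom2 = wkbReader.read(WKBReader.hexToBytes(wkb));
    wkbStrings.add(wkb);
  }
} finally {
  iterator.close();  // release the underlying data source
}

GeoDa gda = new GeoDa(wkbStrings);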

Interfacing with GDAL/OGR in Python

from osgeo import ogr
import geoda

# use OGR to read a layer from an ESRI Shapefile
daShapefile = 'data/natregimes.shp'
driver = ogr.GetDriverByName('ESRI Shapefile')
dataSource = driver.Open(daShapefile, 0)  # 0 opens the file read-only
layer = dataSource.GetLayer()
# featureCount = layer.GetFeatureCount()
# for feature in layer:
#   geom = feature.GetGeometryRef()
#   print(geom.Centroid().ExportToWkb())

gda = geoda.read_gdal(layer)
w = gda.create_queen_weights(poly_id="fipsno")

Interfacing with PostGIS using psycopg2 in Python

import psycopg2
from shapely import wkb
import geoda

conn = psycopg2.connect('...')
curs = conn.cursor()

shps = {}  # key: gid, value: Shapely geometry parsed from hex WKB
curs.execute('select gid, geom from natregimes;')
for gid, geom in curs:
    shps[gid] = wkb.loads(geom, hex=True)

# shp and dbf are undefined here; the geoda call that would accept the
# geometries collected in shps is not specified on this page
gda = geoda.read_pysal(shp, dbf)
w = gda.create_queen_weights(poly_id="fipsno")

Appendix

Setup on Mac OSX

Install libgdal using brew. The latest version of libgdal on brew is 2.4.1.

brew install gdal

Note: when installing GDAL for Python using pip, specify version 2.4.0. Otherwise, pip install GDAL will choose version 3.0, which is not compatible with the libgdal 2.4.1 installed by brew. Alternatively, you can manually compile and install GDAL 3.0.

export LDFLAGS=-L/usr/local/Cellar/gdal/2.4.1_1/lib
export CPPFLAGS=-I/usr/local/Cellar/gdal/2.4.1_1/include
pip3 install GDAL==2.4.0
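
The pairing above (bindings 2.4.0 with libgdal 2.4.1, but not bindings 3.0) suggests a simple rule: the Python bindings and libgdal should come from the same major.minor series. The sketch below encodes that rule as an assumption, not as an official compatibility guarantee; the function names are hypothetical. The installed library version can be obtained with gdal-config --version.

```python
def major_minor(version):
    """Return the (major, minor) pair of a dotted version string such as '2.4.1'."""
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def bindings_match_library(binding_version, library_version):
    """True when the Python bindings and libgdal share the same major.minor series."""
    return major_minor(binding_version) == major_minor(library_version)


print(bindings_match_library("2.4.0", "2.4.1"))
print(bindings_match_library("3.0.0", "2.4.1"))
```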

Build a minimal libgdal

Prerequisites:

libgeos

Version 3.7.2. Download link: https://github.com/libgeos/geos/archive/3.7.2.tar.gz

./autogen.sh
./configure --prefix=/media/psf/Home/Github/libgeoda/deps/geos/3.7.2/

libproj

Version 5.2.0. Download link: https://github.com/OSGeo/PROJ/archive/5.2.0.tar.gz

./autogen.sh
./configure --prefix=/media/psf/Home/Github/libgeoda/deps/proj/5.2.0/

libgdal has required C++11 since 2.4.0. To stay compatible with the existing GeoDa project, we use libgdal 2.2.4, which can be built without C++11 using the flag --without-cpp11.

Download link: https://github.com/OSGeo/gdal/archive/v2.2.4.tar.gz

For example:

./configure --prefix=/Users/xunli/Downloads/test \
            --without-cpp11 \
            --with-pg=no \
            --with-xml2=no \
            --without-mrf \
            --with-libz=internal \
            --with-jpeg=internal \
            --without-grib \
            --without-openjpeg \
            --with-libiconv-prefix="-L/usr/lib" \
            --without-ld-shared \
            CFLAGS="-Os -arch x86_64" CXXFLAGS="-Os -arch x86_64" LDFLAGS="-arch x86_64"

# osx
./configure --prefix=/Users/xunli/Downloads/test --without-cpp11 --with-pg=no --with-xml2=no --without-mrf --with-libz=internal --with-jpeg=internal --without-grib --without-openjpeg --with-libiconv-prefix="-L/usr/lib" --without-ld-shared CFLAGS="-Os -arch x86_64" CXXFLAGS="-Os -arch x86_64" LDFLAGS="-arch x86_64"


# bionic
./configure --without-cpp11 --with-pg=no --with-xml2=no --without-mrf --with-libz=internal --with-jpeg=internal --without-grib --without-openjpeg --with-libiconv-prefix="-L/usr/local/lib" --without-ld-shared

After running ./configure for libgdal, a GDALmake.opt file is created. In this file, the line starting with LIBS = shows how libgeos and libproj are linked (e.g. -lgeos_c).

In file GDALmake.opt, change the dynamic linking flags of libgeos and libproj to static linking flags. For example:

Old:
LIBS = $(SDE_LIB) -L/usr/local/opt/proj/lib -lproj -L/usr/local/Cellar/geos/3.7.2/lib -lgeos_c -lpthread -ldl

New:
LIBS = $(SDE_LIB) /usr/local/opt/sqlite3/lib/libsqlite3.a /usr/local/opt/proj/lib/libproj.a /usr/local/Cellar/geos/3.7.2/lib/libgeos.a /usr/local/Cellar/geos/3.7.2/lib/libgeos_c.a -L/usr/lib -liconv -lpthread -ldl
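
The core of this change is replacing a -L<dir> -l<name> pair with the full path of the static archive <dir>/lib<name>.a. A minimal sketch of that rewrite follows; make_static is a hypothetical helper, it keeps the -L options in place (they are harmless), and it does not add the extra archives (libsqlite3.a, libgeos.a, iconv) that the hand-edited line above also includes.

```python
def make_static(libs_value, static_names):
    """Replace '-l<name>' with '<dir>/lib<name>.a' for the chosen libraries.

    <dir> is taken from the most recent -L option before the -l option.
    All other tokens, including the -L options, are left untouched.
    """
    out = []
    last_dir = None  # directory from the most recent -L token
    for tok in libs_value.split():
        if tok.startswith("-L"):
            last_dir = tok[2:]
            out.append(tok)
        elif tok.startswith("-l") and tok[2:] in static_names and last_dir:
            out.append("%s/lib%s.a" % (last_dir.rstrip("/"), tok[2:]))
        else:
            out.append(tok)
    return " ".join(out)


old = ("$(SDE_LIB) -L/usr/local/opt/proj/lib -lproj "
       "-L/usr/local/Cellar/geos/3.7.2/lib -lgeos_c -lpthread -ldl")
print(make_static(old, {"proj", "geos_c"}))
```

System libraries such as pthread and dl are deliberately left as dynamic links, matching the hand-edited line.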

Ubuntu 18.04


sudo apt-get install build-essential autoconf libtool m4 automake

wget http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.13.tar.gz
tar -xf libiconv-1.13.tar.gz
cd libiconv-1.13/
./configure --enable-static
make && make install
cd ..

wget https://github.com/libgeos/geos/archive/3.7.2.tar.gz
tar -xf 3.7.2.tar.gz 
cd geos-3.7.2/
./autogen.sh
./configure
make && make install
cd ..

wget https://github.com/OSGeo/PROJ/archive/5.2.0.tar.gz
tar -xf 5.2.0.tar.gz 
cd PROJ-5.2.0/
./autogen.sh
./configure
make && make install
cd ..


wget https://github.com/OSGeo/gdal/archive/v2.2.4.tar.gz -O gdal-2.2.4.tar.gz
tar -xf gdal-2.2.4.tar.gz
cd gdal-2.2.4/gdal/
./configure --without-cpp11 --with-pg=no --with-xml2=no --without-mrf --with-libz=internal --without-grib --without-openjpeg --with-libiconv-prefix=/usr/local --without-ld-shared --with-static-proj4=/usr/local --without-jpeg12 --without-gif --without-jpeg --disable-shared --enable-static
make && make install

NOTE: on Ubuntu, gdal is configured with the -g flag by default, so the static library will be huge. Remove the -g flag manually to keep the static library small.
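
Where the -g flag ends up depends on the generated build files. As a sketch only, assuming it appears in lines of the form CFLAGS = ... or CXXFLAGS = ... (an assumption, not something this page confirms), a hypothetical helper could drop the bare -g option from such a line while leaving everything else alone:

```python
def strip_debug_from_line(line):
    """For 'CFLAGS = ...' or 'CXXFLAGS = ...' lines, drop the bare -g option.

    Any other line is returned unchanged.
    """
    name, sep, value = line.partition("=")
    if sep and name.strip() in ("CFLAGS", "CXXFLAGS"):
        kept = [tok for tok in value.split() if tok != "-g"]
        return "%s= %s" % (name, " ".join(kept))
    return line


print(strip_debug_from_line("CFLAGS = -g -O2"))
print(strip_debug_from_line("LIBS = -lpthread -ldl"))
```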

wxWidgets 3.1.2 non-ui build

./configure --with-cocoa \
            --disable-shared \
            --enable-monolithic \
            --disable-gui \
            --prefix=/Users/xunli/Downloads/test/wx \
            --with-macosx-version-min=10.13