
R package hydrographr

The R package hydrographr provides a collection of R function wrappers for GDAL and GRASS GIS functions to work efficiently with the Hydrography90m dataset and spatial biodiversity data. Its functions process large raster and vector data directly on disk and in parallel, so that R's memory is not overloaded. This makes it possible to build scalable data processing and analysis workflows in R, even though the data are not processed within R itself. The functions currently implemented in hydrographr are listed below; additional functions are under development.

The package was described in a publication in Methods in Ecology and Evolution (doi.org/10.1111/2041-210X.14226). A detailed explanation and examples can be found on its website, and its source code is openly available on GitHub.

Development team: Afroditi Grigoropoulou, Marlene Schürz, Sami Domisch, Jaime García Márquez, Yusdiel Torres-Cambas, Thomas Tomiczek, Merret Buurman, Christoph Schürz, Vanessa Bremerich
Contact information: [email protected]
Bug reports and feature requests: github.com/glowabio/hydrographr/issues

Project funding: NFDI4Biodiversity (DFG), NFDI4Earth (DFG)

This work has been funded as a Use Case of NFDI4Biodiversity (DFG project number 442032008, nfdi4biodiversity.org) within the German National Research Data Infrastructure (NFDI, www.nfdi.de). In addition, this work has been funded by NFDI4Earth (DFG project no. 460036893, www.nfdi4earth.de).

Citation

Please cite the hydrographr package as follows:

Schürz, M., Grigoropoulou, A., Garcia Marquez, J.R., Torres-Cambas, Y., Tomiczek, T., Floury, M., Bremerich, V., Schürz, C., Amatulli, G., Grossart, H.-P., Domisch, S. (2023). hydrographr: an R package for scalable hydrographic data processing. Methods in Ecology and Evolution, 14, 2953–2963. doi.org/10.1111/2041-210X.14226

Please also cite the Hydrography90m dataset as follows:

Amatulli, G., Garcia Marquez, J.R., Sethi, T., Kiesel, J., Grigoropoulou, A., Üblacker, M., Shen, L., Domisch, S. (2022). Hydrography90m: A new high-resolution global hydrographic dataset. Earth System Science Data, 14(10), 4525–4550. doi.org/10.5194/essd-14-4525-2022


hydrographr functions

<style> tr:hover {background-color: #D5EEFF;} table, th, td { border: 1px solid lightgrey; padding: 1em; width: 100%; } td.icon { width: 5% } td.funame { width: 17% } td.descrip { width: 78% } </style>

Downloading

get_tile_id() Identifies the ID of the regular tile(s) of the Hydrography90m dataset in which the input points are located. The output IDs are required to download the data using the function download_tiles().
get_regional_unit_id() Identifies the ID of the regional unit(s) of the Hydrography90m dataset in which the input points are located. The output IDs are required to download the data using the function download_tiles().
download_tiles() Downloads data of the Hydrography90m dataset.
download_test_data() Downloads the test data of the Hydrography90m dataset, required to run the examples of the manual.

Processing

merge_tiles() Merges raster (.tif) or vector (.gpkg) files using the GDAL functions gdalbuildvrt and gdal_translate for raster files and ogrmerge.py and ogr2ogr for vector files.
crop_to_extent() Crops a raster (.tif) file to a polygon border line or to the extent of a bounding box using the GDAL function gdalwarp.
snap_to_network() Snaps points to the nearest stream segment within a defined radius, or within a defined radius and above a minimum flow accumulation, using the GRASS GIS function r.stream.snap.
snap_to_subc_segment() Snaps points to the stream segment of the sub-catchment in which the points are located using the GRASS GIS functions v.net and v.distance.
set_no_data() Sets a NoData value to input files using the GDAL function gdal_edit.py.
reclass_raster() Reclassifies an integer raster (.tif) layer using the GRASS GIS function r.reclass.
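
The two-step raster merge that merge_tiles() is described as wrapping (gdalbuildvrt followed by gdal_translate) can be sketched in base R as below. This is only an illustration of the underlying GDAL calls, not the hydrographr implementation: the helper names `build_merge_commands()` and `run_merge()`, the tile file names, and the `COMPRESS=DEFLATE` creation option are assumptions made for the example.

```r
# Hypothetical helper: assemble the two GDAL calls for merging raster tiles.
# This is NOT the hydrographr code, only a sketch of the idea behind it.
build_merge_commands <- function(tile_paths, out_tif) {
  stopifnot(length(tile_paths) >= 1, grepl("\\.tif$", out_tif))
  vrt <- sub("\\.tif$", ".vrt", out_tif)
  list(
    # Step 1: build a lightweight virtual mosaic that references the tiles
    vrt = list(cmd = "gdalbuildvrt", args = c(vrt, tile_paths)),
    # Step 2: write the mosaic to a compressed GeoTIFF on disk
    tif = list(cmd = "gdal_translate",
               args = c("-co", "COMPRESS=DEFLATE", vrt, out_tif))
  )
}

# Hypothetical runner: executes the commands only if GDAL is on the PATH
run_merge <- function(cmds) {
  have_gdal <- nzchar(Sys.which("gdalbuildvrt")) &&
    nzchar(Sys.which("gdal_translate"))
  if (!have_gdal) {
    message("GDAL not found on PATH; commands were not run.")
    return(FALSE)
  }
  for (step in cmds) system2(step$cmd, args = shQuote(step$args))
  TRUE
}

# Made-up tile names, used here only to show the assembled arguments
cmds <- build_merge_commands(
  c("sub_catchment_h10v08.tif", "sub_catchment_h10v10.tif"),
  "sub_catchment_merged.tif"
)
print(cmds$vrt$args)
print(cmds$tif$args)
```

Because both steps run as external GDAL processes, the raster values never have to be loaded into R; calling `run_merge(cmds)` would execute them once the listed tiles exist on disk.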

Reading & data extraction

extract_ids() Extracts the ID value of the drainage basin and/or sub-catchment raster layer at given point locations using the GDAL function gdallocationinfo.
report_no_data() Reports the NoData value of input files using the GDAL function gdalinfo.
extract_zonal_stat() Calculates zonal statistics based on one or more environmental variable raster (.tif) layers across a set (or all) of the sub-catchments in a spatial extent using the GRASS GIS function r.univar.
read_geopackage() Loads a .gpkg file, or only part of it, as a data.table, graph, sf, or SpatVector object.
get_upstream_catchment() Calculates the upstream basin taking each point as the outlet using the GRASS GIS function g.region.
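
To illustrate what "zonal statistics" means for extract_zonal_stat(), the following base R sketch summarises an environmental variable per sub-catchment ID on two tiny matrices. It only demonstrates the concept: the function `zonal_stat()` and all values are made up for this example, and the actual package delegates the computation to GRASS GIS on disk rather than working on in-memory matrices.

```r
# Toy 3 x 4 "rasters" stored as matrices (all values are made up)
subc <- matrix(c(1, 1, 2, 2,
                 1, 1, 2, 2,
                 3, 3, 3, NA), nrow = 3, byrow = TRUE)    # sub-catchment IDs
temp <- matrix(c(10, 12, 20, 22,
                 14, 16, 24, 26,
                 30, 32, 34, 99), nrow = 3, byrow = TRUE) # environmental variable

# Hypothetical helper: per-zone summary of `values`, grouped by `zones`
zonal_stat <- function(zones, values) {
  z <- as.vector(zones)
  v <- as.vector(values)
  keep <- !is.na(z) & !is.na(v)  # ignore NoData cells in either layer
  z <- z[keep]
  v <- v[keep]
  data.frame(
    subc_id = sort(unique(z)),
    n       = as.vector(tapply(v, z, length)),
    mean    = as.vector(tapply(v, z, mean)),
    min     = as.vector(tapply(v, z, min)),
    max     = as.vector(tapply(v, z, max))
  )
}

stats <- zonal_stat(subc, temp)
print(stats)
```

The cell with a NoData sub-catchment ID is excluded, so its value of 99 does not contribute to any zone.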

Distance related

get_distance() Calculates the Euclidean or within-network distance between points using the GRASS GIS functions v.distance or v.net.allpairs.
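
The difference between the two distance types can be shown on a toy network in base R. This sketch is not how get_distance() is implemented (the package calls GRASS GIS); the node table, coordinates, channel lengths, and the helpers `path_to_outlet()` and `network_distance()` are all assumptions for illustration, and the network is assumed to be a tree with a single outlet.

```r
# Toy stream network: each node drains to one downstream node (NA = outlet).
# Coordinates are in metres in a projected system; all values are made up.
net <- data.frame(
  id   = c("A", "B", "C", "D"),
  x    = c(0, 400, 200, 200),
  y    = c(300, 300, 0, -500),
  down = c("C", "C", "D", NA),
  len  = c(450, 420, 500, NA),  # channel length to the downstream node
  stringsAsFactors = FALSE
)

# Straight-line (Euclidean) distance between all pairs of nodes
euclid <- as.matrix(dist(net[, c("x", "y")]))
dimnames(euclid) <- list(net$id, net$id)

# Cumulative channel distance from a node to every node downstream of it
path_to_outlet <- function(id) {
  ids <- id
  d <- 0
  while (!is.na(net$down[net$id == id])) {
    d   <- c(d, d[length(d)] + net$len[net$id == id])
    id  <- net$down[net$id == id]
    ids <- c(ids, id)
  }
  names(d) <- ids
  d
}

# Within-network distance: travel down to the first shared node and back up
network_distance <- function(a, b) {
  pa <- path_to_outlet(a)
  pb <- path_to_outlet(b)
  common <- intersect(names(pa), names(pb))[1]
  unname(pa[common] + pb[common])
}

print(euclid["A", "B"])            # straight line between A and B
print(network_distance("A", "B"))  # along the channels via confluence C
```

Here A and B are 400 m apart in a straight line but 870 m apart along the channels, which is why the choice of distance type matters for analyses of aquatic organisms that disperse along the stream network.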

Graph-based connectivity analyses

get_segment_neighbours() Reports the up- and/or downstream stream segments that are connected to the input segments within a neighbour order. Provides the option to summarise attributes across these segments.
get_catchment_graph() Extracts the upstream, downstream, or entire catchment of the input stream segments from a network graph.
get_distance_graph() Calculates the network distance between all input sub-catchment IDs from node to node (outlet of the stream segment).
get_pfafstetter_basins() Delineates Pfafstetter sub-basins for the input stream network.
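
The idea behind the graph-based functions above, tracing which segments lie upstream or downstream of a given segment, can be sketched with a breadth-first search over an edge list in base R. This is a conceptual illustration only: the function `trace_segments()`, the edge table, and the segment IDs are hypothetical, and the package's own functions operate on a network graph object rather than on this simplified data frame.

```r
# Toy network as an edge list: segment `from` flows into segment `to`.
# Segment IDs are made up for illustration.
edges <- data.frame(
  from = c(1, 2, 3, 4, 5),
  to   = c(3, 3, 5, 5, 6)
)

# Hypothetical helper: collect all segments reachable from `start`
# in the chosen direction using a breadth-first search
trace_segments <- function(edges, start,
                           direction = c("upstream", "downstream")) {
  direction <- match.arg(direction)
  found <- numeric(0)
  queue <- start
  while (length(queue) > 0) {
    current <- queue[1]
    queue   <- queue[-1]
    nxt <- if (direction == "upstream") {
      edges$from[edges$to == current]   # segments draining into `current`
    } else {
      edges$to[edges$from == current]   # segment that `current` drains into
    }
    nxt   <- setdiff(nxt, found)        # skip segments already visited
    found <- c(found, nxt)
    queue <- c(queue, nxt)
  }
  sort(found)
}

upstream_of_5   <- trace_segments(edges, 5, "upstream")
downstream_of_1 <- trace_segments(edges, 1, "downstream")
print(upstream_of_5)
print(downstream_of_1)
```

Stopping the search after a fixed number of steps instead of exhausting the queue would give the segments within a given neighbour order, which is the notion used in the description of get_segment_neighbours().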