Incorporate NHDPlus HR catchment datasets into get_catchment_characteristics() #414

Open
mjcashman opened this issue Dec 9, 2024 · 2 comments


@mjcashman

A useful enhancement would be to let the reference_fabric argument accept "nhdplusHR" in addition to "nhdplusv2", so that get_catchment_characteristics() switches to the alternative dataset(s) produced for NHDPlusHR.
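To make the request concrete, here is a minimal base R sketch of how a reference_fabric value could select a per-fabric variable index. The function name `characteristics_index_url()` and both URLs are hypothetical placeholders, not part of nhdplusTools; only the two fabric names come from this issue.

```r
# Hypothetical helper: maps a reference fabric to the location of its
# variable index. The URLs are placeholders, not real endpoints.
characteristics_index_url <- function(reference_fabric = c("nhdplusv2", "nhdplusHR")) {
  # match.arg() falls back to the first choice and rejects unknown fabrics
  reference_fabric <- match.arg(reference_fabric)
  switch(reference_fabric,
    nhdplusv2 = "https://example.com/nhdplusv2/index.parquet",
    nhdplusHR = "https://example.com/nhdplushr/index.parquet"
  )
}

characteristics_index_url("nhdplusHR")
```

Keeping "nhdplusv2" as the first choice would leave existing calls unchanged.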

Gressler et al. (2024) have released a new dataset of catchment characteristics attributed and accumulated at NHDPlus High Resolution for the Chesapeake Bay watershed (HUCs 0205, 0206, 0207, and 0208).

These datasets are analogous to the NHDPlus medium-resolution datasets produced by Mike Wieczorek, which are the data backbone of the existing get_catchment_characteristics() function. They are also cloud-hosted on S3 within ScienceBase, in Parquet format, and come with data dictionary tables that can be used for variable querying and retrieval. The tables would need minor modification: combining them into a single master lookup table and adding S3 URLs. That should be fairly straightforward.
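As a rough illustration of that table preparation, the base R sketch below stacks several data dictionary tables into one lookup table and adds an S3 URL column. The theme names, column names (`id`, `description`), variable rows, and bucket prefix are all invented for the example; the real dictionaries may be structured differently.

```r
# Hypothetical data dictionary tables, one per theme (contents invented).
landcover <- data.frame(
  id = c("lc_forest", "lc_urban"),
  description = c("Percent forest", "Percent urban"),
  stringsAsFactors = FALSE
)
climate <- data.frame(
  id = "ppt_mean",
  description = "Mean annual precipitation",
  stringsAsFactors = FALSE
)
dictionaries <- list(landcover = landcover, climate = climate)

# Placeholder prefix; the real S3 location is not given in this issue.
s3_prefix <- "s3://example-bucket/nhdplushr"

# Tag each table with its theme and parquet URL, then stack them.
lookup <- do.call(rbind, lapply(names(dictionaries), function(theme) {
  d <- dictionaries[[theme]]
  d$theme <- theme
  d$s3_url <- paste0(s3_prefix, "/", theme, ".parquet")
  d
}))
rownames(lookup) <- NULL

lookup
```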

There are a few larger conceptual limitations to implementing this right now, which I'll describe below. Still, this is a good time to record the feature request.

  1. These HR data are only available within the Chesapeake Bay watershed, not for CONUS. Even if this enhancement were implemented, it would be a mismatch for medium-res data to cover CONUS while high-res data cover only one region. This might be less incongruous if/when more regional datasets become available at HR (although I do not know of existing plans for more regional high-res datasets at this point).
  2. There is currently no functionality in nhdplusTools or NLDI to navigate the NHDPlusHR, so there's no native discovery method for NHDPlusHR identifiers. This wouldn't be an issue for retrieving data if the end user already had identifiers for their locations of interest, but it would be a mismatch with the "full-service" set of tools nhdplusTools currently provides.
@dblodgett-usgs
Collaborator

Thanks for writing this up @mjcashman -- The plan has always been that we would eventually have NHDPlusHR and possibly other catchment sets that would work with this.

I'm excited that we might be there. I'd like to start chipping away at this functionality starting with #415 and with that, would be happy to add support for some attributes that link to NHDPlusHR. Regional datasets are fine -- I think we'll just want to indicate that certain variables are only available regionally in the index / metadata about them.
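One way the index could flag regional variables is with a coverage column, sketched below in base R. The column names (`reference_fabric`, `coverage`), the variable ids, and the helper `available_variables()` are assumptions for illustration only; the HUC list is the one given earlier in this issue.

```r
# Hypothetical variable index with a coverage column (rows invented).
index <- data.frame(
  id = c("var_a", "var_b"),
  reference_fabric = c("nhdplusv2", "nhdplusHR"),
  coverage = c("CONUS", "HUC4: 0205, 0206, 0207, 0208"),
  stringsAsFactors = FALSE
)

# Return the variables for one fabric and tell the user when some of
# them are only available regionally.
available_variables <- function(index, reference_fabric) {
  out <- index[index$reference_fabric == reference_fabric, , drop = FALSE]
  regional <- out$coverage != "CONUS"
  if (any(regional)) {
    message(sum(regional), " variable(s) are only available regionally; ",
            "see the coverage column.")
  }
  out
}

available_variables(index, "nhdplusHR")
```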

An open question is where to assemble the index of variables, now that we will have many sources of characteristics rather than the one big one we have for NHDPlusV2. I could see the index being checked in to the nhdplusTools repository or managed separately. Managing it separately may be wise, since hyRiver would logically also track this work.

@dblodgett-usgs
Collaborator

An initial goal here could be just getting the base NHDPlusHR attributes populated.

> names(hr$NHDFlowline)
 [1] "Permanent_Identifier"        "fdate"                       "resolution"                 
 [4] "gnis_id"                     "gnis_name"                   "LENGTHKM"                   
 [7] "REACHCODE"                   "flowdir"                     "wbarea_permanent_identifier"
[10] "FTYPE"                       "FCODE"                       "mainpath"                   
[13] "innetwork"                   "visibilityfilter"            "COMID"                      
[16] "VPUID"                       "Shape_Length"                "Enabled"                    
[19] "Shape"                       "streamleve"                  "StreamOrde"                 
[22] "StreamCalc"                  "FromNode"                    "ToNode"                     
[25] "Hydroseq"                    "LevelPathI"                  "Pathlength"                 
[28] "TerminalPa"                  "ArbolateSu"                  "Divergence"                 
[31] "StartFlag"                   "TerminalFl"                  "UpLevelPat"                 
[34] "UpHydroseq"                  "DnLevel"                     "DnLevelPat"                 
[37] "DnHydroseq"                  "DnMinorHyd"                  "dndraincou"                 
[40] "FromMeas"                    "ToMeas"                      "rtndiv"                     
[43] "thinner"                     "vpuin"                       "vpuout"                     
[46] "AreaSqKM"                    "TotDASqKM"                   "divdasqkm"                  
[49] "maxelevraw"                  "minelevraw"                  "maxelevsmo"                 
[52] "minelevsmo"                  "slope"                       "slopelenkm"                 
[55] "elevfixed"                   "hwtype"                      "hwnodesqkm"                 
[58] "statusflag"                 
> names(hr$NHDPlusCatchment)
[1] "COMID"        "sourcefc"     "gridcode"     "AreaSqKM"     "VPUID"        "SHAPE_Length"
[7] "SHAPE_Area"   "SHAPE"  

If we started with a parquet containing:

  • nhdplusid / COMID
  • Permanent_Identifier
  • reachcode
  • fdate
  • gnis_id, gnis_name
  • lengthkm
  • wbarea_permanent_identifier
  • ftype / fcode
  • vpuid
  • areasqkm / totdasqkm
  • minelev / maxelev
  • slope / slopelenkm
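A minimal base R sketch of pulling that column set out of a flowline table follows. The `flowline` data frame is a tiny stand-in for hr$NHDFlowline with invented values, and lower-casing the names is an assumption based on the list above. The list does not say whether minelev / maxelev means the raw or smoothed fields, so the sketch asks for both.

```r
# Stand-in for hr$NHDFlowline: a few of its columns, with invented values.
flowline <- data.frame(
  COMID = c(1, 2),
  Permanent_Identifier = c("a", "b"),
  REACHCODE = c("02050101000001", "02050101000002"),
  LENGTHKM = c(1.2, 0.4),
  StreamOrde = c(1L, 2L),
  slope = c(0.01, 0.002),
  stringsAsFactors = FALSE
)

# Columns from the proposed list, lower-cased. Both raw and smoothed
# elevation fields are requested because the list does not pick one.
wanted <- c(
  "comid", "permanent_identifier", "reachcode", "fdate",
  "gnis_id", "gnis_name", "lengthkm", "wbarea_permanent_identifier",
  "ftype", "fcode", "vpuid", "areasqkm", "totdasqkm",
  "minelevraw", "maxelevraw", "minelevsmo", "maxelevsmo",
  "slope", "slopelenkm"
)

# Normalize the mixed-case source names, then keep what is present.
names(flowline) <- tolower(names(flowline))
keep <- intersect(wanted, names(flowline))
missing_cols <- setdiff(wanted, names(flowline))
base_attributes <- flowline[, keep, drop = FALSE]

# With the arrow package installed, the result could be written with
# arrow::write_parquet(base_attributes, "nhdplushr_base.parquet")
# (the file name is a placeholder).
names(base_attributes)
```

Reporting `missing_cols` rather than failing makes it easy to see which requested attributes a given source table lacks.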

I could see doing a second file with a couple other things from MainstemsV3 such as:

  • mainstemid (where known)
  • outlet NHDPlusV2 COMID
  • inlet NHDPlusV2 COMID
  • id3dhp / id3dhp match type

Then, perhaps a "nice" NHDPlusHR network to facilitate network navigation could be added on -- but that's pretty far afield from the initial functionality of get_catchment_characteristics. Thoughts @mjcashman or others watching?
