3DEP Elevation Grid-to-line intersection

3DEP Elevation Grid-to-line intersection#

Introduction#

In this example, a set of NHD stream-segments accessed using the HyRiver package. A USGS 3DEP Elevation dataset is pulled in using the ClimateR-Catalog. Finally, gdptools is used to interpolate the 3DEP elevation to the NHD stream-segments.

Geo Data Processing Tools (gdptools)#

gdptools is a python package for calculating area-weighted statistics of grid-to-polygon. grid-to-line, and polygon-to-polygon interpolations. gdptools utilizes metadata from the OPeNDAP catalog to aid generalizing gdptools processing capabilities. There are also methods for calculating area-weighted statistics for any gridded dataset that can be opened by xarray, however some of the metadata encapsulated in the json entries will have to be added by the user. Examples can be found in Non-Catalog Examples.

Open Source Python Tools#

xarray: Package to work with labelled multi-dimensional arrays.
geopandas: Package to work with geospatial data.

Data#

3DEP: The USGS 3D Elevation Program (3DEP) Datasets are the primary elevation data product produced and distributed by the USGS.
NHDPlus_V2: National Hydrography Dataset Plus Version 2, a selection of flowline geometries from HUC 0708

Maintenance Schedule#

The intent of this notebook is simply a demonstration of an application or workflow. As we progress in the build-out of NGHF project tools, this notebook may be updated or replaced. There is no other planned maintenance at this time.

Description#

This workflow demonstrates the grid to line interpoltion functionality from the gdptools package. The purpose of this tool is to attribute gridded data to line geometries. This is done by interpolating points along each line geometry and extracting the underlaying gridded data values at those points. From these point values, statistics are calculated and attributed to the original input lines.

The inputs to the line interpolation tool are a Geopandas geodataframe containing the line geometries, and a Pandas dataframe containing information about the gridMET gridded data. The line interpolation tool can process multi-dimensional xarray data, outputing the statistics organized by each time segment. Also, multiple gridded datasets can be processed, so for a gridMET example, three different climate variables can be accessed and processed.

The outputs are a dataframe containing the statistics for each line geometries calculated for each gridded dataset at each time segment and a geodataframe containing the interpolated point geometries which have the linearly interpolated values from the gridded dataset. Depending on the user’s inputs, the sample points can be interpolated from the line geometry at regular intervals, or the line vertices can be used instead.

Create conda environment#

gdptools-examples can be downloaded from the project repo

import warnings

import geoviews as gv
import pandas as pd
import pynhd
from pynhd import NLDI, WaterData

from gdptools import InterpGen, UserTiffData

warnings.filterwarnings("ignore")

gv.extension("bokeh")

ⓘ

Create target lines#

The target lines will be the vector dataset to which GDPTools will attribute with gridded data values. The PyNHD package is used to access stream geometries upstream from the USGS stream gage 06730200.

nldi = NLDI()
station_id = "06730200"

flw_main = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamMain",
    source="flowlines",
    distance=1000,
)

nhdp_mr = WaterData("nhdflowline_network")
comids = [int(c) for c in flw_main.nhdplus_comid.to_list()]
nhd_mr_fl = nhdp_mr.byid("comid", comids)
# nhd_mr_fl.columns.to_list()
flw = pynhd.prepare_nhdplus(nhd_mr_fl, 0, 0, purge_non_dendritic=True) # .rename(columns={"comid": "ID"}) # , "tonode": "toID"
strm_segs = pynhd.topoogical_sort(flw, id_col="hydroseq", toid_col="dnhydroseq")

# copy the sorted comids into a list and remove the last element
# List will be used to sort ouput into topological order later
sort_ids = strm_segs[0][:-1]

# create a new dataframe with the sorted comids
mr_fl = nhd_mr_fl.set_index("hydroseq").loc[sort_ids]
mr_fl = mr_fl.reset_index()
mr_fl.head()

	hydroseq	geometry	comid	fdate	resolution	gnis_id	gnis_name	lengthkm	reachcode	flowdir	...	qc_12	vc_12	qe_12	ve_12	lakefract	surfarea	rareahload	rpuid	vpuid	enabled
0	550197285.0	MULTILINESTRING ((-105.63211 40.05203, -105.62...	2889852	2000-12-22T05:00:00Z	Medium	178467	North Boulder Creek	0.373	10190005000122	With Digitized	...	0.376	-9998.00000	0.376	-9998.00000	1.0	50667.042715	10.004544	10c	10L	1
1	550124544.0	MULTILINESTRING ((-105.62824 40.05363, -105.62...	2889186	1999-05-28T04:00:00Z	Medium	178467	North Boulder Creek	0.552	10190005000121	With Digitized	...	0.498	0.57306	0.498	0.57306	NaN	NaN	NaN	10c	10L	1
2	550099070.0	MULTILINESTRING ((-105.62207 40.05388, -105.62...	2889770	1999-05-28T04:00:00Z	Medium	178467	North Boulder Creek	0.478	10190005000120	With Digitized	...	0.592	-9998.00000	0.592	-9998.00000	1.0	52090.136727	6.691098	10c	10L	1
3	550084363.0	MULTILINESTRING ((-105.61719 40.05556, -105.61...	2889188	1999-05-28T04:00:00Z	Medium	178467	North Boulder Creek	0.396	10190005000119	With Digitized	...	0.659	0.62836	0.659	0.62836	NaN	NaN	NaN	10c	10L	1
4	550074675.0	MULTILINESTRING ((-105.61573 40.05258, -105.61...	2889772	1999-05-28T04:00:00Z	Medium	178467	North Boulder Creek	0.257	10190005000118	With Digitized	...	0.785	-9998.00000	0.785	-9998.00000	1.0	48345.911611	4.708286	10c	10L	1

5 rows × 138 columns

Plot the target lines#

gv.tile_sources.EsriTerrain * gv.Path(mr_fl, vdims="hydroseq").opts(
    color="hydroseq",
    cmap= "category20",
    width=800,
    height=450,
    line_width=5,
    show_legend=True,
    colorbar=True,
    title="NHDPlus MR Flowlines upstream of USGS Gage 06730200",
)

Define source data#

This is the gridded dataset from which GDPTools will be pulling values along the target lines.

import pprint
climater_cat = "https://github.com/mikejohnson51/climateR-catalogs/releases/download/June-2024/catalog.parquet"
cat = pd.read_parquet(climater_cat)

_id = "USGS 3DEP"
_asset = "10m CONUS DEM"
_vars = ["elevation"]
# create list of catalog parameters for each variable
cat_params = [
    cat.query("id == @_id & asset == @_asset & variable == @_vars").to_dict(orient="records")[0]
    for _var in _vars
]
# create dictionary of variable names and catalog parameters
cat_dict = dict(zip(_vars, cat_params))
pprint.pprint(cat_dict)

{'elevation': {'T_name': nan,
               'URL': '/vsicurl/https://prd-tnm.s3.amazonaws.com/StagedProducts/Elevation/13/TIFF/USGS_Seamless_DEM_13.vrt',
               'X1': -180.0005555560936,
               'X_name': nan,
               'Xn': 180.00000026239937,
               'Y1': -180.0005555560936,
               'Y_name': nan,
               'Yn': 72.0005555562949,
               'asset': '10m CONUS DEM',
               'crs': '+proj=longlat +datum=NAD83 +no_defs',
               'description': '10m CONUS DEM',
               'dim_order': nan,
               'duration': nan,
               'ensemble': nan,
               'id': 'USGS 3DEP',
               'interval': nan,
               'model': nan,
               'nT': nan,
               'ncols': 3888006.0,
               'nrows': 939612.0,
               'resX': 9.259259266022042e-05,
               'resY': 9.25925926599094e-05,
               'scenario': nan,
               'tiled': '',
               'toptobottom': None,
               'type': 'vrt',
               'units': 'meters',
               'variable': 'elevation',
               'varname': 'elevation'}}

Define `UserTiffData` object#

This data class ingests the data used for zonal stats processing of geotiffs. The inputs are:

(ds): the gridded dataset as an Xarray object
(proj_ds): the projection EPSG number for the gridded datset
(x_coord and y_coord): the name on the X and Y coords of the gridded dataset
(f_feature): the geodataframe containing the lines
(id_feature): the name of the ID column of the lines table (there can be no duplicates!)

user_data = UserTiffData(
    source_ds=ds_3dep,
    source_crs=4269,
    source_x_coord="x",
    source_y_coord="y",
    target_gdf=mr_fl,
    target_id="hydroseq",
    bname="band",
    band=1,
    source_var="tiff",
)

Define `InterpGen` object and run the interpolation#

The InterpGen class takes the gridded and vector datasets from the UserTiffData class and when the calc_interp function is called, it preforms the grid to line interpolation regularly spaced intervals. The inputs for the InterpGen class are:

(user_data): an instance of a UserData class
(pt_spacing): the spacing in meters inbetween the interpolation points
(stat): the statistics to calculate from the interpolated points
(interp_method): the Xarray dataset.interp() method for interpolating the sample points

interp_object1 = InterpGen(
    user_data,
    pt_spacing=100,
    stat='all',
    interp_method="linear"
)
stats, pts = interp_object1.calc_interp()

Plot the sub-setted 3DEP 10m DEM#

This is the gridded dataset from which the interpolated plots have pull their values.

ds_ss = user_data.get_source_subset(key="").to_dataset(name="elevation")

import hvplot.xarray
ds_ss.elevation.hvplot.quadmesh(
    x="x",
    y="y",
    geo=True,
    project=True,
    rasterize=True,
    cmap="viridis",
    width=600,
    height=300,
    aspect="equal",
    clabel="3DEP 10m DEM",
    xlabel="Longitude",
    ylabel="Latitude",
    title="3DEP 10m DEM",
)

Display the grided values for each of the interpolated points#

pts.head()

	pt	band	values	dist	hydroseq	geometry	varname
0	0	1	1560.632596	0.0	550025946.0	POINT (-105.18739 40.05048)	tiff
1	1	1	1560.127862	100.0	550025946.0	POINT (-105.18633 40.05041)	tiff
2	2	1	1560.555034	200.0	550025946.0	POINT (-105.18528 40.05026)	tiff
3	3	1	1559.350482	300.0	550025946.0	POINT (-105.18424 40.05007)	tiff
4	4	1	1558.909865	400.0	550025946.0	POINT (-105.18319 40.05005)	tiff

Display the statistics for each stream segment#

stats.head()

	hydroseq	varname	mean	median	std	min	max
0	550025946.0	tiff	1558.948776	1558.909865	1.177185	1556.995965	1560.632596
1	550026201.0	tiff	1562.467624	1562.329515	1.200751	1560.514595	1564.282919
2	550026469.0	tiff	1566.994821	1567.276414	1.405481	1564.551665	1569.605381
3	550026743.0	tiff	1570.061471	1569.926945	1.177150	1568.664712	1572.449691
4	550027025.0	tiff	1572.780278	1572.780278	0.221672	1572.558607	1573.001950

Plot the mean elevation by stream segment#

import geopandas as gpd
mr_fl = mr_fl.merge(stats, on="hydroseq", how='outer')
pmin = mr_fl["mean"].min()
pmax = mr_fl["mean"].max()

gv.tile_sources.EsriTerrain * gv.Path(
        mr_fl,
        vdims=["hydroseq", "mean", "median"]
    ).opts(
            width=800,
            height=450,
            aspect="equal",
            color="mean",
            cmap="viridis",
            line_width=5,
            colorbar=True,
            clim=(pmin, pmax),
            clabel="Mean Elevation (m)",
            title="Mean Elevation by Stream Segment",
            # tools=['hover']
        )

Plot the interpolated data as a scatter plot by comid#

import plotly.graph_objects as go
fig = go.Figure()
pts = pts.groupby("hydroseq")
for index in range(len(pts)):
    group = pts.get_group(sort_ids[index])
    fig.add_trace(go.Scatter(x=group["dist"], y=group["values"], mode='markers', marker=dict(size=5, color=group["values"], colorscale='Viridis', cmin=pmin, cmax=pmax), name=str(sort_ids[index])))
fig.update_layout(title="Stream-segment Interpolated Elevation 10m DEM", xaxis_title="Distance (m)", yaxis_title="Elevation (m)")
fig.show()

3DEP Elevation Grid-to-line intersection

Contents

3DEP Elevation Grid-to-line intersection#

Introduction#

Geo Data Processing Tools (gdptools)#

Open Source Python Tools#

Data#

Maintenance Schedule#

Description#

Create conda environment#

Create target lines#

Plot the target lines#

Define source data#

Load the source data#

Define `UserTiffData` object#

Define `InterpGen` object and run the interpolation#

Plot the sub-setted 3DEP 10m DEM#

Display the grided values for each of the interpolated points#

Display the statistics for each stream segment#

Plot the mean elevation by stream segment#

Plot the interpolated data as a scatter plot by comid#

3DEP Elevation Grid-to-line intersection

Contents

3DEP Elevation Grid-to-line intersection#

Introduction#

Geo Data Processing Tools (gdptools)#

Open Source Python Tools#

Data#

Maintenance Schedule#

Description#

Create conda environment#

Create target lines#

Plot the target lines#

Define source data#

Load the source data#

Define UserTiffData object#

Define InterpGen object and run the interpolation#

Plot the sub-setted 3DEP 10m DEM#

Display the grided values for each of the interpolated points#

Display the statistics for each stream segment#

Plot the mean elevation by stream segment#

Plot the interpolated data as a scatter plot by comid#

Define `UserTiffData` object#

Define `InterpGen` object and run the interpolation#