Data Classes#

User data classes#

Prepare user data for weight generation.

class gdptools.data.user_data.ClimRCatData(*, cat_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str])#

Instance of UserData using Climate-R catalog data.

Parameters
  • cat_dict (dict[str, dict[str, Any]]) –

  • f_feature (Union[str, Path, GeoDataFrame]) –

  • id_feature (str) –

  • period (List[str]) –

__init__(*, cat_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str]) None#

Initialize ClimRCatData class.

This class wraps the climateR-catalogs developed by Mike Johnson, available at mikejohnson51/climateR-catalogs.

The catalog can be queried in pandas to return the dictionary associated with a specific source and variable. The cat_dict argument maps each variable name to the corresponding climateR-catalog dictionary for that variable.

Parameters
  • cat_dict (dict[str, dict[str, Any]]) – Parameter metadata from climateR-catalog.

  • f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or any path-like object that can be read by geopandas.read_file().

  • id_feature (str) – Header in GeoDataFrame to use as index for weights.

  • period (List[str]) – List containing date strings defining start and end time slice for aggregation.

Raises

KeyError – Raises error if id_feature not in f_feature columns.

Return type

None

Example

# Example of using climateR-catalog to prep cat_dict parameter
>>> import pandas as pd
>>> cat_url = "https://mikejohnson51.github.io/climateR-catalogs/catalog.json"
>>> cat = pd.read_json(cat_url)
>>> _id = "terraclim"
>>> cat_vars = ["aet", "pet", "PDSI"]
>>> cat_params = [
... cat.query("id == @_id & variable == @_var")
... .to_dict(orient="records")[0]
... for _var in cat_vars
... ]
>>> cat_dict = dict(zip(cat_vars, cat_params))
>>> cat_dict.get("aet")
{'id': 'terraclim',
 'asset': 'agg_terraclimate_aet_1958_CurrentYear_GLOBE',
 'URL': 'http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_terraclimate_aet_1958_CurrentYear_GLOBE.nc',
 'type': 'opendap',
 'varname': 'aet',
 'variable': 'aet',
 'description': 'water_evaporation_amount',
 'units': 'mm',
 'model': nan,
 'ensemble': nan,
 'scenario': 'total',
 'T_name': 'time',
 'duration': '1958-01-01/2021-12-01',
 'interval': '1 months',
 'nT': 768.0,
 'X_name': 'lon',
 'Y_name': 'lat',
 'X1': -179.9792,
 'Xn': 179.9792,
 'Y1': 89.9792,
 'Yn': -89.9792,
 'resX': 0.0417,
 'resY': 0.0417,
 'ncols': 8640.0,
 'nrows': 4320.0,
 'crs': '+proj=longlat +a=6378137 +f=0.00335281066474748 +pm=0 +no_defs',
 'toptobottom': 0.0,
 'tiled': '
}
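The same selection can be sketched without pandas or network access, which may help when checking the expected shape of cat_dict. The records below are small hypothetical stand-ins for catalog rows (only a few fields are shown, and the second source is invented), and `build_cat_dict` is an illustrative helper, not part of gdptools.

```python
from typing import Any

# Hypothetical, truncated stand-ins for rows of the climateR catalog.
catalog: list[dict[str, Any]] = [
    {"id": "terraclim", "variable": "aet", "varname": "aet", "units": "mm"},
    {"id": "terraclim", "variable": "pet", "varname": "pet", "units": "mm"},
    {"id": "other_source", "variable": "aet", "varname": "aet", "units": "mm"},
]


def build_cat_dict(
    records: list[dict[str, Any]], source_id: str, variables: list[str]
) -> dict[str, dict[str, Any]]:
    """Map each variable name to the first record matching source and variable."""
    cat_dict: dict[str, dict[str, Any]] = {}
    for var in variables:
        matches = [
            r for r in records if r["id"] == source_id and r["variable"] == var
        ]
        if not matches:
            raise KeyError(f"no catalog entry for id={source_id!r}, variable={var!r}")
        cat_dict[var] = matches[0]
    return cat_dict


cat_dict = build_cat_dict(catalog, "terraclim", ["aet", "pet"])
print(sorted(cat_dict))
```

Raising KeyError for a missing variable is a choice made for this sketch; it surfaces a misspelled variable name before any data is requested.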
get_class_type() str#

Return the type of the data class.

Return type

str

get_feature_id() str#

Return id_feature.

Return type

str

get_source_subset(key: str) DataArray#

Get subset of source data by key.

Parameters

key (str) – Name of the variable to subset.

Returns

Subset of the source data for the given variable.

Return type

xr.DataArray

get_vars() list[str]#

Return list of param_dict keys, proxy for varnames.

Return type

list[str]

prep_agg_data(key: str) AggData#

Prepare ClimRCatData data for aggregation methods.

Parameters

key (str) – Name of the variable to prepare.

Returns

An instance of the AggData class

Return type

AggData

prep_interp_data(key: str, poly_id: int) AggData#

Prep AggData from ClimRCatData.

Parameters
  • key (str) – Name of the xarray gridded data variable

  • poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to

Returns

An instance of the AggData class

Return type

AggData

prep_wght_data() WeightData#

Prepare and return WeightData for weight generation.

Return type

WeightData

class gdptools.data.user_data.ODAPCatData(*, param_dict: dict[str, dict[str, Any]], grid_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str])#

Instance of UserData using OPeNDAP catalog data.

Parameters
  • param_dict (dict[str, dict[str, Any]]) –

  • grid_dict (dict[str, dict[str, Any]]) –

  • f_feature (Union[str, Path, GeoDataFrame]) –

  • id_feature (str) –

  • period (List[str]) –

__init__(*, param_dict: dict[str, dict[str, Any]], grid_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str]) None#

Initialize ODAPCatData class.

This class uses the parameter and grid JSON catalogs developed by Mike Johnson, available here:

Param: <https://mikejohnson51.github.io/opendap.catalog/cat_params.json>

Grids: <https://mikejohnson51.github.io/opendap.catalog/cat_grids.json>

These can be queried in pandas to return the dictionary associated with a specific source and variable, as in the OPeNDAP Catalog examples. The param_dict and grid_dict arguments each map a variable name to the corresponding param or grid JSON record from the OPeNDAP Catalog.
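As a rough illustration of how the two dictionaries relate, the sketch below builds both from in-memory records. The records are hypothetical and truncated, `build_dicts` is not a gdptools function, and joining a parameter to its grid through `grid_id` is an assumption based on that field appearing in both catalogs.

```python
from typing import Any

# Hypothetical, truncated stand-ins for rows of cat_params.json and cat_grids.json.
params: list[dict[str, Any]] = [
    {"id": "example_source", "variable": "tmax", "varname": "tmax", "grid_id": 7},
    {"id": "example_source", "variable": "tmin", "varname": "tmin", "grid_id": 7},
]
grids: list[dict[str, Any]] = [
    {"grid_id": 7, "X_name": "lon", "Y_name": "lat"},
    {"grid_id": 8, "X_name": "x", "Y_name": "y"},
]


def build_dicts(source_id: str, variables: list[str]):
    """Return (param_dict, grid_dict), both keyed by variable name."""
    grid_by_id = {g["grid_id"]: g for g in grids}
    param_dict: dict[str, dict[str, Any]] = {}
    grid_dict: dict[str, dict[str, Any]] = {}
    for var in variables:
        # Assumption: (id, variable) identifies one parameter record.
        param = next(
            p for p in params if p["id"] == source_id and p["variable"] == var
        )
        param_dict[var] = param
        # Assumption: a parameter record points at its grid through grid_id.
        grid_dict[var] = grid_by_id[param["grid_id"]]
    return param_dict, grid_dict


param_dict, grid_dict = build_dicts("example_source", ["tmax", "tmin"])
print(grid_dict["tmax"]["X_name"])
```

Both dictionaries end up with the same keys, which matches the description of each being keyed by variable name.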

Parameters
  • param_dict (dict[str, dict]) – Parameter metadata from OPeNDAP catalog.

  • grid_dict (dict[str, dict]) – Grid metadata from OPeNDAP catalog.

  • f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or any path-like object that can be read by geopandas.read_file().

  • id_feature (str) – Header in GeoDataFrame to use as index for weights.

  • period (List[str]) – List containing date strings defining start and end time slice for aggregation.

Raises

KeyError – Raises error if id_feature not in f_feature columns.

Return type

None
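The KeyError condition above can be pictured with a minimal check. This is a sketch only: `check_id_feature` and the column names are hypothetical, and the actual check inside gdptools may be written differently.

```python
def check_id_feature(columns: list[str], id_feature: str) -> None:
    """Raise KeyError when id_feature is not one of the feature columns."""
    if id_feature not in columns:
        raise KeyError(
            f"id_feature {id_feature!r} not found in f_feature columns {columns}"
        )


# With a GeoDataFrame, the column names could be taken from list(gdf.columns).
columns = ["huc12", "name", "geometry"]  # hypothetical column names
check_id_feature(columns, "huc12")  # passes silently

try:
    check_id_feature(columns, "missing_id")
except KeyError as err:
    message = str(err)
    print(message)
```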

get_class_type() str#

Return the type of the data class.

Return type

str

get_feature_id() str#

Return id_feature.

Return type

str

get_source_subset(key: str) DataArray#

Get data subset from source by key.

Parameters

key (str) – Name of the variable to subset.

Returns

Subset of the source data for the given variable.

Return type

xr.DataArray

get_vars() list[str]#

Return list of param_dict keys, proxy for varnames.

Return type

list[str]

prep_agg_data(key: str) AggData#

Prepare ODAPCatData data for aggregation methods.

Parameters

key (str) – Name of the variable to prepare.

Returns

An instance of the AggData class

Return type

AggData

prep_interp_data(key: str, poly_id: int) AggData#

Prep AggData from ODAPCatData.

Parameters
  • key (str) – Name of the xarray gridded data variable

  • poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to

Returns

An instance of the AggData class

Return type

AggData

prep_wght_data() WeightData#

Prepare and return WeightData for weight generation.

Return type

WeightData

class gdptools.data.user_data.UserCatData(*, ds: Union[str, Dataset], proj_ds: Any, x_coord: str, y_coord: str, t_coord: str, var: Union[str, List[str]], f_feature: Union[str, Path, GeoDataFrame], proj_feature: Any, id_feature: str, period: List[str])#

Instance of UserData using minimum input variables to map to ODAPCatData.

Parameters
  • ds (Union[str, Dataset]) –

  • proj_ds (Any) –

  • x_coord (str) –

  • y_coord (str) –

  • t_coord (str) –

  • var (Union[str, List[str]]) –

  • f_feature (Union[str, Path, GeoDataFrame]) –

  • proj_feature (Any) –

  • id_feature (str) –

  • period (List[str]) –

__init__(*, ds: Union[str, Dataset], proj_ds: Any, x_coord: str, y_coord: str, t_coord: str, var: Union[str, List[str]], f_feature: Union[str, Path, GeoDataFrame], proj_feature: Any, id_feature: str, period: List[str]) None#

Initialize UserCatData, which contains data preparation methods based on UserData.

Parameters
  • ds (Union[str, Path, xr.Dataset]) – Xarray Dataset or str, URL or Path object that can be read by xarray.

  • proj_ds (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for ds

  • x_coord (str) – String of x coordinate name in ds

  • y_coord (str) – String of y coordinate name in ds

  • t_coord (str) – String of time coordinate name in ds

  • var (Union[str, List[str]]) – List of variables to be used in aggregation. They must be present in ds.

  • f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or str, URL or Path object that can be read by geopandas.

  • proj_feature (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for f_feature

  • id_feature (str) – String of id column name in f_feature.

  • period (List[str]) – List of two strings that define the start and end of the period to be used in aggregation. Each string may take the form ‘YYYY-MM-DD’ or ‘YYYY-MM-DD HH:MM:SS’, depending on the format of the time coordinate in ds.
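A period list in either of the two formats can be sanity-checked before it is passed in. The helper below is a hypothetical sketch using only the standard library; requiring the start to be no later than the end is an assumption of this example, not documented gdptools behavior.

```python
from datetime import datetime

# The two layouts named in the period description.
FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")


def parse_period(period: list[str]) -> tuple[datetime, datetime]:
    """Parse a two-element period list into (start, end) datetimes."""
    if len(period) != 2:
        raise ValueError("period must contain exactly two date strings")
    parsed = []
    for text in period:
        for fmt in FORMATS:
            try:
                parsed.append(datetime.strptime(text, fmt))
                break
            except ValueError:
                continue
        else:
            # No format matched this string.
            raise ValueError(f"unrecognized date string: {text!r}")
    start, end = parsed
    if start > end:
        # Assumption of this sketch: a reversed period is treated as an error.
        raise ValueError("period start must not be after period end")
    return start, end


start, end = parse_period(["1980-01-01", "1980-12-31 12:00:00"])
print(start.isoformat(), end.isoformat())
```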

Raises

KeyError – Raises error if id_feature not in f_feature columns.

Return type

None

get_class_type() str#

Return the type of the data class.

Return type

str

get_feature_id() str#

Return id_feature.

Return type

str

get_source_subset(key: str) DataArray#

Get source subset by key.

Parameters

key (str) – Name of the variable to subset.

Returns

Subset of the source data for the given variable.

Return type

xr.DataArray

get_vars() list[str]#

Return list of vars in data.

Return type

list[str]

prep_agg_data(key: str) AggData#

Prep AggData from UserData.

Parameters

key (str) –

Return type

AggData

prep_interp_data(key: str, poly_id: int) AggData#

Prep AggData from UserCatData.

Parameters
  • key (str) – Name of the xarray gridded data variable

  • poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to

Returns

An instance of the AggData class

Return type

AggData

prep_wght_data() WeightData#

Prepare and return WeightData for weight generation.

Return type

WeightData

class gdptools.data.user_data.UserData#

Prepare data for different sources for weight generation.

abstract __init__() None#

Init class.

Return type

None

abstract get_class_type() str#

Abstract method for returning the type of the data class.

Return type

str

abstract get_feature_id() str#

Abstract method for returning the id_feature parameter.

Return type

str

abstract get_source_subset(key: str) DataArray#

Abstract method for getting subset of source data.

Parameters

key (str) –

Return type

DataArray

abstract get_vars() list[str]#

Return a list of variables.

Return type

list[str]

abstract prep_agg_data(key: str) AggData#

Abstract method for preparing data for aggregation.

Parameters

key (str) –

Return type

AggData

abstract prep_interp_data(key: str, poly_id: int) AggData#

Abstract method for preparing data for interpolation.

Parameters
  • key (str) –

  • poly_id (int) –

Return type

AggData

abstract prep_wght_data() WeightData#

Abstract interface for generating weight data.

Return type

WeightData
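The abstract interface above follows the usual abstract-base-class pattern. The sketch below is a reduced, hypothetical stand-in (only three of the methods, with plain lists in place of DataArray) meant to show how a concrete class satisfies such an interface; it is not the gdptools implementation.

```python
from abc import ABC, abstractmethod


class SketchUserData(ABC):
    """Hypothetical, reduced stand-in for the UserData interface."""

    @abstractmethod
    def get_feature_id(self) -> str:
        """Return the id_feature parameter."""

    @abstractmethod
    def get_vars(self) -> list[str]:
        """Return a list of variables."""

    @abstractmethod
    def get_source_subset(self, key: str) -> list[float]:
        """Return the subset of source data for one variable."""


class InMemoryData(SketchUserData):
    """Toy concrete class backed by a dict of variable name -> values."""

    def __init__(self, data: dict[str, list[float]], id_feature: str) -> None:
        self._data = data
        self._id_feature = id_feature

    def get_feature_id(self) -> str:
        return self._id_feature

    def get_vars(self) -> list[str]:
        return list(self._data)

    def get_source_subset(self, key: str) -> list[float]:
        return self._data[key]


user_data = InMemoryData({"aet": [1.0, 2.0], "pet": [3.0]}, id_feature="huc12")
print(user_data.get_vars(), user_data.get_source_subset("aet"))
```

Because every method is abstract, the base class cannot be instantiated directly; only subclasses that implement all of the methods can.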

class gdptools.data.user_data.UserTiffData(var: str, ds: Union[str, DataArray, Dataset], proj_ds: Any, x_coord: str, y_coord: str, bname: str, band: int, f_feature: Union[str, Path, GeoDataFrame], id_feature: str, proj_feature: Any)#

Instance of UserData for zonal stats processing of geotiffs.

Parameters
  • var (str) –

  • ds (Union[str, DataArray, Dataset]) –

  • proj_ds (Any) –

  • x_coord (str) –

  • y_coord (str) –

  • bname (str) –

  • band (int) –

  • f_feature (Union[str, Path, GeoDataFrame]) –

  • id_feature (str) –

  • proj_feature (Any) –

__init__(var: str, ds: Union[str, DataArray, Dataset], proj_ds: Any, x_coord: str, y_coord: str, bname: str, band: int, f_feature: Union[str, Path, GeoDataFrame], id_feature: str, proj_feature: Any) None#

Initialize UserTiffData.

UserTiffData is a structure that aids calculating zonal stats.

Parameters
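  • var (str) – Name of the variable to process.

  • ds (Union[str, xr.Dataset]) – xarray object holding the gridded data, or a str path to it.

  • proj_ds (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for ds.

  • x_coord (str) – String of x coordinate name in ds.

  • y_coord (str) – String of y coordinate name in ds.

  • bname (str) – Name of the band coordinate in ds.

  • band (int) – Band number to use.

  • f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or any path-like object that can be read by geopandas.

  • id_feature (str) – String of id column name in f_feature.

  • proj_feature (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for f_feature.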

Return type

None

get_class_type() str#

Return the type of the data class.

Return type

str

get_feature_id() str#

Get Feature id.

Return type

str

get_source_subset(key: str) DataArray#

Get subset of source data.

Parameters

key (str) – Name of the variable to subset.

Returns

Subset of the source data for the given variable.

Return type

xr.DataArray

get_vars() list[str]#

Return list of varnames.

Return type

list[str]

prep_agg_data(key: str) AggData#

Prepare data for aggregation or zonal stats.

Parameters

key (str) –

Return type

AggData

prep_interp_data(key: str, poly_id: int) AggData#

Prep AggData from UserTiffData.

Parameters
  • key (str) – Name of the xarray gridded data variable

  • poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to

Returns

An instance of the AggData class

Return type

AggData

prep_wght_data() WeightData#

Prepare data for weight generation.

Return type

WeightData

Data passed to WeightGen class#

Data classes used in aggregation.

class gdptools.data.weight_gen_data.WeightData(feature: GeoDataFrame, id_feature: str, grid_cells: GeoDataFrame)#

Simple dataclass for transferring prepared user data to the CalcWeightEngine.

Parameters
  • feature (GeoDataFrame) –

  • id_feature (str) –

  • grid_cells (GeoDataFrame) –

Data passed to AggGen class#

class gdptools.data.agg_gen_data.AggData(variable: str, cat_param: CatParams, cat_grid: CatGrids, da: DataArray, feature: GeoDataFrame, id_feature: str, period: List[str])#

AggData is a convenience container of data necessary for aggregation.

Data provided in one of the UserData-derived classes is prepped for aggregation, including subsetting the gridded data by the feature’s bounding box and by the selected time period. In addition, if the gridded data is defined between 0 and 360 degrees longitude, it is shifted to -180 to 180 degrees. For each variable in the user_data attribute of either the WeightGen or AggGen classes, a dict of {var: AggData} is generated in the AggGen.calculate_agg() method.

Parameters
  • variable (str) – Variable name.

  • cat_param (CatParams) – CatParams data class containing parameter metadata.

  • cat_grid (CatGrids) – CatGrids data class containing grid metadata.

  • da (DataArray) – The spatially and temporally subsetted variable DataArray.

  • feature (GeoDataFrame) – The user-supplied feature file represented as a GeoDataFrame.

  • id_feature (str) – The feature id (column header) in the GeoDataFrame.

  • period (List[str]) – A list of dates representing the starting and ending date to process.
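The longitude shift mentioned in the class description can be illustrated with plain arithmetic. This is a sketch of the general technique only; `shift_longitude` is a hypothetical helper, and how gdptools performs the shift internally is not shown in this page.

```python
def shift_longitude(lon: float) -> float:
    """Map a longitude in [0, 360) onto [-180, 180)."""
    return ((lon + 180.0) % 360.0) - 180.0


lons_0_360 = [0.0, 90.0, 180.0, 270.0, 359.5]
shifted = [shift_longitude(lon) for lon in lons_0_360]
print(shifted)

# After shifting, values are no longer in increasing order, so a gridded
# coordinate would normally be re-sorted before any bounding-box selection.
print(sorted(shifted))
```

With xarray, the same arithmetic is commonly applied to a coordinate through `assign_coords` followed by `sortby`; that pairing is mentioned here as a general idiom rather than as a description of gdptools internals.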

OPeNDAP Catalog data classes#

OPeNDAP Catalog data classes.

class gdptools.data.odap_cat_data.CatClimRItem(*, id: str, asset: Optional[str] = None, URL: str, varname: str, variable: Optional[str] = None, description: Optional[str] = None, units: Optional[str] = None, model: Optional[str] = None, ensemble: Optional[str] = None, scenario: Optional[str] = None, T_name: Optional[str] = None, duration: Optional[str] = None, interval: Optional[str] = None, nT: Optional[int] = 0, X_name: str, Y_name: str, X1: float, Xn: float, Y1: float, Yn: float, resX: float, resY: float, ncols: int, nrows: int, crs: str, toptobottom: str, tiled: Optional[str] = None)#

Mike Johnson’s CatClimRItem class.

Source data from which this is derived comes from:

https://mikejohnson51.github.io/climateR-catalogs/catalog.json

Parameters
  • id (str) –

  • asset (Optional[str]) –

  • URL (str) –

  • varname (str) –

  • variable (Optional[str]) –

  • description (Optional[str]) –

  • units (Optional[str]) –

  • model (Optional[str]) –

  • ensemble (Optional[str]) –

  • scenario (Optional[str]) –

  • T_name (Optional[str]) –

  • duration (Optional[str]) –

  • interval (Optional[str]) –

  • nT (Optional[int]) –

  • X_name (str) –

  • Y_name (str) –

  • X1 (float) –

  • Xn (float) –

  • Y1 (float) –

  • Yn (float) –

  • resX (float) –

  • resY (float) –

  • ncols (int) –

  • nrows (int) –

  • crs (str) –

  • toptobottom (str) –

  • tiled (Optional[str]) –

class Config#

Interior class to direct pydantic’s behavior.

class gdptools.data.odap_cat_data.CatGrids(*, grid_id: Optional[int] = None, X_name: str, Y_name: str, X1: Optional[float] = None, Xn: Optional[float] = None, Y1: Optional[float] = None, Yn: Optional[float] = None, resX: Optional[float] = None, resY: Optional[float] = None, ncols: Optional[int] = None, nrows: Optional[int] = None, proj: str, toptobottom: int, tile: Optional[str] = None, **extra_data: Any)#

Class representing elements of Mike Johnson’s OPeNDAP catalog grids.

https://mikejohnson51.github.io/opendap.catalog/cat_grids.json

Parameters
  • grid_id (Optional[int]) –

  • X_name (str) –

  • Y_name (str) –

  • X1 (Optional[float]) –

  • Xn (Optional[float]) –

  • Y1 (Optional[float]) –

  • Yn (Optional[float]) –

  • resX (Optional[float]) –

  • resY (Optional[float]) –

  • ncols (Optional[int]) –

  • nrows (Optional[int]) –

  • proj (str) –

  • toptobottom (int) –

  • tile (Optional[str]) –

  • extra_data (Any) –

classmethod get_toptobottom(v: int) int#

Convert str to int.

Parameters

v (int) –

Return type

int

class gdptools.data.odap_cat_data.CatParams(*, id: Optional[str] = None, URL: str, grid_id: Optional[int] = -1, variable: Optional[str] = None, varname: str, long_name: Optional[str], T_name: Optional[str], duration: Optional[str] = None, units: Optional[str], interval: Optional[str] = None, nT: Optional[int] = 0, tiled: Optional[str] = None, model: Optional[str] = None, ensemble: Optional[str] = None, scenario: Optional[str] = None)#

Class representing elements of Mike Johnson’s OPeNDAP catalog params.

https://mikejohnson51.github.io/opendap.catalog/cat_params.json

Parameters
  • id (Optional[str]) –

  • URL (str) –

  • grid_id (Optional[int]) –

  • variable (Optional[str]) –

  • varname (str) –

  • long_name (Optional[str]) –

  • T_name (Optional[str]) –

  • duration (Optional[str]) –

  • units (Optional[str]) –

  • interval (Optional[str]) –

  • nT (Optional[int]) –

  • tiled (Optional[str]) –

  • model (Optional[str]) –

  • ensemble (Optional[str]) –

  • scenario (Optional[str]) –

classmethod set_grid_id(v: int) int#

Convert to int.

Parameters

v (int) –

Return type

int

classmethod set_nt(v: int) int#

Convert to int.

Parameters

v (int) –

Return type

int
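Catalog records read through pandas often carry integers as floats (for example `'nT': 768.0`) or as NaN when missing, which is presumably why conversion helpers like the two above exist. The function below is a hypothetical sketch of that kind of coercion; falling back to a default for None or NaN is an assumption, with the defaults 0 and -1 taken from the CatParams signature.

```python
import math
from typing import Any


def to_int_or_default(value: Any, default: int) -> int:
    """Coerce a catalog value to int, using default for None or NaN."""
    if value is None:
        return default
    number = float(value)  # accepts ints, floats, and numeric strings
    if math.isnan(number):
        return default
    return int(number)


n_t = to_int_or_default(768.0, default=0)
grid_id = to_int_or_default(float("nan"), default=-1)
print(n_t, grid_id)
```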

gdptools.data.odap_cat_data.climr_to_odap(climr: CatClimRItem) Tuple[CatParams, CatGrids]#

Convert a CatClimRItem to a CatParams and CatGrids object.

Parameters

climr (CatClimRItem) – The CatClimRItem object to convert.

Returns

The CatParams and CatGrids objects.

Return type

CatParams, CatGrids
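Conceptually, the conversion splits one flat catalog record into parameter fields and grid fields. The sketch below does this with plain dictionaries, using field names from the class signatures above and values from the ClimRCatData example. The helper `split_record` is hypothetical, and the renames `crs` to `proj` and `description` to `long_name` are assumptions; the real function returns CatParams and CatGrids objects and may map fields differently.

```python
from typing import Any

# Field names taken from the CatParams and CatGrids signatures.
PARAM_FIELDS = (
    "id", "URL", "variable", "varname", "T_name", "duration", "units",
    "interval", "nT", "tiled", "model", "ensemble", "scenario",
)
GRID_FIELDS = (
    "X_name", "Y_name", "X1", "Xn", "Y1", "Yn", "resX", "resY",
    "ncols", "nrows", "toptobottom",
)


def split_record(record: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split a flat record into (param fields, grid fields)."""
    params = {k: record[k] for k in PARAM_FIELDS if k in record}
    grid = {k: record[k] for k in GRID_FIELDS if k in record}
    # Assumed renames between the flat record and the two targets.
    if "description" in record:
        params["long_name"] = record["description"]
    if "crs" in record:
        grid["proj"] = record["crs"]
    return params, grid


# A few fields from the 'aet' record shown in the ClimRCatData example.
record = {
    "id": "terraclim",
    "varname": "aet",
    "units": "mm",
    "T_name": "time",
    "X_name": "lon",
    "Y_name": "lat",
    "crs": "+proj=longlat +a=6378137 +f=0.00335281066474748 +pm=0 +no_defs",
}
params, grid = split_record(record)
print(sorted(params), sorted(grid))
```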