Data Classes#
User data classes#
Prepare user data for weight generation.
- class gdptools.data.user_data.ClimRCatData(*, cat_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str])#
Instance of UserData using Climate-R catalog data.
- Parameters
cat_dict (dict[str, dict[str, Any]]) –
f_feature (Union[str, Path, GeoDataFrame]) –
id_feature (str) –
period (List[str]) –
- __init__(*, cat_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str]) None #
Initialize ClimRCatData class.
This class wraps the ClimateR-catalogs developed by Mike Johnson, available at mikejohnson51/climateR-catalogs.
The catalog can be queried in pandas to return the dictionary associated with a specific source and variable. The cat_dict argument maps each variable name to the corresponding ClimateR-catalog dictionary for that variable.
- Parameters
cat_dict (dict[str, dict[str, Any]]) – Parameter metadata from climateR-catalog.
f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or any path-like object that can be read by geopandas.read_file().
id_feature (str) – Header in GeoDataFrame to use as index for weights.
period (List[str]) – List containing date strings defining start and end time slice for aggregation.
- Raises
KeyError – Raises error if id_feature not in f_feature columns.
- Return type
None
Example
# Example of using climateR-catalog to prep cat_dict parameter
>>> import pandas as pd
>>> cat_url = "https://mikejohnson51.github.io/climateR-catalogs/catalog.json"
>>> cat = pd.read_json(cat_url)
>>> _id = "terraclim"
>>> cat_vars = ["aet", "pet", "PDSI"]
>>> cat_params = [
...     cat.query("id == @_id & variable == @_var")
...     .to_dict(orient="records")[0]
...     for _var in cat_vars
... ]
>>> cat_dict = dict(zip(cat_vars, cat_params))
>>> cat_dict.get("aet")
{'id': 'terraclim',
 'asset': 'agg_terraclimate_aet_1958_CurrentYear_GLOBE',
 'URL': 'http://thredds.northwestknowledge.net:8080/thredds/dodsC/agg_terraclimate_aet_1958_CurrentYear_GLOBE.nc',
 'type': 'opendap',
 'varname': 'aet',
 'variable': 'aet',
 'description': 'water_evaporation_amount',
 'units': 'mm',
 'model': nan,
 'ensemble': nan,
 'scenario': 'total',
 'T_name': 'time',
 'duration': '1958-01-01/2021-12-01',
 'interval': '1 months',
 'nT': 768.0,
 'X_name': 'lon',
 'Y_name': 'lat',
 'X1': -179.9792,
 'Xn': 179.9792,
 'Y1': 89.9792,
 'Yn': -89.9792,
 'resX': 0.0417,
 'resY': 0.0417,
 'ncols': 8640.0,
 'nrows': 4320.0,
 'crs': '+proj=longlat +a=6378137 +f=0.00335281066474748 +pm=0 +no_defs',
 'toptobottom': 0.0,
 'tiled': ' }
- get_class_type() str #
Abstract method for returning the type of the data class.
- Return type
str
- get_feature_id() str #
Return id_feature.
- Return type
str
- get_source_subset(key: str) DataArray #
Get subset of source data by key.
- Parameters
key (str) – Name of the variable to subset.
- Returns
Subset of the source data for the given key.
- Return type
xr.DataArray
- get_vars() list[str] #
Return list of param_dict keys, proxy for varnames.
- Return type
list[str]
- prep_agg_data(key: str) AggData #
Prepare ClimRCatData data for aggregation methods.
- Parameters
key (str) – Name of the variable to prepare.
- Returns
An instance of the AggData class.
- Return type
AggData
- prep_interp_data(key: str, poly_id: int) AggData #
Prep AggData from ClimRCatData.
- Parameters
key (str) – Name of the xarray gridded data variable.
poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to.
- Returns
An instance of the AggData class.
- Return type
AggData
- prep_wght_data() WeightData #
Prepare and return WeightData for weight generation.
- Return type
WeightData
- class gdptools.data.user_data.ODAPCatData(*, param_dict: dict[str, dict[str, Any]], grid_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str])#
Instance of UserData using OPeNDAP catalog data.
- Parameters
param_dict (dict[str, dict[str, Any]]) –
grid_dict (dict[str, dict[str, Any]]) –
f_feature (Union[str, Path, GeoDataFrame]) –
id_feature (str) –
period (List[str]) –
- __init__(*, param_dict: dict[str, dict[str, Any]], grid_dict: dict[str, dict[str, Any]], f_feature: Union[str, Path, GeoDataFrame], id_feature: str, period: List[str]) None #
Initialize ODAPCatData class.
This class uses the parameter and grid JSON catalogs developed by Mike Johnson, available here:
Params: <https://mikejohnson51.github.io/opendap.catalog/cat_params.json>
Grids: <https://mikejohnson51.github.io/opendap.catalog/cat_grids.json>
These can be queried in pandas to return the dictionary associated with a specific source and variable, as in the OPeNDAP Catalog examples. The param_dict and grid_dict arguments each map a variable name to the corresponding param or grid JSON record from the OPeNDAP Catalog.
- Parameters
param_dict (dict[str, dict]) – Parameter metadata from OPeNDAP catalog.
grid_dict (dict[str, dict]) – Grid metadata from OPeNDAP catalog.
f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or any path-like object that can be read by geopandas.read_file().
id_feature (str) – Header in GeoDataFrame to use as index for weights.
period (List[str]) – List containing date strings defining start and end time slice for aggregation.
- Raises
KeyError – Raises error if id_feature not in f_feature columns.
- Return type
None
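Example
A minimal sketch of how param_dict and grid_dict might be assembled once the two catalogs have been loaded into Python lists of dictionaries. The records below are hypothetical stand-ins (the source ids, URLs, and grid values are illustrative, not taken from the catalogs), and joining params to grids through grid_id is an assumption based on the field names in the CatParams and CatGrids signatures.

```python
# Hypothetical stand-ins for records loaded from cat_params.json and
# cat_grids.json; the values are illustrative only.
params = [
    {"id": "example_src", "variable": "aet", "varname": "aet",
     "grid_id": 7, "URL": "https://example.com/aet.nc"},
    {"id": "example_src", "variable": "pet", "varname": "pet",
     "grid_id": 7, "URL": "https://example.com/pet.nc"},
    {"id": "other_src", "variable": "aet", "varname": "aet",
     "grid_id": 3, "URL": "https://example.com/other_aet.nc"},
]
grids = [
    {"grid_id": 3, "X_name": "x", "Y_name": "y"},
    {"grid_id": 7, "X_name": "lon", "Y_name": "lat"},
]


def select_records(source_id, variables):
    """Return (param_dict, grid_dict), each keyed by variable name."""
    grid_by_id = {g["grid_id"]: g for g in grids}
    param_dict, grid_dict = {}, {}
    for var in variables:
        # First record matching both the source id and the variable name.
        rec = next(
            p for p in params
            if p["id"] == source_id and p["variable"] == var
        )
        param_dict[var] = rec
        # Assumed join: the param record points at its grid via grid_id.
        grid_dict[var] = grid_by_id[rec["grid_id"]]
    return param_dict, grid_dict


param_dict, grid_dict = select_records("example_src", ["aet", "pet"])
print(sorted(param_dict))
print(grid_dict["aet"]["X_name"])
```

Both dictionaries share the same keys, so a variable name retrieves its parameter record and its grid record together.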
- get_class_type() str #
Abstract method for returning the type of the data class.
- Return type
str
- get_feature_id() str #
Return id_feature.
- Return type
str
- get_source_subset(key: str) DataArray #
Get data subset from source by key.
- Parameters
key (str) – Name of the variable to subset.
- Returns
Subset of the source data for the given key.
- Return type
xr.DataArray
- get_vars() list[str] #
Return list of param_dict keys, proxy for varnames.
- Return type
list[str]
- prep_agg_data(key: str) AggData #
Prepare ODAPCatData data for aggregation methods.
- Parameters
key (str) – Name of the variable to prepare.
- Returns
An instance of the AggData class.
- Return type
AggData
- prep_interp_data(key: str, poly_id: int) AggData #
Prep AggData from ODAPCatData.
- Parameters
key (str) – Name of the xarray gridded data variable.
poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to.
- Returns
An instance of the AggData class.
- Return type
AggData
- prep_wght_data() WeightData #
Prepare and return WeightData for weight generation.
- Return type
WeightData
- class gdptools.data.user_data.UserCatData(*, ds: Union[str, Dataset], proj_ds: Any, x_coord: str, y_coord: str, t_coord: str, var: Union[str, List[str]], f_feature: Union[str, Path, GeoDataFrame], proj_feature: Any, id_feature: str, period: List[str])#
Instance of UserData using minimum input variables to map to ODAPCatData.
- Parameters
ds (Union[str, Dataset]) –
proj_ds (Any) –
x_coord (str) –
y_coord (str) –
t_coord (str) –
var (Union[str, List[str]]) –
f_feature (Union[str, Path, GeoDataFrame]) –
proj_feature (Any) –
id_feature (str) –
period (List[str]) –
- __init__(*, ds: Union[str, Dataset], proj_ds: Any, x_coord: str, y_coord: str, t_coord: str, var: Union[str, List[str]], f_feature: Union[str, Path, GeoDataFrame], proj_feature: Any, id_feature: str, period: List[str]) None #
Initialize UserCatData class, which contains data preparation methods based on UserData.
- Parameters
ds (Union[str, Path, xr.Dataset]) – Xarray Dataset or str, URL or Path object that can be read by xarray.
proj_ds (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for ds.
x_coord (str) – String of x coordinate name in ds.
y_coord (str) – String of y coordinate name in ds.
t_coord (str) – String of time coordinate name in ds.
var (Union[str, List[str]]) – List of variables to be used in aggregation. They must be present in ds.
f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or str, URL or Path object that can be read by geopandas.
proj_feature (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for f_feature.
id_feature (str) – String of id column name in f_feature.
period (List[str]) – List of two strings that define the start and end of the period to be used in aggregation. The format may be ‘YYYY-MM-DD’ or ‘YYYY-MM-DD HH:MM:SS’, depending on the format of the time coordinate in ds.
- Raises
KeyError – Raises error if id_feature not in f_feature columns.
- Return type
None
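Example
A minimal sketch of checking a period list before passing it to the class, using only the two date formats named in the period description. The helper parse_period is hypothetical and not part of gdptools; the library's own handling of period may differ.

```python
from datetime import datetime

# The two formats named in the period description.
_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")


def parse_period(period):
    """Parse a [start, end] list of date strings (hypothetical helper)."""
    if len(period) != 2:
        raise ValueError("period must contain exactly two date strings")
    parsed = []
    for text in period:
        for fmt in _FORMATS:
            try:
                parsed.append(datetime.strptime(text, fmt))
                break
            except ValueError:
                continue
        else:
            # No format matched this string.
            raise ValueError(f"unrecognized date format: {text!r}")
    start, end = parsed
    if start > end:
        raise ValueError("period start must not be after period end")
    return start, end


start, end = parse_period(["1980-01-01", "1980-12-31"])
print(start.date(), end.date())
```

Validating the strings up front surfaces a malformed date before any remote data is opened.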
- get_class_type() str #
Abstract method for returning the type of the data class.
- Return type
str
- get_feature_id() str #
Return id_feature.
- Return type
str
- get_source_subset(key: str) DataArray #
Get source subset by key.
- Parameters
key (str) – Name of the variable to subset.
- Returns
Subset of the source data for the given key.
- Return type
xr.DataArray
- get_vars() list[str] #
Return list of vars in data.
- Return type
list[str]
- prep_interp_data(key: str, poly_id: int) AggData #
Prep AggData from UserCatData.
- Parameters
key (str) – Name of the xarray gridded data variable.
poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to.
- Returns
An instance of the AggData class.
- Return type
AggData
- prep_wght_data() WeightData #
Prepare and return WeightData for weight generation.
- Return type
WeightData
- class gdptools.data.user_data.UserData#
Prepare data for different sources for weight generation.
- abstract __init__() None #
Init class.
- Return type
None
- abstract get_class_type() str #
Abstract method for returning the type of the data class.
- Return type
str
- abstract get_feature_id() str #
Abstract method for returning the id_feature parameter.
- Return type
str
- abstract get_source_subset(key: str) DataArray #
Abstract method for getting subset of source data.
- Parameters
key (str) –
- Return type
DataArray
- abstract get_vars() list[str] #
Return a list of variables.
- Return type
list[str]
- abstract prep_agg_data(key: str) AggData #
Abstract method for preparing data for aggregation.
- Parameters
key (str) –
- Return type
AggData
- abstract prep_interp_data(key: str, poly_id: int) AggData #
Abstract method for preparing data for interpolation.
- Parameters
key (str) –
poly_id (int) –
- Return type
AggData
- abstract prep_wght_data() WeightData #
Abstract interface for generating weight data.
- Return type
WeightData
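Example
A minimal sketch of the abstract-base-class pattern that UserData describes, reduced to two of its methods. UserDataSketch and DictData are hypothetical stand-ins written for illustration, and the "huc12" column name is likewise assumed; none of this is the gdptools implementation.

```python
from abc import ABC, abstractmethod


class UserDataSketch(ABC):
    """Hypothetical stand-in for UserData with a reduced method set."""

    @abstractmethod
    def get_feature_id(self) -> str:
        """Return the id_feature parameter."""

    @abstractmethod
    def get_vars(self) -> list:
        """Return a list of variable names."""


class DictData(UserDataSketch):
    """Toy subclass backed by a plain dict instead of gridded data."""

    def __init__(self, data: dict, id_feature: str) -> None:
        self._data = data
        self._id_feature = id_feature

    def get_feature_id(self) -> str:
        return self._id_feature

    def get_vars(self) -> list:
        # Dict keys play the role of variable names.
        return list(self._data)


toy = DictData({"aet": [1.0, 2.0], "pet": [3.0, 4.0]}, id_feature="huc12")
print(toy.get_vars())
```

Because every method on the base class is abstract, instantiating UserDataSketch directly raises TypeError; only subclasses that implement the full interface can be created.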
- class gdptools.data.user_data.UserTiffData(var: str, ds: Union[str, DataArray, Dataset], proj_ds: Any, x_coord: str, y_coord: str, bname: str, band: int, f_feature: Union[str, Path, GeoDataFrame], id_feature: str, proj_feature: Any)#
Instance of UserData for zonal stats processing of geotiffs.
- Parameters
var (str) –
ds (Union[str, DataArray, Dataset]) –
proj_ds (Any) –
x_coord (str) –
y_coord (str) –
bname (str) –
band (int) –
f_feature (Union[str, Path, GeoDataFrame]) –
id_feature (str) –
proj_feature (Any) –
- __init__(var: str, ds: Union[str, DataArray, Dataset], proj_ds: Any, x_coord: str, y_coord: str, bname: str, band: int, f_feature: Union[str, Path, GeoDataFrame], id_feature: str, proj_feature: Any) None #
Initialize UserTiffData.
UserTiffData is a structure that aids calculating zonal stats.
- Parameters
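var (str) –
ds (Union[str, xr.Dataset]) –
proj_ds (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for ds.
x_coord (str) – String of x coordinate name in ds.
y_coord (str) – String of y coordinate name in ds.
bname (str) –
band (int) –
f_feature (Union[str, Path, gpd.GeoDataFrame]) – GeoDataFrame or any path-like object that can be read by geopandas.read_file().
id_feature (str) – String of id column name in f_feature.
proj_feature (Any) – Any object that can be passed to pyproj.crs.CRS.from_user_input for f_feature.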
- Return type
None
- get_class_type() str #
Abstract method for returning the type of the data class.
- Return type
str
- get_feature_id() str #
Get Feature id.
- Return type
str
- get_source_subset(key: str) DataArray #
Get subset of source data.
- Parameters
key (str) –
- Returns
Subset of the source data.
- Return type
xr.DataArray
- get_vars() list[str] #
Return list of varnames.
- Return type
list[str]
- prep_agg_data(key: str) AggData #
Prepare data for aggregation or zonal stats.
- Parameters
key (str) –
- Return type
AggData
- prep_interp_data(key: str, poly_id: int) AggData #
Prep AggData from UserTiffData.
- Parameters
key (str) – Name of the xarray gridded data variable.
poly_id (int) – ID number of the GeoDataFrame geometry to clip the gridded data to.
- Returns
An instance of the AggData class.
- Return type
AggData
- prep_wght_data() WeightData #
Prepare data for weight generation.
- Return type
WeightData
Data passed to WeightGen class#
Data classes used in aggregation.
- class gdptools.data.weight_gen_data.WeightData(feature: GeoDataFrame, id_feature: str, grid_cells: GeoDataFrame)#
Simple dataclass for transferring prepared user data to the CalcWeightEngine.
- Parameters
feature (GeoDataFrame) –
id_feature (str) –
grid_cells (GeoDataFrame) –
Data passed to AggGen class#
- class gdptools.data.agg_gen_data.AggData(variable: str, cat_param: CatParams, cat_grid: CatGrids, da: DataArray, feature: GeoDataFrame, id_feature: str, period: List[str])#
AggData is a convenience container of data necessary for aggregation.
Data provided in one of the UserData-derived classes is prepped for aggregation, including subsetting the gridded data by the features' bounding box and by the selected time period. In addition, if the gridded data is defined between 0 and 360 degrees longitude, it is shifted to -180 to 180 degrees. For each variable in the user_data attribute of either the WeightGen or AggGen classes, a dict of {var: AggData} is generated in the AggGen.calculate_agg() method.
- Parameters
variable (str) – Variable name.
cat_param (CatParams) – CatParams data class containing parameter metadata.
cat_grid (CatGrids) – CatGrids data class containing grid metadata.
da (DataArray) – The spatially and temporally subsetted variable DataArray.
feature (GeoDataFrame) – The user-supplied feature file represented as a GeoDataFrame.
id_feature (str) – The feature id (column header) in the GeoDataFrame.
period (List[str]) – A list of dates representing the starting and ending date to process.
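Example
The longitude shift and bounding-box subsetting described for AggData can be sketched in plain Python. This is an illustration of the arithmetic only, under the assumption of point-like grid cell centers and a hypothetical bounding box; it is not the gdptools implementation, which operates on xarray objects.

```python
def shift_longitude(lon: float) -> float:
    """Map a longitude from the 0 to 360 convention to -180 to 180."""
    return ((lon + 180.0) % 360.0) - 180.0


def in_bbox(lon, lat, bbox):
    """Return True if (lon, lat) lies inside bbox = (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bbox
    return minx <= lon <= maxx and miny <= lat <= maxy


# Grid cell centers given in the 0 to 360 convention.
cells = [(10.0, 40.0), (250.0, 40.0), (255.0, 41.0), (359.5, 40.0)]
shifted = [(shift_longitude(lon), lat) for lon, lat in cells]

# Hypothetical feature bounding box in the -180 to 180 convention.
bbox = (-111.0, 39.0, -104.0, 42.0)
subset = [cell for cell in shifted if in_bbox(cell[0], cell[1], bbox)]
print(subset)
```

Longitudes below 180 are unchanged by the shift, while values from 180 upward wrap to negative values, so the grid and the features end up in the same convention before the bounding-box test.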
OPeNDAP Catalog data classes#
OPeNDAP catalog data classes.
- class gdptools.data.odap_cat_data.CatClimRItem(*, id: str, asset: Optional[str] = None, URL: str, varname: str, variable: Optional[str] = None, description: Optional[str] = None, units: Optional[str] = None, model: Optional[str] = None, ensemble: Optional[str] = None, scenario: Optional[str] = None, T_name: Optional[str] = None, duration: Optional[str] = None, interval: Optional[str] = None, nT: Optional[int] = 0, X_name: str, Y_name: str, X1: float, Xn: float, Y1: float, Yn: float, resX: float, resY: float, ncols: int, nrows: int, crs: str, toptobottom: str, tiled: Optional[str] = None)#
Mike Johnson’s CatClimRItem class.
The source data from which this class is derived is available at:
https://mikejohnson51.github.io/climateR-catalogs/catalog.json
- Parameters
id (str) –
asset (Optional[str]) –
URL (str) –
varname (str) –
variable (Optional[str]) –
description (Optional[str]) –
units (Optional[str]) –
model (Optional[str]) –
ensemble (Optional[str]) –
scenario (Optional[str]) –
T_name (Optional[str]) –
duration (Optional[str]) –
interval (Optional[str]) –
nT (Optional[int]) –
X_name (str) –
Y_name (str) –
X1 (float) –
Xn (float) –
Y1 (float) –
Yn (float) –
resX (float) –
resY (float) –
ncols (int) –
nrows (int) –
crs (str) –
toptobottom (str) –
tiled (Optional[str]) –
- class Config#
Interior class to direct pydantic’s behavior.
- class gdptools.data.odap_cat_data.CatGrids(*, grid_id: Optional[int] = None, X_name: str, Y_name: str, X1: Optional[float] = None, Xn: Optional[float] = None, Y1: Optional[float] = None, Yn: Optional[float] = None, resX: Optional[float] = None, resY: Optional[float] = None, ncols: Optional[int] = None, nrows: Optional[int] = None, proj: str, toptobottom: int, tile: Optional[str] = None, **extra_data: Any)#
Class representing elements of Mike Johnson’s OPeNDAP catalog grids.
https://mikejohnson51.github.io/opendap.catalog/cat_grids.json
- Parameters
grid_id (Optional[int]) –
X_name (str) –
Y_name (str) –
X1 (Optional[float]) –
Xn (Optional[float]) –
Y1 (Optional[float]) –
Yn (Optional[float]) –
resX (Optional[float]) –
resY (Optional[float]) –
ncols (Optional[int]) –
nrows (Optional[int]) –
proj (str) –
toptobottom (int) –
tile (Optional[str]) –
extra_data (Any) –
- classmethod get_toptobottom(v: int) int #
Convert str to int.
- Parameters
v (int) –
- Return type
int
- class gdptools.data.odap_cat_data.CatParams(*, id: Optional[str] = None, URL: str, grid_id: Optional[int] = - 1, variable: Optional[str] = None, varname: str, long_name: Optional[str], T_name: Optional[str], duration: Optional[str] = None, units: Optional[str], interval: Optional[str] = None, nT: Optional[int] = 0, tiled: Optional[str] = None, model: Optional[str] = None, ensemble: Optional[str] = None, scenario: Optional[str] = None)#
Class representing elements of Mike Johnson’s OPeNDAP catalog params.
https://mikejohnson51.github.io/opendap.catalog/cat_params.json
- Parameters
id (Optional[str]) –
URL (str) –
grid_id (Optional[int]) –
variable (Optional[str]) –
varname (str) –
long_name (Optional[str]) –
T_name (Optional[str]) –
duration (Optional[str]) –
units (Optional[str]) –
interval (Optional[str]) –
nT (Optional[int]) –
tiled (Optional[str]) –
model (Optional[str]) –
ensemble (Optional[str]) –
scenario (Optional[str]) –
- classmethod set_grid_id(v: int) int #
Convert to int.
- Parameters
v (int) –
- Return type
int
- classmethod set_nt(v: int) int #
Convert to int.
- Parameters
v (int) –
- Return type
int
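Example
Catalog records read through pandas can carry integer fields as floats (the ClimateR example on this page shows 'nT': 768.0), which is the kind of value the set_grid_id and set_nt converters receive. A minimal sketch of such a conversion follows; to_int is a hypothetical helper, and falling back to a default for missing values is an assumption rather than documented gdptools behavior.

```python
import math


def to_int(value, default=0):
    """Coerce a catalog value such as 768.0 or "768" to int.

    Hypothetical helper: missing values (None or NaN) fall back to
    ``default`` instead of raising.
    """
    if value is None:
        return default
    number = float(value)
    if math.isnan(number):
        return default
    return int(number)


print(to_int(768.0), to_int("768"), to_int(float("nan"), default=-1))
```

The defaults used here mirror the signature of CatParams, where nT defaults to 0 and grid_id defaults to -1.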
- gdptools.data.odap_cat_data.climr_to_odap(climr: CatClimRItem) Tuple[CatParams, CatGrids] #
Convert a CatClimRItem to a CatParams and CatGrids object.
- Parameters
climr (CatClimRItem) – The CatClimRItem object to convert.
- Returns
The CatParams and CatGrids objects.
- Return type
Tuple[CatParams, CatGrids]
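Example
A minimal sketch of the idea behind this conversion, using plain dictionaries in place of the pydantic classes: one flat catalog record is split into a parameter part and a grid part. The field lists are taken from the CatParams and CatGrids signatures on this page; split_record, the placeholder URL, and the crs-to-proj rename are assumptions, and the actual climr_to_odap may map fields differently.

```python
# Field names taken from the CatParams and CatGrids signatures.
PARAM_FIELDS = ("id", "URL", "variable", "varname", "T_name", "duration",
                "units", "interval", "nT", "tiled", "model", "ensemble",
                "scenario")
GRID_FIELDS = ("X_name", "Y_name", "X1", "Xn", "Y1", "Yn", "resX", "resY",
               "ncols", "nrows", "toptobottom")


def split_record(record: dict):
    """Split one flat catalog record into (param, grid) dicts."""
    param = {k: record[k] for k in PARAM_FIELDS if k in record}
    grid = {k: record[k] for k in GRID_FIELDS if k in record}
    # Assumed rename: CatGrids names the projection "proj", while the
    # climateR record calls it "crs".
    if "crs" in record:
        grid["proj"] = record["crs"]
    return param, grid


# Abbreviated, illustrative record; the URL is a placeholder.
record = {"id": "terraclim", "URL": "https://example.com/aet.nc",
          "varname": "aet", "variable": "aet", "units": "mm",
          "X_name": "lon", "Y_name": "lat", "crs": "+proj=longlat"}
param, grid = split_record(record)
print(sorted(grid))
```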