Changelog#
All notable changes to gdptools are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Current Version: 0.3.14#
Unreleased#
Deprecated: The dask processing engine is deprecated across WeightGen, WeightGenP2P, AggGen, and InterpGen and will be removed in gdptools 0.4.0. Selecting it now emits a FutureWarning. Use 'parallel' for large datasets or 'serial' for small datasets instead. (ZonalGen’s dask engine was already deprecated in favor of 'exactextract'.)
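The deprecation pattern described above can be pictured with a small sketch. The function name select_engine and the set of engine names are assumptions made for illustration; this is not gdptools' own code, only one plausible way to warn on a deprecated option while still honoring it.

```python
import warnings

# Hypothetical engine names for illustration; not read from gdptools.
VALID_ENGINES = {"serial", "parallel", "dask"}


def select_engine(engine: str) -> str:
    """Validate an engine name, warning if the deprecated one is chosen."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"Unknown engine: {engine!r}")
    if engine == "dask":
        # FutureWarning is shown to end users by default.
        warnings.warn(
            "The 'dask' engine is deprecated and will be removed in 0.4.0; "
            "use 'parallel' or 'serial' instead.",
            FutureWarning,
            stacklevel=2,
        )
    return engine
```

Callers that still pass "dask" keep working in this sketch, but see the warning on each call.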
0.3.15 (2026-06-05)#
Fixed: AggGen.calculate_agg now returns numpy.nan (not the undeclared NC_FILL_DOUBLE sentinel 9.969209968386869e+36) for float target features with no source-grid overlap. The sentinel was an in-band missing-value marker with no matching _FillValue/missing_value attribute, so CF-aware tools could not detect it and any reduction that did not pre-screen the array (max(skipna=True), np.fmax, plotting, statistics) silently picked it up as real data. NaN is self-identifying, propagates correctly through skipna=True, and is encoded back to _FillValue by the NetCDF writer. Integer outputs, which cannot represent NaN, retain the netCDF integer fill value (declared as _FillValue by build_cf_dataset) (#100).
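To see why an undeclared sentinel is riskier than NaN, consider the standard-library sketch below. The helper max_skipping_nan is hypothetical and only mimics a NaN-skipping reduction such as skipna=True; the sample values are made up.

```python
import math

# Value quoted in the entry above for the NC_FILL_DOUBLE sentinel.
SENTINEL = 9.969209968386869e36


def max_skipping_nan(values):
    """Return the largest non-NaN value, or NaN if none remain."""
    kept = [v for v in values if not math.isnan(v)]
    return max(kept) if kept else math.nan


with_sentinel = [12.5, 18.0, SENTINEL]  # missing cell marked in-band
with_nan = [12.5, 18.0, math.nan]       # missing cell marked as NaN

sentinel_max = max_skipping_nan(with_sentinel)  # the sentinel wins the reduction
nan_max = max_skipping_nan(with_nan)            # NaN is skipped; result is 18.0
```

The reduction has no way to know that the sentinel is not data, whereas NaN is excluded without any extra metadata.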
0.3.14 (2026-06-04)#
Fixed: Weighted zonal engines now assign a value to target geometries smaller than a single raster cell. Such sub-pixel zones previously raised an uncaught NoDataInBounds from rioxarray’s clip_box, aborting the calculation instead of being scored. They now retain the pixel(s) they fall within and receive that value through area-weighted intersection. Reported and originally patched by Angelica De Ros (Diagram Spa) (#101).
Security: Upgraded 8 locked dependencies to resolve 16 CVEs disclosed in 2026: aiohttp, idna, mistune, nbconvert, pip, pyarrow, starlette, and urllib3. Raised the pyarrow, urllib3 (runtime), and nbconvert (dev) version floors in pyproject.toml to match.
Internal: Removed the deprecated dask weighted-zonal regression test and its baseline ahead of the planned removal of the dask zonal engine.
0.3.13 (2026-04-17)#
Fixed: Stricter and more consistent source_time_period validation across all UserData subclasses. Reversed ranges (start > end) and periods that select no time steps (out of range, between labels for monthly or 8-day data, or inside a sub-step window) now raise ValueError with a diagnostic naming the dataset’s actual time range and the nearest available timestamps. Previously these produced silent empty slices that surfaced as confusing IndexErrors during aggregation (#98).
Fixed: NHGFStacZarrData, NHGFStacTiffData, and _NHGFStacBase._select_item now route source_time_period through the shared _process_period validator. Malformed date strings are rejected at construction time instead of surfacing as opaque xarray errors later in .sel().
Fixed: The “source data time coordinate is formatted as X, you specified Y” format-mismatch hint now applies to all three prep paths (prep_wght_data, prep_agg_data, prep_interp_data), not only prep_wght_data.
Fixed: build_cf_dataset now propagates the source time coord’s attrs and encoding onto the returned Dataset. Previously CF metadata (standard_name, axis, bounds) and netCDF encoding (units, calendar) were silently dropped, so output time axes were not CF-recognizable and non-standard calendars (noleap, 360_day) could be corrupted on NetCDF write (#99).
Changed: Aggregation engines’ empty-time-slice log message no longer claims the source_time_period argument was specified incorrectly. The new message, “Empty time slice after subsetting”, describes the real cause.
Internal: New helpers _assert_period_intersects_time_axis and _sel_with_time_diagnostic in gdptools.utils. _process_period now rejects reversed ranges.
Docs: UserData subclass docstrings now document xarray’s inclusive-both-ends slice semantics and the monthly / 8-day / sub-daily gotchas for source_time_period.
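The validation rules in this release can be illustrated with a minimal sketch. The function check_period and its messages are hypothetical and use plain datetime objects rather than an xarray time axis; gdptools' own helpers may behave differently in detail.

```python
from datetime import datetime


def check_period(start: str, end: str, time_axis: list[datetime]) -> list[datetime]:
    """Return the time steps inside [start, end], or raise ValueError."""
    t0, t1 = datetime.fromisoformat(start), datetime.fromisoformat(end)
    if t0 > t1:
        raise ValueError(f"Reversed period: {start} is after {end}")
    # Inclusive on both ends, mirroring label-based slicing.
    selected = [t for t in time_axis if t0 <= t <= t1]
    if not selected:
        raise ValueError(
            f"Period {start}..{end} selects no time steps; "
            f"data covers {time_axis[0]:%Y-%m-%d} to {time_axis[-1]:%Y-%m-%d}"
        )
    return selected


# Monthly labels: a period that falls between two labels selects nothing.
monthly = [datetime(2020, m, 1) for m in range(1, 13)]
```

With this axis, check_period("2020-03-02", "2020-03-20", monthly) raises because no monthly label falls inside the window, which is the "between labels" case named above.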
0.3.12 (2026-04-17)#
Security: Upgraded 4 locked dependencies to resolve 13 CVEs: aiohttp (10 CVEs), pygments (CVE-2026-4539), pytest (CVE-2025-71176), and requests (CVE-2026-25645).
Fixed: UserCatData and NHGFStacZarrData no longer mutate the caller’s source dataset when rotating 0–360° longitude into -180…180°. Constructing multiple instances from a single shared xr.Dataset now works; previously the second instance failed with KeyError on a non-monotonic index (#97).
Fixed: AggGen.calculate_agg() now returns a CF-1.8-compliant xr.Dataset matching the structure written by NetCDFWriter: adds the crs scalar variable referenced by grid_mapping, along with lat/lon centroid coordinates in EPSG:4326 (#97).
Internal: Extracted a shared build_cf_dataset() helper (new module gdptools.agg._cf_dataset) so AggGen._gen_xarray_return and NetCDFWriter.create_out_file cannot drift.
Internal: Extracted a _rotate_longitude_if_needed helper in gdptools.data.user_data, removing the duplicated rotation block between UserCatData.__init__ and NHGFStacZarrData.__init__.
Internal: NetCDFWriter no longer leaks a process-wide UserWarning filter via warnings.filterwarnings(...). The centroid-computation suppression is now scoped to a warnings.catch_warnings() block inside build_cf_dataset, so other code running after a NetCDF save sees unchanged warning behavior.
Internal: Removed stale hvplot entry from dev group in uv.lock; hvplot is a docs-only dependency and was never listed in pyproject.toml dev group.
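The difference between a process-wide filter and a scoped one, as in the NetCDFWriter entry above, can be sketched as follows. Both functions are hypothetical stand-ins for the centroid computation, not gdptools code.

```python
import warnings


def noisy_centroid():
    """Stand-in for a computation that emits a UserWarning."""
    warnings.warn("centroid computed in a geographic CRS", UserWarning)
    return (0.0, 0.0)


def quiet_centroid():
    """Suppress the UserWarning only for the duration of this call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", UserWarning)
        return noisy_centroid()
```

Because catch_warnings() restores the previous filter state on exit, a later call to noisy_centroid() still warns. A bare warnings.filterwarnings("ignore", ...) at module level would instead silence it for the rest of the process.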
0.3.11 (2026-02-27)#
Deprecated: serial, parallel, and dask zonal engines now emit FutureWarning. Use zonal_engine="exactextract" instead, which provides better performance and handles large datasets with bounded memory.
Changed: Moved dask.distributed from top-level import to lazy import inside ZonalEngineDask.weighted_zonal_stats(), so the distributed package is no longer required at import time.
Removed: safety dev dependency (redundant with pip-audit; its transitive nltk dependency had CVE-2025-14009).
Removed: Unused .safety-policy.yml configuration file.
Documentation: Updated zonal statistics and logging demo notebooks with deprecation notices and revised engine comparison table.
0.3.10 (2026-02-21)#
Fixed: Read coordinate names (x, y, time) from STAC cube:dimensions metadata instead of hardcoding, enabling support for collections with non-standard dimension names.
Fixed: Longitude rotation in utils._get_cat_data_from_url used hardcoded "lon" instead of the actual coordinate name variable.
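As a rough sketch of reading coordinate names from metadata instead of hardcoding them, the function below walks a cube:dimensions mapping shaped like the STAC datacube extension (a "type" of "spatial" or "temporal", plus an "axis" for spatial dimensions). The function name and the sample metadata are assumptions for illustration, not gdptools code.

```python
def coord_names(cube_dimensions: dict) -> dict:
    """Map the roles x, y, and time to the collection's dimension names."""
    names = {}
    for dim_name, spec in cube_dimensions.items():
        if spec.get("type") == "temporal":
            names["time"] = dim_name
        elif spec.get("type") == "spatial" and spec.get("axis") in ("x", "y"):
            names[spec["axis"]] = dim_name
    return names


# Made-up metadata with non-standard dimension names.
example = {
    "easting": {"type": "spatial", "axis": "x"},
    "northing": {"type": "spatial", "axis": "y"},
    "valid_time": {"type": "temporal"},
}
```

Here coord_names(example) maps x to "easting", y to "northing", and time to "valid_time", so downstream code never assumes the literal names x, y, or time.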
0.3.9 (2026-02-21)#
Added: Structured logging throughout all source modules, replacing ~86 print() calls with logger.<level>().
Added: Logging configuration guide in Getting Started docs and logging_demo.ipynb example notebook.
Fixed: Format spec {:0.04} corrected to {:0.4f} in zonal_engines.
Fixed: Removed leftover debug print() calls.
Fixed: Typo “anitmeridian” → “antimeridian” in utils.
CI: Added cross-platform conda-forge dependency validation job.
CI: Updated lockfile for yanked virtualenv 20.37.0 → 20.38.0.
CI: Fixed pip-audit to use --skip-editable instead of --strict (editable local installs aren’t on PyPI).
Security: Upgraded bokeh (CVE-2026-21883), pillow (CVE-2026-25990), and nbconvert (CVE-2025-53000).
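The format-spec fix listed in this release is easy to miss, so here is a short comparison of the two specs. The sample value is arbitrary.

```python
value = 3.14159265

# No presentation type: the precision counts significant digits.
old_style = "{:0.04}".format(value)

# Fixed-point type "f": the precision counts digits after the decimal point.
new_style = "{:0.4f}".format(value)
```

For this value the old spec yields 3.142 and the corrected spec yields 3.1416.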
0.3.8 (2026-02-18)#
Fixed: Pydantic ValidationError in CatClimRItem when asset or scenario fields contain float('nan') from the ClimateR catalog parquet.
Changed: Switched test fixtures and expected values from TerraClimate to gridMET (tmmx/tmmn/pr) after upstream TerraClimate v1.1 reprocessing broke expected values.
Changed: Replaced TerraClimate example notebooks with gridMET equivalents: gridmet_drb_polygon.ipynb (was terraclime_et.ipynb) and gridmet_grid_to_line.ipynb (was Terraclimate-Grid-to-Line.ipynb).
Documentation: Expanded NHGFStacData documentation to describe the factory + subclass hierarchy (NHGFStacZarrData, NHGFStacTiffData) with autoclass directives and usage examples for both Zarr and GeoTIFF collections.
Documentation: Added NLCD notebook to table of contents and examples table in Getting Started guide.
Documentation: Added all user data classes to the API Reference autosummary.
0.3.7 (2026-02-16)#
Added: GeoTIFF-backed STAC collection support via new NHGFStacTiffData class. NHGFStacData now auto-detects collection format (Zarr or GeoTIFF) and returns the appropriate handler transparently.
Added: NHGFStacData_NLCD.ipynb example notebook demonstrating categorical zonal statistics of NLCD land cover classes for HUC12 basins.
Added: _stac_href_to_url() helper for robust S3/HTTPS/bare-path URL construction from STAC asset hrefs.
Added: Fast-path direct URL lookup in get_stac_collection() to avoid slow recursive STAC catalog traversal and API rate limiting.
Added: Per-link error handling in get_stac_collection() to skip broken STAC child links instead of aborting the entire traversal.
Performance: Remote GeoTIFFs are opened with rasterio windowed reading, fetching only the tiles covering the target polygons via HTTP range requests instead of loading the full raster into memory.
Fixed: CI shell redirection bug where python>=${PYTHON_REQ} was parsed as file redirection in .gitlab-ci.yml.
Fixed: Timezone handling in STAC item selection now normalizes both sides to UTC instead of stripping timezone info.
Fixed: Updated test_interp_gen_with_climater expected values after TerraClimate v1.1 upstream reprocessing (ERA5, 2026-02-06).
Security: Upgraded 8 locked dependencies to resolve 15 CVEs (aiohttp, cryptography, distributed, filelock, pip, urllib3, virtualenv, wheel).
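The timezone fix in this release can be illustrated with the standard library alone. The helper to_utc is hypothetical, and treating naive datetimes as UTC is an assumption of this sketch rather than a documented gdptools rule.

```python
from datetime import datetime, timedelta, timezone


def to_utc(dt: datetime) -> datetime:
    """Return dt as timezone-aware UTC; naive values are assumed to be UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


# An item stamped 23:00 at UTC-5 is really 04:00 UTC on the next day.
item_time = datetime(2020, 1, 1, 23, 0, tzinfo=timezone(timedelta(hours=-5)))
query_end = datetime(2020, 1, 1, 23, 59)  # naive, assumed UTC

# Stripping the offset compares wall-clock digits and wrongly matches.
stripped_match = item_time.replace(tzinfo=None) <= query_end
# Normalizing both sides compares the same instant scale and does not match.
utc_match = to_utc(item_time) <= to_utc(query_end)
```

Stripping the offset makes the item look like it falls inside the query window, while normalizing both sides to UTC shows that it does not.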
0.3.6 (2026-01-02)#
Added: GitLab CI job conda-forge-validate to test dependency resolution against conda-forge with strict channel priority, catching compatibility issues before feedstock submission.
Changed: Bumped minimum Python version from 3.10 to 3.11 due to exactextract>=0.3.0 dependency chain requiring pyproj>=3.7.2, which only has conda builds for Python 3.11+.
Changed: Updated pyproj minimum version from 3.3.0 to 3.7.2 to match exactextract requirements.
Changed: Removed upper bounds on dependencies for better conda-forge compatibility:
xarray>=2024.7.0 (removed <2025.0.0)
rasterio>=1.2.9 (removed <1.5.0)
rioxarray>=0.15 (removed <0.21)
pystac>=1.10 (removed <2.0)
statsmodels>=0.14 (removed <1.0)
fastparquet>=2024.2 (removed <=2024.5)
pyarrow>=10.0.0 (removed <18.0.0)
Changed: Migrated development setup from Poetry to UV exclusively - removed conda dependency from development workflow.
Fixed: Conda-forge packaging failures caused by strict channel priority and upper bound constraints.
Documentation: Updated README.md and environment-examples.yml to reflect Python 3.11+ requirement.
0.3.5 (2026-01-01)#
Added: Spatial partitioning for parallel weight generation engine using Hilbert curves to improve cache locality and reduce memory fragmentation.
Added: Test coverage for calculate_weights(intersections=True) in polygon-to-polygon weight generation.
Added: @pytest.mark.slow markers for network-dependent tests (STAC catalog, notebooks).
Added: tests-full nox session to run all tests including slow ones; default tests session now skips slow tests (~2.5 min vs ~7 min).
Performance: Optimized area-weighted statistics methods in stats_methods.py:
MAWeightedMean and MAWeightedStd: 2-3x speedup by vectorizing masked array operations.
MASum, MAMin, MAMax: 2.5-11x speedup by replacing np.ma.masked_array with np.nansum, np.nanmin, np.nanmax.
SerialAgg.calc_agg: Reduced overhead by extracting single time slice before iterating polygon-by-polygon.
Fixed: source_poly_idx string indexing bug in WeightGenP2P where source_poly_idx[0] on a string returned first character instead of column name.
Removed: test_coverage_summary.py (zero functional value).
Removed: Redundant date format tests in test_serial.py (already covered in test_weight_agg_gen.py).
Changed: Renamed test functions in test_dask.py from test_parallel_* to test_dask_* for clarity.
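The source_poly_idx bug fixed in this release is a common Python pitfall: indexing a string returns a character, not the whole name. The sketch below shows the failure and one way to guard against it; as_column_list and the column name are hypothetical, not the fix used in gdptools.

```python
def as_column_list(idx):
    """Accept a single column name or a sequence of names; return a list."""
    if isinstance(idx, str):
        return [idx]
    return list(idx)


source_poly_idx = "hru_id"                   # hypothetical column name
wrong = source_poly_idx[0]                   # indexes into the string: "h"
right = as_column_list(source_poly_idx)[0]   # the full column name
```

Normalizing the argument once at the boundary keeps later [0] lookups safe whether the caller passed a string or a list.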
0.3.4 (2025-12-31)#
Added: Vectorized _get_cells_poly_fast function in utils_optimized.py providing 100-280x speedup for regular projected grids with 1D coordinates.
Added: Memory-safe chunked mode (mode="chunked") for processing very large grids without exhausting memory.
Added: estimate_memory_gb() helper to predict memory requirements before processing.
Added: Benchmark script scripts/test_cells_poly_optimization.py for performance testing.
Fixed: Renovate configuration error by removing unsupported uv manager from matchManagers.
0.3.3 (2025-12-30)#
Added: exactextract zonal statistics engine for high-performance raster-to-polygon statistics using the exactextract library.
Added: Support for categorical data in exactextract engine with fraction-based output matching serial/parallel/dask engines.
Added: Exactextract examples in the zonal statistics notebook (docs/Examples/Rasters/zonal_stats.ipynb).
Added: Tests verifying exactextract output format consistency with existing engines.
Changed: Zonal statistics engines now output consistent column names across all engines:
Continuous: count, mean, std, min, 25%, 50%, 75%, max, sum
Categorical: integer category columns + count
0.3.2 (2025-12-30)#
Changed: Migrated from conda/poetry to uv for dependency management.
Performance: Optimized spatial intersection calculations in calc_weight_engines.py using contained/boundary partitioning with shapely.within() to skip expensive intersection() calls for fully contained polygons.
Performance: Optimized pixel weight calculation in zonal_engines.py with same contained/boundary partitioning strategy.
Fixed: Consistent n_jobs=-1 default behavior across all engine functions (Parallel and Dask now both default to cpu_count()/2).
Fixed: Coverage session failure in CI due to missing Cython source files.
Fixed: CVE-2025-53000 in pip-audit (ignored until nbconvert fix is released).
Fixed: rioxarray dependency updated to >=0.15,<0.21.
CI: Temporarily disabled mypy and lint jobs pending uv migration stabilization.
0.3.1 (2025-12-18)#
Added: get_stac_collection() helper function to fetch collections from the NHGF STAC catalog with recursive search.
Added: STACCatalogError exception class for STAC catalog access errors.
Fixed: urllib3 CVE-2025-66418 and CVE-2025-66471 by pinning urllib3>=2.6.0.
Fixed: STAC-dependent tests resilient to rate limiting and catalog unavailability using xfail.
Changed: Modernized .gitlab-ci.yml: consolidated to single stage, added dependency-aware caching, removed redundant apt-get upgrade.
Changed: Updated nox -s coverage to generate coverage.xml for GitLab CI artifacts.
Changed: Removed explicit pyogrio from environment files (transitive dependency via geopandas).
Changed: Switched from mamba to conda in CI (libmamba is now the default solver in miniforge3).
Documentation: Updated NHGF STAC example notebooks (NHGFStacData_CONUS404, NHGFStacData_Grid_to_Line) to use get_stac_collection() helper instead of direct pystac calls.
0.3.0 (2025-11-26)#
Added: gdptools.depreciation_utils.deprecate_kwargs helper so high-level classes retain backwards compatibility while emitting structured warnings for renamed parameters.
Changed: Standardized keyword names across the data-prep classes and documented the legacy aliases that now trigger deprecation warnings:
ClimRCatData: cat_dict → source_cat_dict, f_feature → target_gdf, id_feature → target_id, period → source_time_period.
UserCatData: ds → source_ds, proj_ds → source_crs, x_coord → source_x_coord, y_coord → source_y_coord, t_coord → source_t_coord, var → source_var, f_feature → target_gdf, proj_feature → target_crs, id_feature → target_id, period → source_time_period.
NHGFStacData: collection → source_collection, var → source_var, f_feature → target_gdf, id_feature → target_id, period → source_time_period.
UserTiffData: ds → source_ds, proj_ds → source_crs, x_coord → source_x_coord, y_coord → source_y_coord, t_coord → source_t_coord, var → source_var, f_feature → target_gdf, proj_feature → target_crs, id_feature → target_id, period → source_time_period.
Changed: Updated the nox -s lint session to execute pre-commit run --all-files, ensuring local lint checks match the enforced pre-commit workflow.
Documentation: Expanded helper and data-class documentation to call out the canonical keyword names, the new deprecation behavior, and the preferred CRS guidance for weight generation.
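A keyword-renaming helper of the kind described in this release might look roughly like the sketch below. The decorator rename_kwargs, the stand-in function make_data, and the choice of DeprecationWarning are all assumptions for illustration; the signature and behavior of gdptools' deprecate_kwargs are not shown in this changelog and may differ.

```python
import functools
import warnings


def rename_kwargs(aliases: dict):
    """Decorator factory mapping old keyword names to new ones with a warning."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for old, new in aliases.items():
                if old in kwargs:
                    if new in kwargs:
                        raise TypeError(f"Pass either {old!r} or {new!r}, not both")
                    warnings.warn(
                        f"{old!r} is deprecated; use {new!r} instead",
                        DeprecationWarning,
                        stacklevel=2,
                    )
                    # Forward the value under the canonical name.
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@rename_kwargs({"period": "source_time_period", "id_feature": "target_id"})
def make_data(*, source_time_period, target_id):
    """Hypothetical constructor stand-in that echoes its arguments."""
    return source_time_period, target_id
```

Calling make_data(period=[...], target_id="huc12") still works in this sketch but emits one warning that names the replacement keyword.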
0.2.20 (2024-XX-XX)#
Changed: Broadly relaxed dependencies
Changed: Bumped pydantic dependency to >= 2.0.0
0.2.18#
Added: NHGFStacData class to interface with the NHGF Stac Catalog.
Added: NHGFStacData example use cases to documentation.
Fixed: Bug in SerialAgg by loading subsetted data before aggregating, improving performance. This was the original behavior; it was inadvertently changed in a previous commit.
0.2.11 (2024-08-21)#
Changed: Updated categorical zonal stats to return the fraction of each category in each polygon.
Added: Precision parameter to zonal stats such that the number of significant digits in the output can be set.
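The categorical change in 0.2.11 can be pictured with a simplified sketch that counts pixel values per polygon and converts them to fractions. It ignores partial-pixel area weights, and the function name and sample land-cover codes are hypothetical.

```python
from collections import Counter


def category_fractions(cells):
    """Return {category: fraction of cells} for one polygon's pixel values."""
    counts = Counter(cells)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: n / total for cat, n in counts.items()}


# Hypothetical land-cover codes for the pixels inside one polygon.
pixels = [11, 11, 41, 41, 41, 41, 82, 82]
fractions = category_fractions(pixels)
```

For these eight pixels the result is 0.25 for category 11, 0.5 for category 41, and 0.25 for category 82.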
0.2.10 (2024-07-18)#
Added: New class NHGFStacData as an interface to the NHGF Stac Catalog (still in development).
0.2.9 (2024-04-06)#
Added: Ability to specify precision of output
0.2.8 (2024-04-03)#
Added: sum and masked_sum statistical methods.
0.2.5 (2023-11-1)#
Fixed: Bug in WeightGenP2P. Target polygons are now dissolved by the specified target_poly_idx. The generated weights file should have a unique set of source ids for each target id.
0.2.2 (2023-08-08)#
Fixed: Bug in output of AggGen. “parallel” and “dask” engines were not writing the feature_ids with the output.
0.0.1 (2022-03-22)#
Added: Original starting version.
Quick Version Info#
Current version:
Recent Highlights#
New Features#
Enhanced parallel processing capabilities
Improved STAC catalog integration
Better error handling and validation
Expanded statistical functions
Performance Improvements#
Optimized spatial indexing
Reduced memory footprint for large datasets
Faster coordinate transformations
Improved Dask integration
Documentation Enhancements#
Comprehensive API documentation
Interactive examples and tutorials
Better error message explanations
Enhanced getting started guide
Installation Matrix#
| Python Version | Status | Installation |
|---|---|---|
| 3.9 | ✅ Supported | |
| 3.10 | ✅ Supported | |
| 3.11 | ✅ Supported | |
| 3.12 | ✅ Supported | |
| 3.13 | ⚠️ Testing | |
Dependencies#
Core Dependencies#
geopandas >= 0.12.0
pandas >= 1.5.0
numpy >= 1.21.0
xarray >= 2022.6.0
shapely >= 2.0.0
pyproj >= 3.4.0
Optional Dependencies#
dask[distributed] for distributed computing
exactextract for high-performance zonal statistics
bokeh for interactive visualizations
holoviews for advanced plotting
Platform Support#
| Platform | Status | Notes |
|---|---|---|
| Linux | ✅ Full Support | Recommended for production |
| macOS | ✅ Full Support | Intel and Apple Silicon |
| Windows | ⚠️ Limited | Some features may require WSL |
Getting Help#
Documentation: You’re reading it! 📚
Examples: Check out our tutorials
Issues: Report bugs
Discussions: Ask questions
Contributing#
We welcome contributions! See our contributing guide for:
How to set up development environment
Code style guidelines
Testing requirements
Documentation standards