Changelog#
All notable changes to gdptools are documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Current Version: 0.3.13#
0.3.13 (2026-04-17)#
Fixed: Stricter and more consistent source_time_period validation across all UserData subclasses. Reversed ranges (start > end) and periods that select no time steps (out of range, between labels for monthly or 8-day data, or inside a sub-step window) now raise ValueError with a diagnostic naming the dataset’s actual time range and the nearest available timestamps. Previously these produced silent empty slices that surfaced as confusing IndexErrors during aggregation (#98).
Fixed: NHGFStacZarrData, NHGFStacTiffData, and _NHGFStacBase._select_item now route source_time_period through the shared _process_period validator. Malformed date strings are rejected at construction time instead of surfacing as opaque xarray errors later in .sel().
Fixed: The “source data time coordinate is formatted as X, you specified Y” format-mismatch hint now applies to all three prep paths (prep_wght_data, prep_agg_data, prep_interp_data), not only prep_wght_data.
Fixed: build_cf_dataset now propagates the source time coord’s attrs and encoding onto the returned Dataset. Previously CF metadata (standard_name, axis, bounds) and netCDF encoding (units, calendar) were silently dropped, so output time axes were not CF-recognizable and non-standard calendars (noleap, 360_day) could be corrupted on NetCDF write (#99).
Changed: Aggregation engines’ empty-time-slice log message no longer claims the source_time_period argument was specified incorrectly. The new message, “Empty time slice after subsetting”, describes the real cause.
Internal: New helpers _assert_period_intersects_time_axis and _sel_with_time_diagnostic in gdptools.utils. _process_period now rejects reversed ranges.
Docs: UserData subclass docstrings now document xarray’s inclusive-both-ends slice semantics and the monthly / 8-day / sub-daily gotchas for source_time_period.
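The validation rules listed for source_time_period in this release can be illustrated with a standard-library sketch. This is not the gdptools implementation: check_period and its error text are hypothetical, and the time axis is a plain sorted list of datetimes rather than an xarray coordinate.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


def check_period(time_axis: list[datetime], start: str, end: str) -> list[datetime]:
    """Return the time steps in [start, end], inclusive at both ends.

    Hypothetical helper: raises ValueError for malformed dates, reversed
    ranges, and periods that select no time steps.
    """
    # datetime.fromisoformat raises ValueError on malformed date strings.
    t0, t1 = datetime.fromisoformat(start), datetime.fromisoformat(end)
    if t0 > t1:
        raise ValueError(f"Reversed period: start {start} is after end {end}")
    lo = bisect_left(time_axis, t0)   # first index with time >= start
    hi = bisect_right(time_axis, t1)  # one past the last index with time <= end
    if lo >= hi:
        raise ValueError(
            f"Period {start}..{end} selects no time steps; "
            f"data covers {time_axis[0].date()} to {time_axis[-1].date()}"
        )
    return time_axis[lo:hi]


# Monthly data labelled on the first day of each month of 2020.
monthly = [datetime(2020, m, 1) for m in range(1, 13)]
selected = check_period(monthly, "2020-03-01", "2020-05-01")  # March, April, May
```

With this sketch, a period such as 2020-03-10 to 2020-03-20 falls between two monthly labels and raises ValueError instead of returning an empty selection.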
0.3.12 (2026-04-17)#
Security: Upgraded 4 locked dependencies to resolve 13 CVEs: aiohttp (10 CVEs), pygments (CVE-2026-4539), pytest (CVE-2025-71176), and requests (CVE-2026-25645).
Fixed: UserCatData and NHGFStacZarrData no longer mutate the caller’s source dataset when rotating 0–360° longitude into -180…180°. Constructing multiple instances from a single shared xr.Dataset now works; previously the second instance failed with KeyError on a non-monotonic index (#97).
Fixed: AggGen.calculate_agg() now returns a CF-1.8-compliant xr.Dataset matching the structure written by NetCDFWriter: adds the crs scalar variable referenced by grid_mapping, along with lat/lon centroid coordinates in EPSG:4326 (#97).
Internal: Extracted a shared build_cf_dataset() helper (new module gdptools.agg._cf_dataset) so AggGen._gen_xarray_return and NetCDFWriter.create_out_file cannot drift.
Internal: Extracted a _rotate_longitude_if_needed helper in gdptools.data.user_data, removing the duplicated rotation block between UserCatData.__init__ and NHGFStacZarrData.__init__.
Internal: NetCDFWriter no longer leaks a process-wide UserWarning filter via warnings.filterwarnings(...). The centroid-computation suppression is now scoped to a warnings.catch_warnings() block inside build_cf_dataset, so other code running after a NetCDF save sees unchanged warning behavior.
Internal: Removed stale hvplot entry from dev group in uv.lock; hvplot is a docs-only dependency and was never listed in pyproject.toml dev group.
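The difference between a process-wide warning filter and a scoped one can be shown with the standard library alone. The functions below are hypothetical stand-ins, not gdptools code; only warnings.catch_warnings() and warnings.simplefilter() are real APIs.

```python
import warnings


def noisy_centroid() -> float:
    """Stand-in for a computation that emits a UserWarning (hypothetical)."""
    warnings.warn("centroid computed in a geographic CRS", UserWarning)
    return 0.0


def scoped_call() -> float:
    # The filter change is undone when the with-block exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=UserWarning)
        return noisy_centroid()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scoped_call()     # suppressed inside the helper
    noisy_centroid()  # still reported afterwards
n_caught = len(caught)
```

Only the second call is recorded, which is the behavior the scoped suppression is meant to preserve for code that runs after a save.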
0.3.11 (2026-02-27)#
Deprecated: serial, parallel, and dask zonal engines now emit FutureWarning. Use zonal_engine="exactextract" instead, which provides better performance and handles large datasets with bounded memory.
Changed: Moved dask.distributed from top-level import to lazy import inside ZonalEngineDask.weighted_zonal_stats(), so the distributed package is no longer required at import time.
Removed: safety dev dependency (redundant with pip-audit; its transitive nltk dependency had CVE-2025-14009).
Removed: Unused .safety-policy.yml configuration file.
Documentation: Updated zonal statistics and logging demo notebooks with deprecation notices and revised engine comparison table.
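A deprecation of this kind is typically signalled as in the sketch below. The resolve_engine function and its message text are hypothetical; only the engine names and the FutureWarning category come from the entry above.

```python
import warnings

DEPRECATED_ENGINES = {"serial", "parallel", "dask"}


def resolve_engine(zonal_engine: str) -> str:
    """Hypothetical dispatcher: warn when a deprecated engine is requested."""
    if zonal_engine in DEPRECATED_ENGINES:
        warnings.warn(
            f"zonal_engine={zonal_engine!r} is deprecated; "
            'use zonal_engine="exactextract" instead.',
            FutureWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
    return zonal_engine
```

Callers who need time to migrate can silence the message with warnings.filterwarnings("ignore", category=FutureWarning) in their own code.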
0.3.10 (2026-02-21)#
Fixed: Read coordinate names (x, y, time) from STAC cube:dimensions metadata instead of hardcoding, enabling support for collections with non-standard dimension names.
Fixed: Longitude rotation in utils._get_cat_data_from_url used hardcoded "lon" instead of the actual coordinate name variable.
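The rotation from 0..360 to -180..180 degrees, keyed on a caller-supplied coordinate name, can be sketched without xarray. rotate_longitude is a hypothetical helper operating on plain lists; a real dataset would also need its data values reordered along the same axis.

```python
def rotate_longitude(coords: dict[str, list[float]], x_name: str) -> dict[str, list[float]]:
    """Return a copy of coords with x_name mapped from 0..360 to -180..180.

    Hypothetical helper: the coordinate name is passed in rather than
    hardcoded as "lon", and the caller's mapping is left untouched.
    """
    lon = coords[x_name]
    if max(lon) <= 180:
        return dict(coords)  # already in -180..180; shallow copy only
    # Sort after wrapping so the coordinate stays monotonic.
    rotated = sorted(((v + 180) % 360) - 180 for v in lon)
    return {**coords, x_name: rotated}


source = {"longitude": [0.0, 90.0, 180.0, 270.0], "lat": [10.0, 20.0]}
result = rotate_longitude(source, "longitude")
```

Here result["longitude"] is [-180.0, -90.0, 0.0, 90.0], while source is unchanged.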
0.3.9 (2026-02-21)#
Added: Structured logging throughout all source modules, replacing ~86 print() calls with logger.<level>().
Added: Logging configuration guide in Getting Started docs and logging_demo.ipynb example notebook.
Fixed: Format spec {:0.04} corrected to {:0.4f} in zonal_engines.
Fixed: Removed leftover debug print() calls.
Fixed: Typo “anitmeridian” → “antimeridian” in utils.
CI: Added cross-platform conda-forge dependency validation job.
CI: Updated lockfile for yanked virtualenv 20.37.0 → 20.38.0.
CI: Fixed pip-audit to use --skip-editable instead of --strict (editable local installs aren’t on PyPI).
Security: Upgraded bokeh (CVE-2026-21883), pillow (CVE-2026-25990), and nbconvert (CVE-2025-53000).
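The two format specs named in the fix above behave differently because the first has no presentation type. A short illustration with arbitrary sample values:

```python
value = 3.14159265

# No presentation type: precision counts significant digits (general format).
old_style = "{:0.04}".format(value)
# "f" presentation type: precision counts digits after the decimal point.
new_style = "{:0.4f}".format(value)

# For larger magnitudes the two specs diverge further.
big_old = "{:0.04}".format(1234.5678)
big_new = "{:0.4f}".format(1234.5678)
```

For the first value, old_style is "3.142" and new_style is "3.1416".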
0.3.8 (2026-02-18)#
Fixed: Pydantic ValidationError in CatClimRItem when asset or scenario fields contain float('nan') from the ClimateR catalog parquet.
Changed: Switched test fixtures and expected values from TerraClimate to gridMET (tmmx/tmmn/pr) after upstream TerraClimate v1.1 reprocessing broke expected values.
Changed: Replaced TerraClimate example notebooks with gridMET equivalents: gridmet_drb_polygon.ipynb (was terraclime_et.ipynb) and gridmet_grid_to_line.ipynb (was Terraclimate-Grid-to-Line.ipynb).
Documentation: Expanded NHGFStacData documentation to describe the factory + subclass hierarchy (NHGFStacZarrData, NHGFStacTiffData) with autoclass directives and usage examples for both Zarr and GeoTIFF collections.
Documentation: Added NLCD notebook to table of contents and examples table in Getting Started guide.
Documentation: Added all user data classes to the API Reference autosummary.
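Missing string values read from a parquet file often arrive as float NaN, which a string-typed field will reject. The changelog does not say how CatClimRItem was changed; the sketch below shows one common guard, with a hypothetical nan_to_none helper and made-up row values.

```python
import math
from typing import Any, Optional


def nan_to_none(value: Any) -> Optional[Any]:
    """Map float NaN to None and pass every other value through."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


# Hypothetical catalog row with a missing "asset" entry.
row = {"asset": float("nan"), "scenario": "rcp45", "variable": "tmmx"}
clean = {key: nan_to_none(val) for key, val in row.items()}
```

After cleaning, clean["asset"] is None, which an optional string field can accept.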
0.3.7 (2026-02-16)#
Added: GeoTIFF-backed STAC collection support via new NHGFStacTiffData class. NHGFStacData now auto-detects collection format (Zarr or GeoTIFF) and returns the appropriate handler transparently.
Added: NHGFStacData_NLCD.ipynb example notebook demonstrating categorical zonal statistics of NLCD land cover classes for HUC12 basins.
Added: _stac_href_to_url() helper for robust S3/HTTPS/bare-path URL construction from STAC asset hrefs.
Added: Fast-path direct URL lookup in get_stac_collection() to avoid slow recursive STAC catalog traversal and API rate limiting.
Added: Per-link error handling in get_stac_collection() to skip broken STAC child links instead of aborting the entire traversal.
Performance: Remote GeoTIFFs are opened with rasterio windowed reading, fetching only the tiles covering the target polygons via HTTP range requests instead of loading the full raster into memory.
Fixed: CI shell redirection bug where python>=${PYTHON_REQ} was parsed as file redirection in .gitlab-ci.yml.
Fixed: Timezone handling in STAC item selection now normalizes both sides to UTC instead of stripping timezone info.
Fixed: Updated test_interp_gen_with_climater expected values after TerraClimate v1.1 upstream reprocessing (ERA5, 2026-02-06).
Security: Upgraded 8 locked dependencies to resolve 15 CVEs (aiohttp, cryptography, distributed, filelock, pip, urllib3, virtualenv, wheel).
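Why normalizing to UTC matters, compared with stripping timezone info, can be seen in a small standard-library example. The to_utc helper is hypothetical, and treating naive datetimes as UTC is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone


def to_utc(dt: datetime) -> datetime:
    """Return dt as an aware UTC datetime.

    Assumption for this sketch: naive datetimes are already in UTC.
    """
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


item_time = datetime(2020, 1, 1, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
query_end = datetime(2020, 1, 1, 3, 0)  # naive, assumed UTC

# Stripping tzinfo compares wall-clock values: 00:00 <= 03:00 looks in range.
stripped_in_range = item_time.replace(tzinfo=None) <= query_end
# Normalizing to UTC compares instants: 05:00 UTC is after 03:00 UTC.
utc_in_range = to_utc(item_time) <= to_utc(query_end)
```

The stripped comparison wrongly reports the item as in range, while the UTC comparison does not.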
0.3.6 (2026-01-02)#
Added: GitLab CI job conda-forge-validate to test dependency resolution against conda-forge with strict channel priority, catching compatibility issues before feedstock submission.
Changed: Bumped minimum Python version from 3.10 to 3.11 due to exactextract>=0.3.0 dependency chain requiring pyproj>=3.7.2, which only has conda builds for Python 3.11+.
Changed: Updated pyproj minimum version from 3.3.0 to 3.7.2 to match exactextract requirements.
Changed: Removed upper bounds on dependencies for better conda-forge compatibility:
xarray>=2024.7.0 (removed <2025.0.0)
rasterio>=1.2.9 (removed <1.5.0)
rioxarray>=0.15 (removed <0.21)
pystac>=1.10 (removed <2.0)
statsmodels>=0.14 (removed <1.0)
fastparquet>=2024.2 (removed <=2024.5)
pyarrow>=10.0.0 (removed <18.0.0)
Changed: Migrated development setup from Poetry to UV exclusively - removed conda dependency from development workflow.
Fixed: Conda-forge packaging failures caused by strict channel priority and upper bound constraints.
Documentation: Updated README.md and environment-examples.yml to reflect Python 3.11+ requirement.
0.3.5 (2026-01-01)#
Added: Spatial partitioning for parallel weight generation engine using Hilbert curves to improve cache locality and reduce memory fragmentation.
Added: Test coverage for calculate_weights(intersections=True) in polygon-to-polygon weight generation.
Added: @pytest.mark.slow markers for network-dependent tests (STAC catalog, notebooks).
Added: tests-full nox session to run all tests including slow ones; default tests session now skips slow tests (~2.5 min vs ~7 min).
Performance: Optimized area-weighted statistics methods in stats_methods.py:
MAWeightedMean and MAWeightedStd: 2-3x speedup by vectorizing masked array operations.
MASum, MAMin, MAMax: 2.5-11x speedup by replacing np.ma.masked_array with np.nansum, np.nanmin, np.nanmax.
SerialAgg.calc_agg: Reduced overhead by extracting single time slice before iterating polygon-by-polygon.
Fixed: source_poly_idx string indexing bug in WeightGenP2P where source_poly_idx[0] on a string returned first character instead of column name.
Removed: test_coverage_summary.py (zero functional value).
Removed: Redundant date format tests in test_serial.py (already covered in test_weight_agg_gen.py).
Changed: Renamed test functions in test_dask.py from test_parallel_* to test_dask_* for clarity.
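The source_poly_idx bug in this release is a general Python pitfall: indexing a string with [0] returns its first character. A minimal sketch of the pitfall and one way to guard against it follows; first_column and the column names are hypothetical, not the WeightGenP2P code.

```python
from typing import Sequence, Union


def first_column(idx: Union[str, Sequence[str]]) -> str:
    """Return the first column name from a name or a sequence of names."""
    if isinstance(idx, str):
        return idx  # a bare string is already one column name
    return idx[0]


buggy = "hru_id"[0]  # indexing a string yields a single character
fixed_from_str = first_column("hru_id")
fixed_from_list = first_column(["hru_id", "region"])
```

Here buggy is "h", while both guarded calls return "hru_id".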
0.3.4 (2025-12-31)#
Added: Vectorized _get_cells_poly_fast function in utils_optimized.py providing 100-280x speedup for regular projected grids with 1D coordinates.
Added: Memory-safe chunked mode (mode="chunked") for processing very large grids without exhausting memory.
Added: estimate_memory_gb() helper to predict memory requirements before processing.
Added: Benchmark script scripts/test_cells_poly_optimization.py for performance testing.
Fixed: Renovate configuration error by removing unsupported uv manager from matchManagers.
0.3.3 (2025-12-30)#
Added: exactextract zonal statistics engine for high-performance raster-to-polygon statistics using the exactextract library.
Added: Support for categorical data in exactextract engine with fraction-based output matching serial/parallel/dask engines.
Added: Exactextract examples in the zonal statistics notebook (docs/Examples/Rasters/zonal_stats.ipynb).
Added: Tests verifying exactextract output format consistency with existing engines.
Changed: Zonal statistics engines now output consistent column names across all engines:
Continuous: count, mean, std, min, 25%, 50%, 75%, max, sum
Categorical: integer category columns + count
0.3.2 (2025-12-30)#
Changed: Migrated from conda/poetry to uv for dependency management.
Performance: Optimized spatial intersection calculations in calc_weight_engines.py using contained/boundary partitioning with shapely.within() to skip expensive intersection() calls for fully contained polygons.
Performance: Optimized pixel weight calculation in zonal_engines.py with same contained/boundary partitioning strategy.
Fixed: Consistent n_jobs=-1 default behavior across all engine functions (Parallel and Dask now both default to cpu_count()/2).
Fixed: Coverage session failure in CI due to missing Cython source files.
Fixed: CVE-2025-53000 in pip-audit (ignored until nbconvert fix is released).
Fixed: rioxarray dependency updated to >=0.15,<0.21.
CI: Temporarily disabled mypy and lint jobs pending uv migration stabilization.
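The n_jobs=-1 default described in this release amounts to "use half the available CPUs". One way to express it is sketched below; resolve_n_jobs is a hypothetical name, and the floor of one worker is an assumption of this sketch.

```python
import os


def resolve_n_jobs(n_jobs: int = -1) -> int:
    """Hypothetical helper: map n_jobs=-1 to half the available CPUs."""
    if n_jobs == -1:
        cpus = os.cpu_count() or 1  # cpu_count() can return None
        return max(1, cpus // 2)    # never fewer than one worker
    return n_jobs
```

An explicit value such as resolve_n_jobs(4) is returned unchanged.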
0.3.1 (2025-12-18)#
Added: get_stac_collection() helper function to fetch collections from the NHGF STAC catalog with recursive search.
Added: STACCatalogError exception class for STAC catalog access errors.
Fixed: urllib3 CVE-2025-66418 and CVE-2025-66471 by pinning urllib3>=2.6.0.
Fixed: STAC-dependent tests resilient to rate limiting and catalog unavailability using xfail.
Changed: Modernized .gitlab-ci.yml: consolidated to single stage, added dependency-aware caching, removed redundant apt-get upgrade.
Changed: Updated nox -s coverage to generate coverage.xml for GitLab CI artifacts.
Changed: Removed explicit pyogrio from environment files (transitive dependency via geopandas).
Changed: Switched from mamba to conda in CI (libmamba is now the default solver in miniforge3).
Documentation: Updated NHGF STAC example notebooks (NHGFStacData_CONUS404, NHGFStacData_Grid_to_Line) to use get_stac_collection() helper instead of direct pystac calls.
0.3.0 (2025-11-26)#
Added: gdptools.depreciation_utils.deprecate_kwargs helper so high-level classes retain backwards compatibility while emitting structured warnings for renamed parameters.
Changed: Standardized keyword names across the data-prep classes and documented the legacy aliases that now trigger deprecation warnings:
ClimRCatData: cat_dict → source_cat_dict, f_feature → target_gdf, id_feature → target_id, period → source_time_period.
UserCatData: ds → source_ds, proj_ds → source_crs, x_coord → source_x_coord, y_coord → source_y_coord, t_coord → source_t_coord, var → source_var, f_feature → target_gdf, proj_feature → target_crs, id_feature → target_id, period → source_time_period.
NHGFStacData: collection → source_collection, var → source_var, f_feature → target_gdf, id_feature → target_id, period → source_time_period.
UserTiffData: ds → source_ds, proj_ds → source_crs, x_coord → source_x_coord, y_coord → source_y_coord, t_coord → source_t_coord, var → source_var, f_feature → target_gdf, proj_feature → target_crs, id_feature → target_id, period → source_time_period.
Changed: Updated the nox -s lint session to execute pre-commit run --all-files, ensuring local lint checks match the enforced pre-commit workflow.
Documentation: Expanded helper and data-class documentation to call out the canonical keyword names, the new deprecation behavior, and the preferred CRS guidance for weight generation.
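A keyword-renaming shim of the kind described for this release can be written as a decorator. The sketch below is not the gdptools deprecate_kwargs helper, whose signature is not shown in this changelog: rename_kwargs, the prepare function, and the choice of DeprecationWarning are all assumptions.

```python
import functools
import warnings
from typing import Any, Callable


def rename_kwargs(aliases: dict[str, str]) -> Callable:
    """Hypothetical decorator: accept legacy keyword names with a warning."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for old, new in aliases.items():
                if old in kwargs:
                    if new in kwargs:
                        raise TypeError(f"got both {old!r} and {new!r}")
                    warnings.warn(
                        f"{old!r} is deprecated; use {new!r} instead",
                        DeprecationWarning,
                        stacklevel=2,
                    )
                    # Forward the value under the canonical name.
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rename_kwargs({"id_feature": "target_id", "period": "source_time_period"})
def prepare(target_id: str, source_time_period: list[str]) -> tuple:
    return target_id, source_time_period
```

Calling prepare(id_feature="huc12", period=["2020-01-01", "2020-12-31"]) then works as before but emits one warning per legacy name.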
0.2.20 (2024-XX-XX)#
Changed: Broadly relaxed dependencies
Changed: Bumped pydantic dependency to >= 2.0.0
0.2.18#
Added: NHGFStacData class to interface with the NHGF Stac Catalog.
Added: NHGFStacData example use cases to documentation.
Fixed: Bug in SerialAgg by loading subsetted data before aggregating, improving performance. This was the original behavior; it was inadvertently changed in a previous commit.
0.2.11 (2024-08-21)#
Changed: Updated categorical zonal stats to return the fraction of each category in each polygon.
Added: Precision parameter to zonal stats such that the number of significant digits in the output can be set.
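The fraction-per-category output can be illustrated with a small unweighted example. This sketch is not the gdptools implementation: category_fractions is hypothetical, it ignores partial pixel coverage, and it treats precision as a number of decimal places, which may differ from the library's definition.

```python
from collections import Counter


def category_fractions(pixels: list[int], precision: int = 4) -> dict[int, float]:
    """Fraction of pixels in each category, rounded to `precision` decimals."""
    counts = Counter(pixels)
    total = len(pixels)
    return {cat: round(n / total, precision) for cat, n in sorted(counts.items())}


# Hypothetical land-cover codes for the pixels inside one polygon.
fractions = category_fractions([11, 11, 41, 41, 41, 82, 82, 82])
```

For these eight pixels the result is {11: 0.25, 41: 0.375, 82: 0.375}.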
0.2.10 (2024-07-18)#
Added: New class NHGFStacData as an interface to the NHGF Stac Catalog (still in development).
0.2.9 (2024-04-06)#
Added: Ability to specify precision of output
0.2.8 (2024-04-03)#
Added: sum and masked_sum statistical methods.
0.2.5 (2023-11-01)#
Fixed: Bug in WeightGenP2P. Target polygons are now dissolved by the specified target_poly_idx. The generated weights file should have a unique set of source ids for each target id.
0.2.2 (2023-08-08)#
Fixed: Bug in output of AggGen. “parallel” and “dask” engines were not writing the feature_ids with the output.
0.0.1 (2022-03-22)#
Added: Original starting version.
Recent Highlights#
New Features#
Enhanced parallel processing capabilities
Improved STAC catalog integration
Better error handling and validation
Expanded statistical functions
Performance Improvements#
Optimized spatial indexing
Reduced memory footprint for large datasets
Faster coordinate transformations
Improved Dask integration
Documentation Enhancements#
Comprehensive API documentation
Interactive examples and tutorials
Better error message explanations
Enhanced getting started guide
Installation Matrix#
| Python Version | Status | Installation |
|---|---|---|
| 3.9 | ✅ Supported | |
| 3.10 | ✅ Supported | |
| 3.11 | ✅ Supported | |
| 3.12 | ✅ Supported | |
| 3.13 | ⚠️ Testing | |
Dependencies#
Core Dependencies#
geopandas >= 0.12.0
pandas >= 1.5.0
numpy >= 1.21.0
xarray >= 2022.6.0
shapely >= 2.0.0
pyproj >= 3.4.0
Optional Dependencies#
dask[distributed] for distributed computing
exactextract for high-performance zonal statistics
bokeh for interactive visualizations
holoviews for advanced plotting
Platform Support#
| Platform | Status | Notes |
|---|---|---|
| Linux | ✅ Full Support | Recommended for production |
| macOS | ✅ Full Support | Intel and Apple Silicon |
| Windows | ⚠️ Limited | Some features may require WSL |
Getting Help#
Documentation: You’re reading it! 📚
Examples: Check out our tutorials
Issues: Report bugs
Discussions: Ask questions
Contributing#
We welcome contributions! See our contributing guide for:
How to set up development environment
Code style guidelines
Testing requirements
Documentation standards