# Getting Started

This guide covers installing `gdptools`, its core concepts, logging configuration, and the example notebooks.

```{contents}
:local:
:depth: 2
```

(installation)=

## Installation

### Quick Installation

The easiest way to install `gdptools` is via conda or pip:

```bash
# Via conda (recommended)
conda install -c conda-forge gdptools

# Via pip
pip install gdptools
```
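
To confirm which version ended up in the active environment, a small check with the standard library's `importlib.metadata` is usually enough. The helper name `installed_version` is only illustrative and is not part of `gdptools`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("gdptools")
print(f"gdptools {found}" if found else "gdptools is not installed in this environment")
```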

### Development Installation

For development or to get the latest features:

```bash
git clone https://code.usgs.gov/wma/nhgf/toolsteam/gdptools.git
cd gdptools
conda env create -f environment.yml
conda activate gdptools
poetry install
pre-commit install --install-hooks
```

### Offline CRS configuration

`pyproj` downloads grid-shift files the first time you reproject to certain CRSs. If you work behind a
firewall or on an air-gapped network, provide those grids locally and disable network fetches:

1. Install the grid bundle on a machine with internet access. The most reliable option is the
   `proj-data` package from conda-forge:

   ```bash
   mamba install -c conda-forge proj-data
   ```

   Copy the resulting `share/proj` directory to the offline machine (for example `/opt/proj/share/proj`).

2. Point PROJ at that directory and disable remote downloads before running `gdptools`:

   ```bash
   export PROJ_NETWORK=OFF
   export PROJ_DATA=/opt/proj/share/proj  # or PROJ_LIB for older PROJ builds
   export PROJ_CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt  # optional, fixes TLS interception
   ```

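   On Windows, `setx PROJ_NETWORK OFF` and `setx PROJ_DATA C:\proj\share\proj` persist the settings for new sessions only. To set them in the current PowerShell session, use `$env:PROJ_NETWORK = "OFF"` and `$env:PROJ_DATA = "C:\proj\share\proj"`.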

3. Verify your configuration:

   ```bash
   python - <<'PY'
   from pyproj import datadir, network

   print("PROJ data dir:", datadir.get_data_dir())
   print("Network enabled:", network.is_network_enabled())
   PY
   ```

When these variables are set, `gdptools` CRS helpers raise a clear error that points users to
these steps instead of waiting on blocked downloads.
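
In notebooks or scripts where exporting shell variables is inconvenient, the same settings can be applied from Python. The following is a minimal sketch using only the standard library; `configure_offline_proj` is a hypothetical helper, not a `gdptools` function, and it assumes it runs before `pyproj` or `gdptools` is imported so the values are in place when PROJ initializes.

```python
import os
from pathlib import Path


def configure_offline_proj(data_dir: str) -> dict:
    """Set the PROJ environment variables for offline use and return them.

    Hypothetical helper: call it before importing pyproj or gdptools.
    """
    path = Path(data_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"PROJ data directory not found: {path}")
    settings = {
        "PROJ_NETWORK": "OFF",   # disable remote grid downloads
        "PROJ_DATA": str(path),  # read by newer PROJ builds
        "PROJ_LIB": str(path),   # read by older PROJ builds
    }
    os.environ.update(settings)
    return settings
```

For the layout described above, the call would be `configure_offline_proj("/opt/proj/share/proj")`. The check for an existing directory fails early, which is easier to diagnose than a later reprojection error.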

## Core Concepts

### Spatial Weight Calculation

`gdptools` calculates **area-weighted intersections** between:

- **Gridded datasets** (NetCDF, Zarr) and **polygon geometries**
- **Two sets of polygon geometries** (watershed-to-county, etc.)

### Processing Workflow

1. **Data Input**: Load your gridded data and target geometries
2. **Weight Generation**: Calculate spatial intersection weights
3. **Aggregation**: Apply statistical operations using the weights
4. **Output**: Export results in multiple formats
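
The four steps can be illustrated with a toy example in plain Python. This is a conceptual sketch, not the `gdptools` implementation: grid cells and the target polygon are axis-aligned rectangles given as `(xmin, ymin, xmax, ymax)`, and all function names are made up for illustration.

```python
def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def area_weights(cells, polygon):
    """Weight generation: each cell's share of the polygon's covered area."""
    areas = {cell_id: overlap_area(box, polygon) for cell_id, box in cells.items()}
    total = sum(areas.values())
    if total == 0:
        return {}  # the polygon does not touch the grid
    return {cell_id: area / total for cell_id, area in areas.items() if area > 0}


def weighted_mean(values, weights):
    """Aggregation: area-weighted mean (the weights already sum to 1)."""
    return sum(values[cell_id] * w for cell_id, w in weights.items())


# 1. Data input: a 2 x 2 grid of unit cells and one target polygon.
cells = {
    "c00": (0, 0, 1, 1), "c10": (1, 0, 2, 1),
    "c01": (0, 1, 1, 2), "c11": (1, 1, 2, 2),
}
values = {"c00": 10.0, "c10": 20.0, "c01": 30.0, "c11": 40.0}
polygon = (0.5, 0.5, 1.5, 1.5)

# 2. Weight generation: the polygon covers a quarter of each cell.
weights = area_weights(cells, polygon)

# 3. Aggregation and 4. output.
print(weights)                         # each of the four cells gets weight 0.25
print(weighted_mean(values, weights))  # 25.0
```

Because the weights depend only on geometry, they can be computed once and reused for every time step or variable on the same grid.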

## Configuring Logging

`gdptools` uses Python's standard `logging` module. By default it emits **no log output** because the library registers a `NullHandler` — this follows the [recommended practice for libraries](https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library).

### Enabling log output

To see log messages, configure logging in your application or notebook:

```python
import logging

logging.basicConfig(level=logging.INFO)
```

This enables INFO-level output from all libraries. To control `gdptools` independently from other packages, set its logger directly:

```python
import logging

# Show detailed gdptools output while silencing other libraries
logging.basicConfig(level=logging.WARNING)
logging.getLogger("gdptools").setLevel(logging.DEBUG)
```
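
If `gdptools` messages need their own format or destination, one option is to attach a dedicated handler to the `gdptools` logger. This sketch uses only the standard `logging` API; the format string is an arbitrary choice.

```python
import logging
import sys

# Attach a dedicated handler to the "gdptools" logger so its records are
# formatted and routed independently of the root logger.
gdp_logger = logging.getLogger("gdptools")
gdp_logger.setLevel(logging.DEBUG)

handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")
)
gdp_logger.addHandler(handler)

# Stop records from also reaching root handlers, which would print them twice
# if the application has called logging.basicConfig().
gdp_logger.propagate = False
```

Swapping `logging.StreamHandler` for `logging.FileHandler("gdptools.log")` would send the same records to a file instead.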

### Log levels

`gdptools` emits messages at the following levels:

| Level     | What you'll see                                                             |
| :-------- | :-------------------------------------------------------------------------- |
| `DEBUG`   | Internal state, geometry validation details, intersection data              |
| `INFO`    | Workflow milestones, timing summaries, data dimensions, weight-gen progress |
| `WARNING` | Recoverable issues (e.g., antimeridian wrapping, CRS validation fallbacks)  |
| `ERROR`   | Failures that precede an exception being raised                             |

See the [logging demo notebook](Examples/Rasters/logging_demo.md) for a hands-on walkthrough.
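
The effect of a level threshold can be tried without running a real workflow. The sketch below uses stand-in logger names (`gdptools_level_demo` and a hypothetical child `gdptools_level_demo.weights`) so it does not touch the real `gdptools` logger, and the messages are invented to mirror the table above.

```python
import logging


class ListHandler(logging.Handler):
    """Collect records in a list so the filtering is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def emitted_at(threshold):
    """Return the messages that pass when the parent logger uses `threshold`."""
    parent = logging.getLogger("gdptools_level_demo")         # stand-in name
    child = logging.getLogger("gdptools_level_demo.weights")  # hypothetical module logger
    handler = ListHandler()
    parent.addHandler(handler)
    parent.setLevel(threshold)
    parent.propagate = False
    try:
        # The child has no level of its own, so it inherits the parent's.
        child.debug("intersection details")
        child.info("weight generation finished")
        child.warning("CRS validation fallback")
        child.error("aggregation failed")
    finally:
        parent.removeHandler(handler)
    return handler.messages


print(emitted_at(logging.WARNING))  # only the WARNING and ERROR messages
print(emitted_at(logging.DEBUG))    # all four messages
```

Setting the level on a parent logger applies to every child logger that has no level of its own, which is why a single `setLevel` call on `"gdptools"` is enough to control the whole package.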

(examples)=

## Examples

The following tables summarize the example notebooks available in the documentation.

### ClimateR-Catalog Examples

These tutorials demonstrate how to use `ClimRCatData` to access and process climate data from the ClimateR-Catalog. See the [ClimateR-Catalog documentation](catalog_datasets.md) for a table of some of the common datasets available in the catalog.

| Example Notebook            | Description                                                                   | Link                                                                      |
| :-------------------------- | :---------------------------------------------------------------------------- | :------------------------------------------------------------------------ |
| GridMET Grid-to-Polygon     | Aggregates daily GridMET climate data to HUC12 polygons.                      | [View Notebook](Examples/ClimateR-Catalog/gridmet.md)                     |
| GridMET DRB Grid-to-Polygon | Aggregates daily gridMET climate data to Delaware River Basin HUC12 polygons. | [View Notebook](Examples/ClimateR-Catalog/gridmet_drb_polygon.md)         |
| 3DEP Grid-to-Line           | Interpolates 3DEP elevation data from a GeoTIFF along NHD stream segments.    | [View Notebook](Examples/ClimateR-Catalog/3DEP-Elevation-grid-to-line.md) |
| GridMET Grid-to-Line        | Interpolates daily gridMET variables along stream flowlines.                  | [View Notebook](Examples/ClimateR-Catalog/gridmet_grid_to_line.md)        |

### NHGF STAC Examples

These tutorials demonstrate how to use `NHGFStacData` to access and process climate data from the USGS NHGF STAC Catalog. See the [NHGF STAC Catalog](nhgf_stac_datasets.md) page for a table of some of the common datasets available in the catalog.

| Example Notebook                    | Description                                                               | Link                                                             |
| :---------------------------------- | :------------------------------------------------------------------------ | :--------------------------------------------------------------- |
| CONUS404 Daily Data Grid-to-Polygon | Aggregates daily CONUS404 data to HUC12 polygons using `NHGFStacData`.    | [View Notebook](Examples/NHGF-STAC/NHGFStacData_CONUS404.md)     |
| CONUS404 Daily Data Grid-to-Line    | Interpolates gridded data along lines using `NHGFStacData`.               | [View Notebook](Examples/NHGF-STAC/NHGFStacData_Grid_to_Line.md) |
| NLCD Land Cover Zonal Statistics    | GeoTIFF-backed STAC collection for categorical land cover classification. | [View Notebook](Examples/NHGF-STAC/NHGFStacData_NLCD.md)         |

### Non-Catalog Examples

For custom datasets not in the ClimateR-Catalog or NHGF STAC, you can use `UserCatData` to access data from OPeNDAP endpoints or other sources.

| Example Notebook           | Description                                                                                                                                                     | Link                                                          |
| :------------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------ |
| GridMET Non-Catalog        | Demonstrates using `UserCatData` with a non-catalog OPeNDAP endpoint for GridMET data.                                                                          | [View Notebook](Examples/Non-Catalog/Gridmet_non_catalog.md)  |
| CONUS404 Daily Non-Catalog | Demonstrates using `UserCatData` as an alternative to `NHGFStacData` for NHGF STAC data. Can be used as a template for reading in data for other STAC catalogs. | [View Notebook](Examples/Non-Catalog/UserCatData_CONUS404.md) |

### Polygon-to-Polygon Examples

For workflows involving two sets of polygons, such as watershed-to-county or county-to-state, use the `WeightGenP2P` class to calculate intersection weights. The area-weighted aggregation can then be performed as demonstrated in the second example, which covers extensive and intensive variables.

| Example Notebook                                                                                 | Description                                                                                       | Link                                                                     |
| :----------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------ | :----------------------------------------------------------------------- |
| Polygon-to-polygon weight calculation                                                            | Calculate the intersection weights between source and target polygons. Uses `WeightGenP2P` class. | [View Notebook](Examples/PolyToPoly/PolyToPoly_weights.md)               |
| Area-Weighted Aggregation of Polygonal Datasets, including `intensive` and `extensive` variables | Aggregates an idealized set of polygons each with extensive and intensive variables.              | [View Notebook](Examples/PolyToPoly/Extensive_vs_intensive_variables.md) |
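
The difference between extensive and intensive variables can be shown with a few lines of arithmetic. This is a conceptual sketch in plain Python, not the `WeightGenP2P` API; the attribute names and numbers are invented, and it assumes each attribute is spread uniformly over its source polygon.

```python
def aggregate_to_target(sources, overlaps):
    """Aggregate source-polygon attributes to one target polygon.

    sources:  {source_id: {"area": ..., "population": ..., "temperature": ...}}
    overlaps: {source_id: area of the source polygon that falls inside the target}
    """
    # Extensive variable (a total such as population): each source contributes
    # the share of its total that matches the share of its area in the target.
    population = sum(
        sources[sid]["population"] * area / sources[sid]["area"]
        for sid, area in overlaps.items()
    )
    # Intensive variable (a rate or average such as temperature): take the
    # mean weighted by overlap area, so the result stays within the input range.
    covered = sum(overlaps.values())
    temperature = sum(
        sources[sid]["temperature"] * area for sid, area in overlaps.items()
    ) / covered
    return {"population": population, "temperature": temperature}


sources = {
    "A": {"area": 100.0, "population": 1000.0, "temperature": 10.0},
    "B": {"area": 50.0, "population": 200.0, "temperature": 20.0},
}
overlaps = {"A": 25.0, "B": 50.0}  # A is one quarter inside the target, B entirely
result = aggregate_to_target(sources, overlaps)
print(result)  # population 450.0, temperature about 16.67
```

Source A contributes a quarter of its population (250) and B all of it (200), while the temperature is pulled toward B because B accounts for two thirds of the covered area.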

### Rasters

`gdptools` supports raster data processing with multiple computational engines. For standard zonal statistics, choose among the `serial`, `parallel`, `dask`, and `exactextract` engines. The `exactextract` engine uses the [exactextract](https://github.com/isciences/exactextract) library for high-performance computation with fractional pixel coverage.

| Example Notebook        | Description                                                                                                            | Link                                              |
| :---------------------- | :--------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------ |
| Raster Zonal Statistics | Demonstrates zonal statistics using serial, parallel, and exactextract engines for continuous and categorical rasters. | [View Notebook](Examples/Rasters/zonal_stats.md)  |
| Configuring Logging     | Shows how to enable, customize, and filter gdptools log output.                                                        | [View Notebook](Examples/Rasters/logging_demo.md) |
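
To make fractional pixel coverage concrete for a categorical raster, the sketch below computes the share of a zone covered by each class. It is plain Python with invented inputs, not the `exactextract` or `gdptools` API; the class codes and coverage fractions are illustrative.

```python
from collections import defaultdict


def class_fractions(pixels):
    """Fraction of a zone covered by each land-cover class.

    pixels: iterable of (class_code, coverage) pairs, where coverage is the
    fraction of the pixel (0 to 1) that lies inside the zone.
    """
    covered = defaultdict(float)
    for class_code, coverage in pixels:
        covered[class_code] += coverage
    total = sum(covered.values())
    return {code: amount / total for code, amount in covered.items()}


# Hypothetical zone: two pixels fully inside, two cut in half by the boundary.
pixels = [(41, 1.0), (41, 1.0), (82, 0.5), (11, 0.5)]
fractions = class_fractions(pixels)
print(fractions)  # class 41 covers two thirds, classes 82 and 11 one sixth each
```

Counting every touched pixel as fully inside would instead report one half, one quarter, and one quarter, which is the kind of boundary bias that fractional coverage avoids.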

```{admonition} Which class should I use?
:class: tip

- Grid → Polygon: `ClimRCatData` or `UserCatData` + `WeightGen` + `AggGen` (set `weight_gen_crs=6931`; choose engine via `serial|parallel|dask`; start with a modest `jobs` value—each worker loads the source dataset, so `jobs=-1` can quickly exhaust memory).
- Polygon → Polygon: `WeightGenP2P` (handle intensive vs extensive stats accordingly).
- Rasters: `UserTiffData` + `ZonalGen`/`WeightedZonalGen` (zonal statistics; no weight generation step).
```
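
One way to arrive at a modest `jobs` value is to bound the worker count by a memory budget as well as by the CPU count. The helper below is a hypothetical sketch, not part of `gdptools`; it assumes every worker holds its own copy of the source dataset, and the sizes are made-up numbers.

```python
import os


def modest_jobs(dataset_bytes, memory_budget_bytes, max_jobs=None):
    """Pick a worker count that fits both the CPU count and a memory budget.

    Assumes every worker holds its own copy of the source dataset, so the
    budget allows at most memory_budget_bytes // dataset_bytes workers.
    """
    cpu_limit = max_jobs or os.cpu_count() or 1
    memory_limit = memory_budget_bytes // dataset_bytes
    return max(1, min(cpu_limit, memory_limit))


GIB = 1024**3
# Hypothetical numbers: a 3 GiB source dataset and 16 GiB set aside for workers.
jobs = modest_jobs(dataset_bytes=3 * GIB, memory_budget_bytes=16 * GIB, max_jobs=8)
print(jobs)  # 5
```

The result can then be passed as the `jobs` value instead of `-1`.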
