6. Configuration via a single Config dataclass (env + defaults)
Status
Accepted
Context
Slice 1 (ingestion) is the first code that needs runtime configuration — the site
code, where raw volumes land, and where the SQLite index lives. CLAUDE.md requires
that these come from config and are never hardcoded, and points at "one config
file / env." Later slices (notably the collect service) will need the same
values plus more (poll interval, failover list), so the mechanism has to scale
without a rewrite of every call site.
Decision
A single frozen Config dataclass (src/backscatter/config.py) is the one
source of truth. Every module takes a Config; no module reads the environment
on its own.
load_config() resolves each field with precedence CLI argument > environment
variable > built-in default:
- site: CLI positional (e.g. backscatter pull KFTG) > BACKSCATTER_SITE > KFTG.
- data_dir: BACKSCATTER_DATA_DIR > ./data.
- db_path: BACKSCATTER_DB_PATH > <data_dir>/backscatter.db.
No config-file parsing yet. The loader is the only place that knows where values come from, so a TOML (or similar) file loader slots in there later as one more precedence layer (file sitting between env and defaults) without touching callers.
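The precedence rules above can be sketched as follows. This is a minimal illustration, not the contents of src/backscatter/config.py: the field names, environment variables, and defaults come from this ADR, while the `site_arg` and `env` parameters are assumptions added so the loader can be exercised without touching the real environment.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    site: str
    data_dir: Path
    db_path: Path


def load_config(
    site_arg: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Config:
    # The environment is read here and nowhere else; `env` is a hypothetical
    # injection point that defaults to the real process environment.
    env = os.environ if env is None else env

    # CLI argument > environment variable > built-in default.
    site = site_arg or env.get("BACKSCATTER_SITE") or "KFTG"
    data_dir = Path(env.get("BACKSCATTER_DATA_DIR") or "./data")

    # db_path defaults to a file inside whichever data_dir was resolved.
    db_override = env.get("BACKSCATTER_DB_PATH")
    db_path = Path(db_override) if db_override else data_dir / "backscatter.db"

    return Config(site=site, data_dir=data_dir, db_path=db_path)
```

Under this shape, a file loader would become one more lookup between the `env.get(...)` call and the literal default on each line, and callers would still only see a Config.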
Consequences
- Tests and ad-hoc runs configure everything via env vars or by constructing a Config directly: no global state, easy to isolate in a tmp_path.
- Adding a setting = one dataclass field + one line in load_config().
- The defaults make backscatter pull work out of the box for the operator's KFTG default while staying fully overridable for any CONUS site.
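As a rough sketch of the first consequence, a test can build a Config by hand under a throwaway directory. The Config below is a stand-in carrying the three fields named in the Decision section, and `index_path_for` and `make_test_config` are hypothetical helpers, not project code. Under pytest the directory would typically come from the `tmp_path` fixture; the standard-library `tempfile` is used here so the example runs on its own.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Config:
    # Stand-in with the three fields named in the Decision section.
    site: str
    data_dir: Path
    db_path: Path


def index_path_for(config: Config) -> Path:
    # Hypothetical consumer: it is handed a Config and never reads os.environ.
    config.data_dir.mkdir(parents=True, exist_ok=True)
    return config.db_path


def make_test_config(root: Path) -> Config:
    # Everything the code under test writes stays under one temporary root.
    data_dir = root / "data"
    return Config(site="KFTG", data_dir=data_dir, db_path=data_dir / "backscatter.db")


with tempfile.TemporaryDirectory() as tmp:
    cfg = make_test_config(Path(tmp))
    assert index_path_for(cfg).parent == cfg.data_dir
    assert cfg.data_dir.is_dir()
```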
Alternatives considered
- A config file (TOML) now. Rejected for this slice as premature: more machinery than ingestion needs, and the dataclass-as-source-of-truth shape means we can add it later with no churn. Revisit when collect needs richer config.
- Scattered os.environ reads at each call site. Rejected: no single source of truth, hard to test, and exactly what CLAUDE.md warns against.
Update — Slice 2 (location-based site resolution)
The primary location input is now a lat/lon, not a site code. Config gains
lat and lon; the active site is resolved at load as the nearest radar to
the lat/lon against the bundled NEXRAD table (ADR-0005), via
sites.select.nearest_site.
Resolution and precedence:
- lat / lon — arg > BACKSCATTER_LAT / BACKSCATTER_LON > default
(Elizabeth, CO: 39.3603, -104.5969, which resolves to KFTG).
- site — an explicit override (site arg, e.g. backscatter pull KTLX, or
BACKSCATTER_SITE) still wins, upper-cased. Absent that, site is the resolved
nearest radar. There is no longer a hardcoded default site.
The single-source-of-truth and "env reads live only in config.py" invariants are
unchanged. pull is untouched — it still reads config.site, which is now
resolved rather than hardcoded.
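The resolution order can be sketched as below. This is an illustration under stated assumptions, not the code in sites.select: the `nearest_site` signature is guessed, great-circle (haversine) distance is assumed as the metric, and the small `SITES` table holds approximate coordinates for a handful of radars rather than the bundled NEXRAD table.

```python
import math
from typing import Dict, Optional, Tuple

# Approximate radar coordinates, for illustration only (assumption);
# the real lookup runs against the bundled NEXRAD table.
SITES: Dict[str, Tuple[float, float]] = {
    "KFTG": (39.79, -104.55),
    "KPUX": (38.46, -104.18),
    "KCYS": (41.15, -104.81),
    "KTLX": (35.33, -97.28),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Great-circle distance on a sphere with mean Earth radius 6371 km.
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_site(lat: float, lon: float, sites: Dict[str, Tuple[float, float]] = SITES) -> str:
    # Pick the site code whose radar is closest to the given point.
    return min(sites, key=lambda code: haversine_km(lat, lon, *sites[code]))


def resolve_site(lat: float, lon: float, site_override: Optional[str] = None) -> str:
    # An explicit override (arg or BACKSCATTER_SITE) wins and is upper-cased;
    # otherwise the site is the nearest radar to the lat/lon.
    if site_override:
        return site_override.upper()
    return nearest_site(lat, lon)
```

With these illustrative coordinates, `resolve_site(39.3603, -104.5969)` picks KFTG, matching the default described above, and `resolve_site(39.3603, -104.5969, "ktlx")` yields the upper-cased override.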
Update — Slice 8 (multiple locations)
Config generalizes from a single lat/lon to a list of named locations, exactly
one of which is flagged as the default ("Home"). Config.locations: tuple[Location, ...]; each
Location carries name, lat, lon, site, is_default, site_override. The former
flat fields (lat/lon/site/site_override) are now read-only properties delegating
to the default location, so every existing single-location consumer keeps working.
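One way the delegating properties might look is sketched here. The Location field names and the `locations` tuple come from this ADR; the field types, the `Optional[str]` type for site_override, and the `default_location` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Location:
    name: str
    lat: float
    lon: float
    site: str
    is_default: bool = False
    site_override: Optional[str] = None  # type is an assumption


@dataclass(frozen=True)
class Config:
    locations: Tuple[Location, ...]

    @property
    def default_location(self) -> Location:
        # Hypothetical helper; assumes load-time validation has already
        # guaranteed that exactly one location is flagged as the default.
        return next(loc for loc in self.locations if loc.is_default)

    # The former flat fields, now read-only views of the default location.
    @property
    def lat(self) -> float:
        return self.default_location.lat

    @property
    def lon(self) -> float:
        return self.default_location.lon

    @property
    def site(self) -> str:
        return self.default_location.site

    @property
    def site_override(self) -> Optional[str]:
        return self.default_location.site_override
```

A single-location consumer that reads `config.site` keeps working unchanged, because the property returns the default location's site regardless of how many other locations are configured.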
- Source: BACKSCATTER_LOCATIONS, a JSON list of {name, lat, lon, default?}. Absent it, the single BACKSCATTER_LAT/BACKSCATTER_LON form is treated as a one-entry list named "Home" (back-compat). TOML can replace the JSON env later without touching call sites.
- Validation (at load, raises ValueError): ≥1 location, exactly one default, unique names (case-insensitive).
- Site override applies to the default location only. BACKSCATTER_SITE (or the pull positional) pins Home's site; every other location resolves its own nearest radar. Documented because it's the one non-obvious interaction.
- Frames are per-radar, not per-location. The volumes index is unchanged ((site, scan_time)); a location maps to frames via its resolved site. Two co-located locations share one radar's frames: collect stores each volume once (dedupe on (site, scan_time)), never per-location. The API resolves a location param to its site.
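The source and validation rules can be sketched as follows. This is a simplified illustration, not the project's loader: `LocationSpec`, `parse_locations`, and `validate` are hypothetical names, site resolution and the site override are left out, and the fallback lat/lon are the defaults quoted in the Slice 2 update.

```python
import json
from dataclasses import dataclass
from typing import Mapping, Sequence, Tuple


@dataclass(frozen=True)
class LocationSpec:
    # Hypothetical subset of Location: only what the env source provides.
    name: str
    lat: float
    lon: float
    is_default: bool = False


def validate(locations: Sequence[LocationSpec]) -> None:
    if not locations:
        raise ValueError("at least one location is required")
    if sum(loc.is_default for loc in locations) != 1:
        raise ValueError("exactly one location must be the default")
    names = [loc.name.casefold() for loc in locations]
    if len(set(names)) != len(names):
        raise ValueError("location names must be unique (case-insensitive)")


def parse_locations(env: Mapping[str, str]) -> Tuple[LocationSpec, ...]:
    raw = env.get("BACKSCATTER_LOCATIONS")
    if raw:
        # JSON list of {name, lat, lon, default?}; "default" is optional.
        locations = tuple(
            LocationSpec(
                name=str(item["name"]),
                lat=float(item["lat"]),
                lon=float(item["lon"]),
                is_default=bool(item.get("default", False)),
            )
            for item in json.loads(raw)
        )
    else:
        # Back-compat: the single lat/lon form becomes one entry named "Home".
        lat = float(env.get("BACKSCATTER_LAT", "39.3603"))
        lon = float(env.get("BACKSCATTER_LON", "-104.5969"))
        locations = (LocationSpec("Home", lat, lon, is_default=True),)
    validate(locations)
    return locations
```

Malformed JSON also surfaces as a ValueError in this sketch, since `json.JSONDecodeError` is a subclass of it.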