OpenWQ

Model Calibration
A comprehensive guide to calibrating OpenWQ models — sensitivity analysis, optimization algorithms, and best practices

Contents

• Calibration Framework Overview

• Parameter Categories (106+)

Section 01: Sensitivity Analysis

• Morris Screening

• Sobol Sensitivity Analysis

Section 02: DDS Optimization

• DDS Algorithm

• Objective Functions

• Temporal Resolution

Section 03: Calibration Workflow

• Step-by-Step Workflow

• Observation Data Sources

• GRQA Database Integration

• CSV Format

Section 04: Implementation

• Priority-Based Calibration

• Running Calibration

• Calibration Output

• Best Practices

Section 05: Post-Calibration Reporting

• Interactive HTML Reports

• Basin Maps & Spatial Analysis

• Multi-Variant Basin Reports

Calibration Framework Overview

OpenWQ includes a comprehensive calibration framework with:

  • 106+ calibratable parameters
  • DDS optimization algorithm
  • Morris screening for sensitivity
  • Sobol analysis for detailed SA
  • Docker/Apptainer deployment
  • Checkpoint/restart capability
┌─────────────────────────────────────┐
│     Calibration Workflow            │
├─────────────────────────────────────┤
│                                     │
│  ┌─────────────────────────────┐    │
│  │ 1. Morris Screening (~200)  │    │
│  │    → Identify sensitive     │    │
│  └──────────────┬──────────────┘    │
│                 ▼                   │
│  ┌─────────────────────────────┐    │
│  │ 2. Sobol Analysis (~1000)   │    │
│  │    → Rank top parameters    │    │
│  └──────────────┬──────────────┘    │
│                 ▼                   │
│  ┌─────────────────────────────┐    │
│  │ 3. DDS Optimization (~300)  │    │
│  │    → Optimize top 10-15     │    │
│  └──────────────┬──────────────┘    │
│                 ▼                   │
│  ┌─────────────────────────────┐    │
│  │ 4. Validation               │    │
│  │    → Independent period     │    │
│  └─────────────────────────────┘    │
│                                     │
└─────────────────────────────────────┘

Parameter Categories (106+ total)

Category | Count | Examples | Typical Range
BGC (NATIVE_BGC_FLEX) | 26 | k_nitrification, k_denitrification, theta | 0.001 - 1.0 /day
PHREEQC | 22 | Initial concentrations, pCO2, SI | Varies
Sorption | 16 | Kfr, qmax, KL, bulk_density | 0.01 - 100 L/kg
Sediment Transport | 14 | erosion_index, erodibility, cohesion | 0.1 - 10
Transport | 5 | dispersion_x/y/z, characteristic_length | 0.1 - 100 m²/s
Lateral Exchange | 4 | k_exchange (river-soil, soil-GW) | 0.0001 - 0.1 /s
Source/Sink | 21 | Load scaling, export coefficients | 0.1 - 5.0×
Key Insight: Source/sink parameters are typically THE MOST SENSITIVE — start calibration there!
01

Sensitivity Analysis

Morris Screening (Elementary Effects)

Efficient method to identify influential parameters with minimal model runs.

Cost: O(r × (k+1))

r = 10-20 trajectories, k = parameters

Example: 30 params × 15 trajectories → 15 × (30+1) = 465 runs
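For intuition, the elementary-effects computation can be sketched in plain Python (a toy illustration, not the OpenWQ implementation; the three-parameter linear model is invented for the demo):

```python
import random

def morris_screen(model, k, r=15, delta=0.5, seed=42):
    """Estimate Morris mu* and sigma for a model with k parameters in [0, 1]."""
    rng = random.Random(seed)
    effects = [[] for _ in range(k)]
    for _ in range(r):                      # one trajectory = k+1 model runs
        x = [rng.random() * (1 - delta) for _ in range(k)]
        y = model(x)
        for i in rng.sample(range(k), k):   # perturb each parameter once, random order
            x[i] += delta
            y_new = model(x)
            effects[i].append((y_new - y) / delta)
            y = y_new
    mu_star = [sum(abs(e) for e in es) / r for es in effects]
    sigma = [(sum((e - sum(es) / r) ** 2 for e in es) / r) ** 0.5 for es in effects]
    return mu_star, sigma

# Toy linear model: parameter 0 dominates, parameter 2 is inert
toy = lambda x: 10 * x[0] + 1 * x[1] + 0 * x[2]
mu_star, sigma = morris_screen(toy, k=3)    # total cost: r * (k+1) = 60 runs
```

For this linear toy model every elementary effect is constant, so mu* ranks the parameters (10, 1, 0) and sigma is zero; non-linearity would show up as a large sigma.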

Metrics

μ* (mu-star): Mean absolute effect → Overall influence
σ (sigma): Standard deviation → Non-linearity

Interpretation

μ* | σ | Meaning
High | Low | Linear influence → Calibrate
High | High | Non-linear → Calibrate carefully
Low | Low | Non-influential → Fix at default
Low | High | Interactive → Investigate
Typical result: Reduces 100+ params to 30-40 for detailed analysis

Sobol Sensitivity Analysis

Variance-based method for detailed sensitivity ranking.

Cost: O(N × (2k+2))

N = 1000-5000 samples, k = parameters

Example: 30 params × 2000 samples → 2000 × (2×30+2) = 124,000 runs

Indices

S1 (first-order): Direct parameter effect
ST (total-order): Including interactions

Interpretation

  • S1 ≈ ST: Parameter acts independently
  • ST >> S1: Strong interactions
  • Sum(S1) < 1: Significant interactions exist
Use after Morris: Apply Sobol only to the ~30 parameters identified as potentially influential
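To make S1 concrete, here is a brute-force first-order index for a toy additive model (illustrative only; production studies use a dedicated sampling scheme such as Saltelli's, e.g. via the SALib package):

```python
def first_order_index(model, param_index, n=200):
    """Brute-force S1 = Var(E[Y|Xi]) / Var(Y) on an n-by-n grid over [0, 1]^2."""
    grid = [i / (n - 1) for i in range(n)]
    all_y, cond_means = [], []
    for a in grid:                      # a = value of the parameter of interest
        ys = []
        for b in grid:                  # b = value of the other parameter
            x = [a, b] if param_index == 0 else [b, a]
            ys.append(model(x))
        cond_means.append(sum(ys) / n)  # E[Y | Xi = a]
        all_y.extend(ys)

    def var(v):
        m = sum(v) / len(v)
        return sum((e - m) ** 2 for e in v) / len(v)

    return var(cond_means) / var(all_y)

# Additive toy model Y = X1 + 0.5*X2: analytically S1(X1) = 1/1.25 = 0.8
toy = lambda x: x[0] + 0.5 * x[1]
s1_x1 = first_order_index(toy, 0)
s1_x2 = first_order_index(toy, 1)
```

Because the toy model is purely additive, s1_x1 + s1_x2 ≈ 1; a sum noticeably below 1 is exactly the interaction signal described above.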
02

DDS Optimization

Dynamically Dimensioned Search (DDS)

Derivative-free optimization designed for expensive models.

Key Features

  • Reduces search dimensionality over time
  • Early: Explore many parameters
  • Late: Focus on few promising ones
  • No gradient computation needed
Typical budget: 200-500 evaluations for 10-15 parameters
   DDS Behavior Over Iterations

   Early (exploration)
   ┌──────────────────────────────┐
   │ ██████████████████████████   │ Many params
   │ ██████████████████           │ perturbed
   │ ████████████                 │
   └──────────────────────────────┘

   Late (exploitation)
   ┌──────────────────────────────┐
   │ ███                          │ Few params
   │ ██                           │ refined
   │ █                            │
   └──────────────────────────────┘

   Probability of perturbation:
   P(i) = 1 - ln(i)/ln(max_iter)
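A bare-bones DDS loop built around that perturbation probability might look as follows (a simplified sketch of the published algorithm, not OpenWQ's implementation; the quadratic test objective is invented):

```python
import math
import random

def dds(objective, bounds, max_iter=300, r=0.2, seed=1):
    """Minimize objective over box bounds with Dynamically Dimensioned Search."""
    rng = random.Random(seed)
    best = [rng.uniform(lo, hi) for lo, hi in bounds]
    best_f = objective(best)
    for i in range(1, max_iter + 1):
        p = 1.0 - math.log(i) / math.log(max_iter)   # P(perturb each dimension)
        dims = [d for d in range(len(bounds)) if rng.random() < p]
        if not dims:                                 # always perturb at least one
            dims = [rng.randrange(len(bounds))]
        cand = list(best)
        for d in dims:
            lo, hi = bounds[d]
            x = cand[d] + rng.gauss(0, r * (hi - lo))
            if x < lo:                               # reflect at the bounds
                x = lo + (lo - x)
            if x > hi:
                x = hi - (x - hi)
            cand[d] = min(max(x, lo), hi)
        f = objective(cand)
        if f < best_f:                               # greedy: keep improvements only
            best, best_f = cand, f
    return best, best_f

# Toy objective with known optimum at (0.3, 0.7, 0.1)
target = [0.3, 0.7, 0.1]
obj = lambda x: sum((a - b) ** 2 for a, b in zip(x, target))
best, best_f = dds(obj, bounds=[(0.0, 1.0)] * 3)
```

Note how the number of perturbed dimensions shrinks with log(i): early iterations move many parameters (exploration), late ones refine a few (exploitation), matching the diagram above.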
                            

Objective Functions

Metric | Formula | Name | Range
RMSE | √(Σ(obs-sim)²/n) | Root Mean Square Error | 0 → ∞ (lower better)
NSE | 1 - Σ(o-s)²/Σ(o-ō)² | Nash-Sutcliffe Efficiency | -∞ → 1 (1 perfect)
KGE | 1 - √((r-1)²+(α-1)²+(β-1)²) | Kling-Gupta Efficiency | -∞ → 1 (1 perfect) — Recommended!
PBIAS | 100×Σ(s-o)/Σ(o) | Percent Bias | -∞ → +∞ (0 perfect)
KGE advantages: Balances correlation (r), variability ratio (α), and bias ratio (β) — better for multi-species calibration
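The four metrics translate directly into code; a plain-Python sketch consistent with the formulas above (not OpenWQ's implementation, and assuming non-constant obs and sim series):

```python
def rmse(obs, sim):
    """Root Mean Square Error: 0 is perfect, lower is better."""
    return (sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)) ** 0.5

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect."""
    mean_o = sum(obs) / len(obs)
    return 1 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / sum(
        (o - mean_o) ** 2 for o in obs)

def pbias(obs, sim):
    """Percent Bias: 0 is perfect; sign shows over/under-prediction."""
    return 100 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)

def kge(obs, sim):
    """Kling-Gupta Efficiency: 1 is perfect."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    so = (sum((o - mo) ** 2 for o in obs) / n) ** 0.5
    ss = (sum((s - ms) ** 2 for s in sim) / n) ** 0.5
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * so * ss)
    alpha, beta = ss / so, ms / mo  # variability ratio, bias ratio
    return 1 - ((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2) ** 0.5

obs = [2.5, 3.1, 0.9, 1.8]
perfect = list(obs)  # a perfect simulation reproduces the observations
```

With `perfect`, rmse and pbias return 0, while nse and kge return 1, matching the stated ranges.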

Temporal Resolution

Aggregate observations and model outputs to a common temporal scale before computing objective functions.

Available Resolutions

Option | Use Case
native | Original timestamps (no aggregation)
daily | Daily patterns matter
weekly | Smooth day-to-day noise
monthly | Seasonal patterns, sparse obs
yearly | Long-term trends, budgets

Aggregation Methods

mean (default) | sum | median | min | max

Use mean for concentrations, sum for loads/fluxes

# Configuration in calibration file
temporal_resolution = "monthly"
aggregation_method = "mean"

# How it works:
# 1. Group obs by reach, species, month
# 2. Extract model outputs for same periods
# 3. Aggregate both to monthly means
# 4. Compute KGE/NSE/RMSE on aggregated data
                        
Tip: Use monthly for sparse grab samples — ensures fair comparison with continuous model output
Output: Performance plots generated at specified resolution (time series, scatter, residuals)
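The grouping in steps 1 and 3 needs only the standard library; a sketch of the monthly-mean case (illustrating the logic, not the calibration code itself):

```python
from collections import defaultdict
from datetime import datetime

def monthly_means(records):
    """records: (datetime_str, reach_id, species, value) tuples.
    Returns {(reach_id, species, 'YYYY-MM'): mean value}."""
    groups = defaultdict(list)
    for ts, reach, species, value in records:
        dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        # group key: reach, species, calendar month
        groups[(reach, species, dt.strftime("%Y-%m"))].append(value)
    return {key: sum(v) / len(v) for key, v in groups.items()}

obs = [
    ("2018-01-15 10:00:00", 1200014181, "NO3-N", 2.50),
    ("2018-01-28 09:30:00", 1200014181, "NO3-N", 3.50),
    ("2018-02-01 10:00:00", 1200014181, "NO3-N", 3.10),
]
means = monthly_means(obs)  # two January samples collapse to one monthly mean
```

Apply the same function to the model output series and the objective function then compares aggregate to aggregate, never a grab sample to an instantaneous value.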
03

Calibration Workflow

Step-by-Step Workflow

1

Prepare Observation Data

Choose source: Manual CSV, GRQA database extraction, or Copernicus synthetic generation

2

Morris Screening (~200 runs)

Test all 100+ parameters → identify ~30-40 influential ones

3

Sobol Analysis (~1000 runs)

Detailed SA on 30 params → rank and select top 10-15

4

DDS Optimization (~300-500 runs)

Optimize top 10-15 parameters, fix others at defaults

5

Validation

Test calibrated parameters on independent time period

Observation Data Sources

Two options for preparing observation data:

📄 Option 1: Manual CSV

observation_data_source = "csv"

  • Prepare data manually
  • Full control over format
  • Use any data source (USGS, EPA, local monitoring, etc.)

Best for: Custom datasets, local monitoring networks

🌍 Option 2: GRQA Database

observation_data_source = "grqa"

  • 43 water quality parameters
  • ~100 million observations worldwide
  • Auto station-to-reach matching
  • Local data or Zenodo download

Best for: Large-scale studies, data-rich regions

Local Data Support: GRQA supports pointing to local data folders if already downloaded from Zenodo

GRQA Database Integration

Global River Water Quality Archive

  • 43 water quality parameters
  • ~100 million observations worldwide
  • Automatic Zenodo download
  • Spatial station-reach matching
grqa_config = {
    # Local or download from Zenodo
    "local_data_path": "/data/GRQA",
    "river_network_shapefile": "rivers.shp",
    "max_station_distance_m": 500,
    "species_mapping": {
        "NO3": "NO3-N",
        "NH4": "NH4-N"
    }
}
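The max_station_distance_m option implies a nearest-neighbour search with a cutoff; a minimal planar sketch (hypothetical helper, assuming station and reach coordinates already sit in a projected metric CRS, whereas the real workflow reads geometry from the shapefile):

```python
import math

def match_stations(stations, reaches, max_distance_m=500):
    """stations: {id: (x, y)}, reaches: {id: (x, y)} in metres.
    Returns {station_id: reach_id} for stations within the cutoff."""
    matches = {}
    for sid, (sx, sy) in stations.items():
        best_reach, best_d = None, float("inf")
        for rid, (rx, ry) in reaches.items():
            d = math.hypot(rx - sx, ry - sy)
            if d < best_d:
                best_reach, best_d = rid, d
        if best_d <= max_distance_m:  # discard stations too far from any reach
            matches[sid] = best_reach
    return matches

stations = {"USGS_station_A": (1000.0, 2000.0), "far_station": (99000.0, 99000.0)}
reaches = {1200014181: (1200.0, 2100.0), 1200014182: (5000.0, 8000.0)}
matches = match_stations(stations, reaches)  # far_station is dropped
```

Station A sits about 224 m from reach 1200014181, inside the 500 m cutoff, so it is matched; the distant station has no reach within range and is excluded.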
                        

Species Mapping (GRQA → Model)

GRQA | Model Species
NO3 | NO3-N
NH4 | NH4-N
TN | TN
PO4 | PO4-P
TP | TP
TSS | TSS
DOC | DOC
Run extraction:
python my_calibration.py --prepare-obs-only

Observation Data CSV Format

# observations.csv (all sources produce this format)
datetime,reach_id,species,value,units,source,uncertainty,quality_flag
2018-01-15 10:00:00,1200014181,NO3-N,2.50,mg/l,USGS_station_A,0.25,GOOD
2018-01-15 10:00:00,1200014181,NH4-N,0.15,mg/l,USGS_station_A,0.02,GOOD
2018-02-01 10:00:00,1200014181,NO3-N,3.10,mg/l,USGS_station_A,0.31,GOOD
2018-02-01 10:00:00,1200014181,TP,0.08,mg/l,USGS_station_A,0.01,GOOD
...
                

Required Columns

datetime: YYYY-MM-DD HH:MM:SS
reach_id: Must match model output reach IDs
species: Case-sensitive species name
value: Measured concentration
units: mg/l, ug/l, etc.

Optional Columns

source: Data provider ID
uncertainty: Measurement error
quality_flag: GOOD, SUSPECT, BAD
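Loading and quality-filtering this file takes a few lines of standard-library Python (a sketch, not the framework's loader):

```python
import csv
import io

REQUIRED = ["datetime", "reach_id", "species", "value", "units"]

def load_observations(fileobj):
    """Read observations.csv, keep GOOD rows, check required columns."""
    reader = csv.DictReader(fileobj)
    missing = [c for c in REQUIRED if c not in reader.fieldnames]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        # quality_flag is optional; treat an absent flag as GOOD
        if row.get("quality_flag", "GOOD") != "GOOD":
            continue
        row["value"] = float(row["value"])
        rows.append(row)
    return rows

sample = io.StringIO(
    "datetime,reach_id,species,value,units,quality_flag\n"
    "2018-01-15 10:00:00,1200014181,NO3-N,2.50,mg/l,GOOD\n"
    "2018-01-15 10:00:00,1200014181,NH4-N,0.15,mg/l,SUSPECT\n"
)
rows = load_observations(sample)  # the SUSPECT row is filtered out
```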

04

Implementation

Priority-Based Calibration

When resources are limited, focus on the most influential parameters:

TIER 1 Must Calibrate

  • Source/sink scaling factors
  • k_nitrification
  • k_denitrification
  • Kfr_PO4 or qmax_PO4
  • dispersion_x

5-8 parameters

TIER 2 Important

  • Temperature coefficients (θ)
  • k_mineralization
  • k_P_adsorption
  • erosion_index
  • Secondary sorption params

Next 5-8 parameters

TIER 3 Refinement

  • Half-saturation constants
  • Lateral exchange rates
  • Volatilization rates
  • Langmuir exponents

Remaining parameters
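In a configuration file, the tiering can be expressed as a dict of bounds for the active tier while everything else stays at defaults (parameter names follow the tables above, but the specific bound values and the helper are illustrative, not OpenWQ defaults):

```python
# Tier 1: calibrate first; (low, high) bounds per parameter (illustrative values)
TIER1_BOUNDS = {
    "source_sink_scaling": (0.1, 5.0),    # load multiplier
    "k_nitrification":     (0.001, 1.0),  # /day
    "k_denitrification":   (0.001, 1.0),  # /day
    "Kfr_PO4":             (0.01, 100.0), # L/kg
    "dispersion_x":        (0.1, 100.0),  # m^2/s
}

def split_parameters(all_defaults, active_bounds):
    """Partition parameters: calibrate those in the active tier, fix the rest."""
    calibrate = {k: b for k, b in active_bounds.items() if k in all_defaults}
    fixed = {k: v for k, v in all_defaults.items() if k not in active_bounds}
    return calibrate, fixed

defaults = {"k_nitrification": 0.1, "k_denitrification": 0.05, "theta": 1.07}
calibrate, fixed = split_parameters(defaults, TIER1_BOUNDS)
```

Moving to Tier 2 is then just swapping in a second bounds dict while the Tier 1 optima become the new fixed defaults.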

Running Calibration

Copy the template to your working directory, configure parameters, then run:

# 1) Copy and edit the template
cp calibration_config_template.py my_calibration.py

# 2) Run in different modes:
python my_calibration.py                    # Full calibration
python my_calibration.py --sensitivity-only # SA only
python my_calibration.py --prepare-obs-only # Obs data only
python my_calibration.py --dry-run          # Validate config
python my_calibration.py --resume           # Resume from checkpoint
                        
Template pattern: Copy → Edit → Run — no need to modify library code

Command-Line Flags

--sensitivity-only: Run Morris/Sobol only
--prepare-obs-only: Prepare observations (GRQA/CSV)
--dry-run: Validate without running
--resume: Continue from checkpoint

In-File Options

run_sensitivity_first = True
→ Auto SA → Calibration pipeline
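The checkpoint behind --resume can be as simple as a JSON snapshot of the optimizer state (a sketch of the idea; the framework's actual checkpoint format is not documented here):

```python
import json
import os
import tempfile

def save_checkpoint(path, iteration, best_params, best_objective):
    """Persist enough state to continue the search after an interruption."""
    state = {"iteration": iteration,
             "best_params": best_params,
             "best_objective": best_objective}
    with open(path, "w") as f:
        json.dump(state, f)

def load_checkpoint(path):
    """Return the saved state, or None so the run starts from iteration 0."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)

ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
save_checkpoint(ckpt, 120, {"k_nitrification": 0.08}, 0.74)
state = load_checkpoint(ckpt)  # a resumed run continues at iteration 121
```

Because DDS only needs the current best point and the iteration counter, this tiny snapshot is sufficient to continue the search exactly where it stopped.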

Calibration Output

Output Files

best_parameters.json: Optimal values
calibration_history.json: All evaluations
parameter_definitions.json: Parameter metadata & bounds
matched_data.csv: Obs-model matched pairs
calibration_report.html: Interactive HTML report
basin_report.html: Per-basin multi-variant report
sensitivity_results.json: SA results (if run)

Runtime Estimates

Model Runtime | 300 Evals
5 min | ~25 hours
15 min | ~75 hours
30 min | ~150 hours
HPC Tip: Use job arrays to run evaluations in parallel on cluster
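The table is just evals × runtime; a tiny helper that also covers the job-array case (hypothetical, and only an optimistic bound for DDS, whose iterations are partly sequential):

```python
def wall_clock_hours(n_evals, minutes_per_run, n_workers=1):
    """Serial hours divided across parallel workers (job-array style)."""
    return n_evals * minutes_per_run / 60 / n_workers

# Reproduce the table: 300 evaluations, serial
serial = [wall_clock_hours(300, m) for m in (5, 15, 30)]
# Same budget spread over a 20-slot job array
parallel = wall_clock_hours(300, 15, n_workers=20)
```

Morris and Sobol runs are embarrassingly parallel, so the n_workers division is realistic for sensitivity analysis; DDS benefits less because each iteration depends on the previous best.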

Best Practices

✅ Do

  • Start with Morris screening to reduce parameters
  • Use log transform for rate constants
  • Split data: calibration + validation periods
  • Use KGE for multi-species calibration
  • Document parameter choices
  • Check physical plausibility of results

❌ Don't

  • Calibrate all 100+ parameters at once
  • Ignore parameter correlations
  • Use entire dataset for calibration
  • Accept physically unrealistic values
  • Skip sensitivity analysis
  • Overfit to noisy observations
Golden Rule: Fewer well-chosen parameters beat many poorly constrained ones
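The "log transform for rate constants" advice means searching in log10 space, so that 0.001 and 0.1 are equally reachable; a small sketch of the transform pair (illustrative, not the framework's API):

```python
import math
import random

def to_log10_bounds(lo, hi):
    """Map positive (lo, hi) bounds into log10 space for the optimizer."""
    return math.log10(lo), math.log10(hi)

def from_log10(z):
    """Map a log10-space sample back to the physical rate constant."""
    return 10.0 ** z

# Rate constant spanning three decades: sample uniformly in log space
lo, hi = 0.001, 1.0
zlo, zhi = to_log10_bounds(lo, hi)
rng = random.Random(0)
samples = [from_log10(rng.uniform(zlo, zhi)) for _ in range(1000)]
# about 2/3 of log-uniform samples fall below 0.1; a uniform search puts ~10% there
below_0_1 = sum(s < 0.1 for s in samples)
```

Without the transform, a uniform search over [0.001, 1.0] almost never visits the lowest decade, which is often where rate constants actually live.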
05

Post-Calibration Reporting

Interactive HTML Reports

Calibration automatically generates self-contained HTML reports with interactive Plotly.js charts.

6 Diagnostic Charts

  • Convergence curve (objective vs iteration)
  • Parameter evolution trajectories
  • Time series: observed vs simulated
  • Scatter plot with 1:1 line
  • Residual analysis
  • Parameter sensitivity ranking
Self-contained: Single HTML file with embedded data — share via email, no server needed

Report Features

Plotly.js: Zoom, pan, hover tooltips
Dark/Light mode: Theme toggle included
Parameter table: Best values + bounds + position bar
Species metrics: Per-species KGE/NSE/RMSE
Sidebar nav: Jump to any section

Auto-Generated

Reports are created automatically when calibration completes β€” no extra steps needed.

Interactive Basin Maps

Calibration reports include interactive Leaflet.js maps showing the basin spatial context.

Map Layers

  • HRU polygons — colored by area quintiles
  • River network — styled by Strahler order
  • Observation stations — red markers with popup info

Basemaps

3 selectable basemaps: CARTO Light, OpenTopoMap, Esri Satellite

Data source: GeoPackage files (*_basinHru.gpkg, *_riverNetwork.gpkg)

Basin Info Grid

Total HRUs: Count from GeoPackage
River reaches: Network segments
Total area: km² from HRU polygons
Network length: km of river network
Max Strahler order: Stream hierarchy
Obs stations: Matched monitoring sites

Interactive Controls

Layer toggle, zoom, scale bar, legend, click-for-info popups

Per-Basin Multi-Variant Reports

When running multiple variants (A/B/C/D), a consolidated basin report is auto-generated comparing all variants.

Basin Report Contents

  • Overview KPIs — best KGE per variant
  • Basin map — shared spatial context
  • Variant comparison table — side-by-side metrics
  • Detail cards — per-variant parameter tables
  • Links — to detailed per-variant reports
Auto-discovery: Sibling workspaces (workspace_{basin}_{variant}) are detected automatically

Post-Calibration Pipeline

Calibration completes
        │
        ▼
parameter_definitions.json  ← saved
matched_data.csv            ← saved
        │
        ▼
calibration_report.html     ← per-variant
        │
        ▼
Detect sibling workspaces
        │
        ▼
basin_report.html           ← multi-variant
                                

Summary

πŸ” Screen

Morris screening to identify influential parameters (106 β†’ 30)

πŸ“Š Rank

Sobol analysis for detailed sensitivity ranking (30 β†’ 10-15)

🎯 Optimize

DDS calibration of top parameters (200-500 evaluations)

📋 Report

Auto-generated interactive HTML reports with maps & charts

Key Message: Hierarchical approach (Screen → Rank → Optimize → Report) with automated post-calibration diagnostics for every basin and variant

Thank You

Questions?

Calibration scripts: supporting_scripts/Calibration/

Observation data: CSV | GRQA Database | Reports: HTML with interactive maps & charts