
Documentation

Architecture, methodology, API reference, and integration guide for Thagorus’s causal demand intelligence platform.

  • How it works
    Causal identification using weather as a natural experiment — no A/B test required
  • Architecture
    From raw weather signals to dollar-denominated recommendations in a single-node pipeline
  • API reference
    REST endpoints, webhook patterns, and a step-by-step integration guide
  • Safety model
    Confidence intervals, break conditions, shadow mode, and circuit breakers
01

The Problem

You’re spending more on ads every quarter and trusting the numbers less. The platforms say your campaigns are working. Your CFO says prove it. And nobody can separate weather-driven sales from ad-driven sales — because the two are confounded.

If you sell sunscreen, outdoor furniture, beverages, HVAC, apparel, or anything else where weather moves the needle, you already feel this. A sunny weekend explodes demand and your bids don’t move fast enough. A cold snap kills conversions and you’re still spending yesterday’s budget. The weather is in your dashboards — you just can’t act on it.

The Complexity Gap

Marketing teams write rules: “boost sunscreen ads when it’s hot.” But weather-demand relationships are multi-variable, non-linear, and category-specific. Temperature alone has interaction effects with humidity, UV index, wind speed, precipitation, and the trailing five-day trajectory. A hot day in Phoenix (where 105°F is normal) is categorically different from a hot day in Seattle (where 85°F is an event). Rules collapse the moment reality gets complicated.

The gap between what a rule system can represent and what weather actually does to demand is enormous. Thagorus closes that gap with a causal model that captures the full surface of weather-demand interaction, not the thin slice a rule system covers.

02

How It Works

The core question Thagorus answers is not “does weather affect demand?” (it obviously does). The question is: how much of the sales lift you see on a hot weekend was caused by your ads, and how much would have happened anyway because of the weather?

Weather as a Natural Experiment

The econometric solution to this is instrumental variables — finding a source of variation that shifts demand but is uncorrelated with advertising decisions. Philip G. Wright used weather as the first instrument in the history of econometrics (1928), estimating demand elasticities for agricultural commodities using regional rainfall.

Weather works as an instrument because it satisfies the core requirements by construction: advertisers cannot cause the weather, daily weather realizations are not anticipated by budget cycles set weeks in advance, and weather measurably shifts consumer demand across dozens of product categories. Dell, Jones & Olken (2014) survey 83 papers in top-5 economics journals that use weather for causal identification.
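The identification logic fits in a few lines. This toy simulation (invented variables and magnitudes, not the production engine) shows why an exogenous weather shock recovers a causal coefficient that naive OLS gets wrong:

```python
import numpy as np

# Toy data: promo intensity chases unobserved demand (confounded),
# while the weather shock is exogenous by construction.
rng = np.random.default_rng(0)
n = 5000
shock = rng.normal(size=n)                  # weather shock (the instrument)
demand = rng.normal(size=n)                 # unobserved confounder
promo = 0.8 * shock + 0.6 * demand + 0.5 * rng.normal(size=n)
sales = 2.0 * promo + 1.5 * demand + 0.5 * rng.normal(size=n)  # true effect = 2.0

# Naive OLS is biased upward: promo and the error term share `demand`.
C = np.cov(promo, sales)
beta_ols = C[0, 1] / C[0, 0]

# IV (just-identified Wald estimator): only the weather-driven
# variation in promo is used, so the confounding cancels.
beta_iv = np.cov(shock, sales)[0, 1] / np.cov(shock, promo)[0, 1]

print(f"OLS: {beta_ols:.2f}  IV: {beta_iv:.2f}  (truth: 2.00)")
```

With these magnitudes the OLS estimate lands well above 2.0 while the IV estimate sits close to the truth.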

Continuous Identification

Google’s Meridian recommends geo-holdout experiments for causal calibration. Each experiment costs real revenue, takes 4-8 weeks, and produces a single noisy estimate for a single channel. Most brands run one or two per year.

1000+
Natural experiments per year across every geography, at zero cost, for every channel simultaneously.

Weather-shock identification inverts this tradeoff. The LCDM uses weather shocks (deviations from seasonal norms) as continuous natural experiments: every cold snap, heat wave, and rainstorm that departs from expectations provides exogenous demand variation that separates weather-driven sales from ad-driven sales.
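Concretely, a shock is just a standardized deviation from the seasonal norm for that DMA and day of year. A minimal sketch (the 1.5σ threshold is an illustrative assumption, not the platform's actual cutoff):

```python
def weather_shock(observed: float, seasonal_mean: float, seasonal_std: float) -> float:
    """Standardized deviation from the DMA's seasonal norm (a z-score)."""
    return (observed - seasonal_mean) / seasonal_std

def is_shock(z: float, threshold: float = 1.5) -> bool:
    """Flag days that depart far enough from expectations to be informative."""
    return abs(z) >= threshold

# Seattle at 82°F against a 62°F June baseline with a ~7.1°F seasonal spread:
z = weather_shock(82, 62, 7.1)
print(f"anomaly: {z:+.1f} sigma, shock: {is_shock(z)}")   # ≈ +2.8σ
```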

This is not weather-triggered ad targeting. It is weather-driven causal identification — a fundamentally different capability.

03

Architecture

The LCDM is not a single model. It is a stack of complementary models, each contributing a different capability — running on a single node, no Spark, no warehouse, no distributed compute overhead.

Weather Data (9 signals × 47 DMAs) → Causal Engine (IV-based identification) → Evidence Bundles (“PHX heat +41% sunscreen”) → REST API (POST /v1/recommend) → Your Stack (DSP / CRM / Dashboard)
Thagorus v1.0 system architecture — single-node pipeline, no distributed compute

Pipeline Stages

Weather Data: 9 signals × 47 DMAs. We ingest from NOAA ISD, ERA5 reanalysis, and numerical weather prediction models. Raw observations are quality-controlled, gap-filled, and transformed into the 9-signal weather tensor: temperature, UV index, humidity, precipitation, wind speed, cloud cover, dewpoint, barometric pressure, and visibility — plus temporal derivatives (rate of change, acceleration) and geographic deviations from seasonal norms. When Phoenix hits 103°F, the model doesn’t just see “hot” — it sees 8°F above the June baseline, Day 3 of a warming trend, with UV 9.2 and humidity at 18%.

Causal Engine: IV-based identification. Panel ridge regression with empirical Bayes shrinkage. The engine uses weather shocks as instrumental variables to separate what the weather caused from what your ads caused. When sunscreen sales spike 41% during a Phoenix heat wave, the engine decomposes that into +33% from UV/temperature and +8% from your campaign — not the other way around.
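A stripped-down version of that decomposition, with a closed-form ridge fit on simulated data (coefficients and shock magnitudes invented to mirror the Phoenix example; the production engine adds instrumenting and shrinkage on top):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
# Columns: UV shock, temperature shock, ad-spend change (standardized units).
X = rng.normal(size=(n, 3))
beta_true = np.array([0.15, 0.10, 0.08])           # per-unit lift contributions
y = X @ beta_true + 0.02 * rng.normal(size=n)      # daily fractional sales lift

# Ridge regression, closed form: (X'X + lam*I)^-1 X'y
lam = 1.0
beta = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Decompose one hot day's observed lift into causes.
day = np.array([1.2, 1.5, 1.0])    # UV +1.2σ, temp +1.5σ, ads +1 unit
contrib = beta * day
weather_lift, ad_lift = contrib[:2].sum(), contrib[2]
# Roughly +33% from weather and +8% from ads, totalling ~+41%.
print(f"weather: +{weather_lift:.0%}, ads: +{ad_lift:.0%}, total: +{contrib.sum():.0%}")
```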

Evidence Bundles. Every recommendation ships with a complete audit trail. Example: “Phoenix heat wave, June 14-16 → sunscreen +41% with 90% CI [$38K, $52K]. Break conditions: cloud cover >60% pauses the call; temp >105°F triggers indoor-inversion monitoring.” Every number is inspectable. Nothing is a black box.

REST API: POST /v1/recommend. Send weather context in — get dollar-denominated actions out. FastAPI, versioned at /v1, API key authentication. Returns structured JSON with product-level lift estimates, budget shift amounts in USD, confidence intervals, and the specific weather conditions that would reverse the recommendation.

Your Stack: DSP, CRM, or Dashboard. Webhook integration to your existing systems. When UV hits 9+ in Phoenix, your DSP gets a budget-shift webhook within 15 minutes. Recommendations can be consumed via API polling, webhooks, or the upcoming dashboard UI.

What the Model Sees

When the model receives “82°F in Seattle,” it doesn’t just see a number.

It sees: 82°F (temperature), 20 degrees above the DMA baseline of 62°F (a +2.8σ anomaly), Day 2 of a warming trend (d₁ = +3.1°F/day), accelerating (d₂ = +0.8°F/day²), with UV 8, humidity 45%, wind 7 mph, no rain in 6 days (dry streak building), and a 72-hour forecast showing cooling by Thursday.

That’s the weather tensor — and every element matters.
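As a sketch, the anomaly and derivative terms fall out of a short temperature history (field names here are illustrative, not the model's actual schema):

```python
def tensor_features(temps: list[float], baseline: float, seasonal_std: float) -> dict:
    """Derive anomaly and trend terms from the last few daily readings."""
    d1 = temps[-1] - temps[-2]             # rate of change, °F/day
    d2 = d1 - (temps[-2] - temps[-3])      # acceleration, °F/day²
    return {
        "temperature_f": temps[-1],
        "anomaly_sigma": (temps[-1] - baseline) / seasonal_std,
        "d1_f_per_day": d1,
        "d2_f_per_day2": d2,
    }

# Seattle warming run: 76.6 → 78.9 → 82.0°F against a 62°F baseline.
f = tensor_features([76.6, 78.9, 82.0], baseline=62.0, seasonal_std=7.1)
print({k: round(v, 1) for k, v in f.items()})
# anomaly ≈ +2.8σ, d1 ≈ +3.1°F/day, d2 ≈ +0.8 (accelerating)
```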

04

The Network

James and Stein proved something in 1961 that startled the statistics community: when you estimate three or more quantities, estimating each one separately is inadmissible. A shrinkage estimator that pools them always achieves lower total squared error. The result is so counterintuitive it sparked a decade of debate. Efron & Morris (1975) demonstrated it on baseball batting averages, cutting estimation error by 71%.

Cross-Tenant Learning

Most advertisers on the platform have limited history: perhaps 6-12 months of daily data across a handful of markets. Estimating market-category-specific demand elasticities from this alone is noisy. We address this with empirical Bayes shrinkage: each tenant’s parameter estimates are pulled toward the network-wide posterior, with the degree of pull determined by data quality.

A new sunscreen brand joining the platform instantly inherits UV-demand patterns learned across all personal care brands on the network. A mature brand with two years of data retains its own estimates with minimal shrinkage. The math is settled (James & Stein, 1961; Efron & Morris, 1975). The question is not whether to pool — it is how aggressively (Gelman & Hill, 2006).
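The shrinkage rule itself is one line. A sketch with invented numbers (the real system estimates the pool mean and between-tenant spread τ from the network):

```python
def shrink(estimate: float, se: float, pool_mean: float, tau: float) -> float:
    """Empirical Bayes partial pooling: noisy tenants borrow more from the pool."""
    b = se**2 / (se**2 + tau**2)   # shrinkage weight, 0 (trust tenant) .. 1 (trust pool)
    return b * pool_mean + (1 - b) * estimate

pool_mean, tau = 0.20, 0.10        # network-wide UV elasticity and its spread

new_brand = shrink(0.50, se=0.30, pool_mean=pool_mean, tau=tau)     # 6 months of data
mature_brand = shrink(0.50, se=0.03, pool_mean=pool_mean, tau=tau)  # 2 years of data
print(f"new: {new_brand:.2f}  mature: {mature_brand:.2f}")
# The new brand is pulled hard toward 0.20; the mature brand keeps nearly its own estimate.
```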

Safety Guards

Sign Agreement
Tenants pool only when their temperature coefficients agree in sign; a sunscreen brand and a ski-equipment brand pull in opposite directions, so they never share a pool.
Dispersion Guards
If cross-tenant variance is too high, the pool is uninformative and shrinkage is disabled.
Range Ratio
Tenant estimates must fall within a plausible range of the pool to be included.
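Under assumed thresholds (the actual guard parameters aren't published), the three checks compose into a single eligibility test:

```python
def can_pool(tenant_beta: float, pool_betas: list[float],
             max_cv: float = 1.0, max_ratio: float = 5.0) -> bool:
    """Gate a tenant's estimate before empirical Bayes shrinkage is applied."""
    pool_mean = sum(pool_betas) / len(pool_betas)

    # Sign agreement: opposite-signed coefficients never pool.
    if tenant_beta * pool_mean <= 0:
        return False

    # Dispersion guard: a pool with huge cross-tenant variance is uninformative.
    var = sum((b - pool_mean) ** 2 for b in pool_betas) / len(pool_betas)
    if var ** 0.5 / abs(pool_mean) > max_cv:
        return False

    # Range ratio: the tenant must sit within a plausible multiple of the pool.
    ratio = tenant_beta / pool_mean
    return 1 / max_ratio <= ratio <= max_ratio

print(can_pool(0.30, [0.20, 0.25, 0.30]))    # plausible estimate → True
print(can_pool(-0.10, [0.20, 0.25, 0.30]))   # wrong sign → False
```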
05

Integration

The API accepts weather context and returns dollar-denominated spend recommendations with full evidence trails. Built on FastAPI with OpenAPI schema generation. All endpoints are versioned under /v1 and require API key authentication via the X-WV-Key header. Responses include product-level lift estimates, budget shift amounts in USD, confidence intervals, and the specific conditions that would reverse the recommendation.

Sample Request

terminal
curl -X POST https://api.weathervane.ai/v1/recommend \
  -H "X-WV-Key: wv_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "brand_id": "sunco-sunscreen",
    "dma": "phoenix-az",
    "date_range": "2026-06-14/2026-06-16",
    "weather_context": {
      "temperature_f": 103,
      "uv_index": 9.2,
      "humidity_pct": 18,
      "wind_mph": 5,
      "cloud_cover_pct": 8,
      "temp_derivative_1": 2.5,
      "consecutive_hot_days": 3
    }
  }'

Sample Response

terminal
{
  "recommendation": {
    "action": "increase_spend",
    "budget_shift_usd": 23400,
    "confidence_interval_90": [19800, 27100],
    "products": [
      {
        "sku": "spf50-beach-lotion",
        "lift_pct": 41,
        "note": "UV-driven, peak elasticity zone"
      },
      {
        "sku": "after-sun-aloe",
        "lift_pct": 28,
        "note": "24h UV lag → burn-care window"
      },
      {
        "sku": "zinc-face-shield",
        "lift_pct": 18,
        "note": "Outdoor athlete segment"
      }
    ],
    "break_conditions": [
      "cloud_cover > 60% → pause recommendation",
      "temperature > 105F → monitor for indoor-inversion",
      "forecast_uncertainty > 8F → widen intervals"
    ]
  },
  "evidence_bundle_url": "/v1/evidence/phx-20260614-sunscreen"
}

Webhook Pattern

Register a webhook URL to receive real-time alerts when weather conditions cross actionable thresholds. Webhooks fire within 15 minutes with the full evidence bundle: the triggering weather signal, the dollar-denominated recommendation, the confidence interval, and the conditions that would reverse the call.

terminal
curl -X POST https://api.weathervane.ai/v1/webhooks \
  -H "X-WV-Key: wv_live_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-dsp.com/wv-trigger",
    "events": [
      "weather_shock.uv_spike",
      "weather_shock.heat_wave",
      "weather_shock.cold_snap",
      "weather_shock.precip_event"
    ],
    "dma_filter": ["phoenix-az", "denver-co", "chicago-il"],
    "categories": ["sunscreen", "cold-beverage", "outerwear"],
    "min_budget_shift_usd": 5000
  }'
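On the receiving side, a handler only needs the fields shown in the sample response above. A minimal dispatch sketch (the apply/review/ignore policy is an assumption, not a prescribed integration):

```python
def handle_weather_shock(payload: dict, min_shift_usd: float = 5000) -> str:
    """Route an incoming Thagorus webhook to apply / review / ignore."""
    rec = payload["recommendation"]
    if rec["budget_shift_usd"] < min_shift_usd:
        return "ignore"                 # below the floor we registered
    lo, _hi = rec["confidence_interval_90"]
    if lo <= 0:
        return "review"                 # interval crosses zero: a human decides
    return "apply"                      # confident and material: push to the DSP

event = {"recommendation": {
    "action": "increase_spend",
    "budget_shift_usd": 23400,
    "confidence_interval_90": [19800, 27100],
}}
print(handle_weather_shock(event))      # → apply
```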

Getting Started

1
Connect a data source
We support NOAA, OpenWeather, Tomorrow.io, or bring your own. Thagorus normalizes everything into the 9-signal tensor the model expects.
2
Define your product mappings
Sunscreen maps to UV + temp + humidity. Hot cocoa maps to temp + wind chill + precip. HVAC maps to heating-degree-days + forecast delta. You define the mapping; the model learns the elasticities.
3
Run a backtest
Replay 90 days of history to see how weather explains variance you’ve been attributing to creative or pricing. That unexplained spike in Denver last March? A 3-day dry streak after weeks of rain. The model finds it; you verify it.
4
Go live
Enable real-time recommendations. When conditions cross actionable thresholds, your DSP gets a budget-shift webhook within 15 minutes with dollar amounts, confidence intervals, and break conditions.
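The backtest in step 3 boils down to one question: how much daily variance does a weather model explain over a baseline that ignores weather? A self-contained sketch on simulated history (the 90-day series and elasticity are invented):

```python
import numpy as np

rng = np.random.default_rng(7)
days = 90
shock = rng.normal(size=days)                          # daily weather shocks (σ units)
sales = 100 + 8 * shock + 2 * rng.normal(size=days)    # weather moves the needle

# Baseline "model": the historical mean. Weather model: OLS on the shock.
slope, intercept = np.polyfit(shock, sales, 1)
pred = intercept + slope * shock

ss_tot = ((sales - sales.mean()) ** 2).sum()
ss_res = ((sales - pred) ** 2).sum()
r2 = 1 - ss_res / ss_tot
print(f"variance explained by weather: {r2:.0%}")
```

On real history the interesting output is not the headline R² but which days the weather model explains that creative and pricing could not.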
Ready to integrate?
We will provision sandbox credentials and walk you through the integration.
06

Safety & Guardrails

Thagorus is designed to be wrong safely. Every automated system eventually encounters conditions outside its training distribution. The question is not whether the model will be wrong — it is whether the system degrades gracefully when it is.

Confidence Intervals
“Sunscreen lift: +41%, 90% CI [$38K, $52K].” Every recommendation ships with a calibrated interval. When Phoenix enters an unprecedented 118°F event, intervals widen automatically — the model knows it’s extrapolating.
Break Conditions
“cloud_cover > 60% → pause; temp > 105°F → monitor indoor-inversion; forecast_uncertainty > 8°F → widen intervals.” Every recommendation tells you exactly what would make it wrong.
Shadow Mode
New tenants start in shadow mode: the model says “shift $23K to sunscreen in Phoenix” but doesn’t act. You review the evidence bundle, compare to your own sales data, and approve. Only after calibration validates against your history does the system go autonomous.
Circuit Breakers
If prediction errors exceed rolling thresholds — say, the model called a $23K sunscreen lift but actuals came in at $9K three times running — the system widens intervals, reduces autonomy, caps budget shifts, and alerts you. It degrades gracefully, not silently.
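The breaker logic is a small state machine. A sketch with assumed thresholds (three consecutive misses of more than 50%; the production values aren't published):

```python
class CircuitBreaker:
    """Trip after repeated large prediction misses; reset on any good call."""

    def __init__(self, max_misses: int = 3, miss_ratio: float = 0.5):
        self.max_misses = max_misses
        self.miss_ratio = miss_ratio
        self.streak = 0
        self.tripped = False

    def record(self, predicted_usd: float, actual_usd: float) -> bool:
        miss = abs(actual_usd - predicted_usd) / predicted_usd > self.miss_ratio
        self.streak = self.streak + 1 if miss else 0
        if self.streak >= self.max_misses:
            self.tripped = True     # downstream: widen CIs, cap shifts, alert
        return self.tripped

cb = CircuitBreaker()
for _ in range(3):                  # called a $23.4K lift; actuals came in at $9K
    cb.record(23_400, 9_000)
print(cb.tripped)                   # → True
```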

We pre-register four evaluation components that customers can run on their own data without our involvement: decision lift (does the policy improve incremental profit vs. baselines?), calibration (do confidence intervals match realized coverage?), ablation (does removing weather features degrade performance specifically in weather-sensitive categories?), and safety audit (how often do circuit breakers trigger, and what is the distribution of drawdowns?).
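The calibration component, for instance, is directly checkable: a 90% interval should contain the realized number about 90% of the time. A sketch any customer could run on their own logs (the data layout is assumed):

```python
def interval_coverage(intervals: list[tuple[float, float]], actuals: list[float]) -> float:
    """Fraction of realized outcomes that landed inside their stated interval."""
    hits = sum(lo <= a <= hi for (lo, hi), a in zip(intervals, actuals))
    return hits / len(actuals)

# Three past recommendations (intervals in USD) against realized lift.
predicted = [(19_800, 27_100), (5_000, 9_000), (38_000, 52_000)]
realized = [23_400, 4_100, 44_500]
cov = interval_coverage(predicted, realized)
print(f"coverage: {cov:.0%} (target: 90%)")   # → 67%
```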

07

Roadmap

Thagorus is honest about what exists and what is planned. The science is designed. The first pipeline is built. Real-world validation with design partners is the current milestone.

  • Weather-shock identification v1.0
    Natural experiments using weather deviations across markets for continuous causal identification.
  • Hierarchical partial pooling v1.0
    Empirical Bayes shrinkage across tenants and categories. James-Stein theorem guarantees improvement for 3+ entities.
  • Evidence bundles v1.0
    Confidence intervals, break conditions, backtests, and full audit trails with every recommendation.
  • Shadow mode with human-in-the-loop v1.0
    Model generates recommendations; humans approve. Safety default for all new tenants.
  • Deep learning demand models
    Cross-attention weather encoding, temporal transformers, and time-series foundation model integration for zero-shot new category onboarding.
  • Reinforcement learning optimization
    Weather-contingent budget policies with model predictive control. Closed-loop optimization that adjusts as forecasts update.
  • Federated cross-tenant intelligence
    Privacy-preserving cross-tenant learning. Each tenant contributes to the network without exposing raw data.
  • Cross-economy demand graph
    A heatwave in Phoenix increases sunscreen demand, shifts discretionary spending in Tucson, reroutes cold-chain logistics from Dallas, and moves insurance pricing in Scottsdale. These cascades emerge from the graph.

If you’ve read this far, you should probably talk to us.

08

References

The LCDM draws on a deep body of work in causal inference, hierarchical modeling, control theory, and marketing science. Key references:

  1. Wright, P. G. (1928). The Tariff on Animal and Vegetable Oils. Macmillan. First instrumental variables estimation in econometric history, using weather to identify demand elasticities.
  2. James, W. & Stein, C. (1961). Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium. Proved that pooled estimation always dominates individual estimation for three or more quantities.
  3. Efron, B. & Morris, C. (1975). Data analysis using Stein’s estimator and its generalizations. Journal of the American Statistical Association, 70(350), 311-319. 71% error reduction demonstrated with baseball batting averages.
  4. Dell, M., Jones, B. F., & Olken, B. A. (2014). What do we learn from the weather? The new climate-economy literature. Journal of Economic Literature, 52(3), 740-798. Canonical survey of 83 papers using weather as an instrument for causal identification.
  5. Gordon, B. R., Zettelmeyer, F., Bhatt, N., & Dias, F. (2019). A comparison of approaches to advertising measurement. Marketing Science, 38(6), 913-940. Documents the failure of observational MMMs vs. randomized experiments.
  6. Shapiro, B. T., Hitsch, G. J., & Tuchman, A. E. (2021). TV advertising effectiveness and profitability. Econometrica, 89(4), 1855-1879. Panel data with instrumental variables for ad effectiveness measurement.
  7. Gelman, A. & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. Practical framework for partial pooling and hierarchical models.
  8. Chernozhukov, V. et al. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1-C68. Orthogonal scores for causal estimation with high-dimensional nuisance parameters.