PPA Pricing Methodology — How DealFlow Sources US PPA Prices

Why this is hard

US PPA pricing is opaque on purpose. Counterparties treat strike prices as commercially sensitive, regulators redact them for 1–3 years (CPUC: 3 years post-delivery start; FERC: rate schedules sometimes filed as “see attached” exhibits), and most of the corporate VPPA market settles financially without ever crossing FERC jurisdiction. There is no single canonical source; there are seven partial sources, each with its own coverage gaps and methodology quirks.

DealFlow's approach is to triangulate across all the public sources, label the provenance of every price, and clearly distinguish disclosed from imputed values. We never blend imputed and disclosed numbers in the same metric without flagging it.

The seven layers, ranked by trust

We track seven active provenance values on every PPA, in order of decreasing strike-price reliability:

§1 ferc-eqr / ferc-eqr-utility

FERC Electric Quarterly Report

The top tier. Federal regulator's structured wholesale-power filing, every quarter. Sellers and buyers report transactions including counterparty, contract execution date, term, MW, and (when not redacted) rate in $/MWh. We pull cleaned PUDL parquet from Catalyst Cooperative, filter to long-term renewable contracts signed 2023+, and either backfill disclosed rates onto existing entries (ferc-eqr) or bulk-import new utility-side PPAs (ferc-eqr-utility).

Captures well: utility-side renewable PPAs (developer → IOU/CCA/cooperative), hyperscaler-direct contracts where the hyperscaler is itself a registered FERC marketer (Google Energy LLC, Microsoft Energy LLC).

Misses: corporate VPPAs that settle as financial derivatives without physical delivery into FERC markets. Refresh: quarterly.
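The filter-and-route step described above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: the row field names (technology, execution_date, term_years, seller, buyer), the 10-year reading of "long-term", and the match key are all hypothetical, and the real PUDL columns may differ.

```python
from datetime import date

# Hypothetical field names and thresholds; the real PUDL columns may differ.
RENEWABLE_TECH = {"solar", "wind"}
SIGNED_SINCE = date(2023, 1, 1)
MIN_TERM_YEARS = 10  # assumed meaning of "long-term"


def is_candidate(row):
    """Keep long-term renewable contracts signed in 2023 or later."""
    return (
        row["technology"] in RENEWABLE_TECH
        and row["execution_date"] >= SIGNED_SINCE
        and row["term_years"] >= MIN_TERM_YEARS
    )


def provenance_for(row, existing_keys):
    """Backfill onto a known deal, otherwise import as a new utility-side PPA."""
    key = (row["seller"], row["buyer"], row["execution_date"].year)
    return "ferc-eqr" if key in existing_keys else "ferc-eqr-utility"
```

A row that matches an existing entry only contributes its disclosed rate (ferc-eqr); an unmatched row becomes a new entry tagged ferc-eqr-utility.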

§2 nyserda

NYSERDA Open NY (NY State Tier 1 RES procurement)

NYSERDA runs sealed-bid annual auctions (RESRFP17-1 through RESRFP24-1) for utility-scale solar / wind / hydro with 20-year Tier 1 RES contracts. Counterparties (developer + project SPV LLC) are publicly disclosed via the NY State Open Data Socrata API. The dataset captures 184 RFP awards (2017–2024) totaling ~16 GW gross — split across 19 Operational, 57 Under Construction, and 108 Cancelled. We import all three statuses to support cancellation tracking.

Pricing semantics caveat: fixed_rec_price is disclosed for only 13 of 184 awards (mostly small ≤20 MW from RESRFP17-19), and represents the REC strike component only ($17–27/MWh range), not all-in wholesale + REC compensation. Mixing REC-only strikes with all-in PPA prices would silently drag NYISO Solar/Wind benchmarks down ~50%, so our imputation pipeline explicitly excludes provenance='nyserda' from the disclosed-comparables pool.
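A small sketch of why the exclusion matters, assuming entries are dictionaries with a provenance tag and a hypothetical price field in $/MWh. The numbers are illustrative only and are not DealFlow data.

```python
from statistics import median

# Provenance values whose prices are not all-in strikes (REC-only for NYSERDA).
EXCLUDED_FROM_COMPARABLES = {"nyserda"}


def comparables_pool(entries):
    """Return disclosed all-in prices usable as imputation comparables."""
    return [
        e["price"]
        for e in entries
        if e.get("price") is not None
        and e["provenance"] not in EXCLUDED_FROM_COMPARABLES
    ]


# Illustrative numbers only, not DealFlow data.
sample = [
    {"provenance": "ferc-eqr", "price": 52.0},
    {"provenance": "press", "price": 48.0},
    {"provenance": "nyserda", "price": 21.0},  # REC component only
    {"provenance": "nyserda", "price": 19.0},  # REC component only
    {"provenance": "press", "price": None},    # undisclosed
]
```

With these sample values the filtered pool has a median of 50.0, while a naive pool that keeps the two REC-only strikes would have a median of 34.5.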

Ranks above LBNL on trust because (a) state procurement filings are real contracts, not levelized estimates, (b) counterparty + developer + project LLC are all named explicitly, and (c) cancellation status is regulator-confirmed.

§3 lbnl

LBNL Utility-Scale Solar dataset

LBNL's annual Utility-Scale Solar report (Bolinger, Seel, Mulvaney Kemp, et al.) contains an Individual_Project_Data sheet with 425 PV projects priced in levelized 2024$/MWh. CC-BY 4.0 licensed. Their team aggregates FERC EQR + FERC Form 1 + state PUC filings (CPUC, NYSERDA, MA DPU, NJ BPU, etc.) + trade press + direct outreach to developers — at higher quality than we could manage in-house — and publishes cleaned per-project results annually.

Caveats: LBNL prices are levelized (real, time-averaged over contract term), which is distinct from as-signed contract prices. Counterparty names are NOT in LBNL public data, only a coarse offtaker-type classification. Solar only; wind has a separate anonymized workbook used as a benchmark layer for imputation.

§4 epa-gpp-research

EPA Green Power Partnership discovery

EPA's Green Power Partnership database lists 525 corporate green-power buyers with annual MWh disclosure. Public-domain federal data. We use it as a discovery layer: it tells us who is buying clean power, then we research the underlying contracts via press releases, 10-Ks, and sustainability reports. The discovered PPAs themselves usually don't have $/MWh disclosed.

§5 sec-edgar

SEC EDGAR full-text search of material contract exhibits

Public utilities and IPPs file PPAs as Exhibit 10.x to 10-K, 10-Q, 8-K, S-1, and S-3 filings. The free EFTS API lets us query 2001–present. Our pipeline runs 7 high-precision queries with negative filters that reject sensitivit|hedge|PTC|avoided cost|RNG false-positive patterns within ±300 chars of any $/MWh match.
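The negative-filter step above might look something like this sketch. The alternation of reject terms and the ±300-character window come from the description; the price regex, the function name, and the choice to match the PTC and RNG acronyms case-sensitively (so that a word like "bankruptcy" does not trigger a rejection) are assumptions made for illustration.

```python
import re

# Assumed shape of a $/MWh mention, e.g. "$38.50/MWh" or "$41 per MWh".
PRICE_RE = re.compile(r"\$\s?(\d+(?:\.\d+)?)\s?(?:/|per\s)\s?MWh", re.IGNORECASE)

# Reject terms from the pipeline description. Word stems are matched
# case-insensitively; the acronyms are matched as whole upper-case words.
NEGATIVE_RE = re.compile(r"(?i:sensitivit|hedge|avoided cost)|\bPTC\b|\bRNG\b")

WINDOW = 300  # characters of context on each side of a price match


def candidate_prices(text):
    """Return $/MWh values whose surrounding context has no reject term."""
    prices = []
    for match in PRICE_RE.finditer(text):
        start = max(0, match.start() - WINDOW)
        context = text[start:match.end() + WINDOW]
        if NEGATIVE_RE.search(context):
            continue  # likely a sensitivity table, hedge, tax credit, etc.
        prices.append(float(match.group(1)))
    return prices
```

Because the window is symmetric, one reject term can suppress several nearby prices, which favors precision over recall.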

Caveats: SEC issuers very rarely disclose actual $/MWh strikes for individual PPAs. Richer disclosures are yieldco subsidiary 10-K/A exhibits, M&A 8-Ks discussing acquired-PPA fair-value liabilities, and regulator-filed rate schedules.

§6 press

Manual press-release sourcing + LLM-driven historical backfill

The original DealFlow pipeline. Manual entry from corporate / developer press releases, trade media, sustainability reports. Most entries here do NOT have $/MWh because corporate buyers don't disclose it.

A subset of press entries are LLM-researched 2017–2024 historical backfills (May 2026, parallel research agents covering hyperscaler / industrial-retail / mid-tier-corporate / 2019 / 2022 / 2023 / utility-green-tariff cohorts). Each entry is tagged with the agent label for traceability and audited via the two-pass fact-check protocol below.

§7 priceImputed

LBNL-lookup imputation

For entries without a disclosed $/MWh, we run a deterministic lookup-based imputation. For each unpriced PPA: match by (LBNL region from state) × (LBNL year ±1), look up median $/MWh for matching cluster, apply with confidence label (high / medium / low) based on cluster size, and override if LBNL median >2σ from disclosed comparables.

Disclosed-comparables override: if we have ≥5 disclosed entries for an (ISO × tech) cluster AND the LBNL imputation falls >2σ from our disclosed mean, use the disclosed median instead. Catches LBNL's small-DG skew in some regions (LBNL PJM Solar median ~$90/MWh from small commercial DG vs our utility-scale disclosed mean ~$40/MWh).
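The lookup and override described above can be sketched as follows. Several details are assumptions rather than documented behavior: the cluster-size thresholds behind the confidence labels, the use of the sample standard deviation of the disclosed comparables as σ, the row field names, and the returned keys.

```python
from statistics import mean, median, stdev


def matching_cluster(lbnl_rows, region, year):
    """LBNL prices for the same region and a signing year within +/-1."""
    return [
        r["price"]
        for r in lbnl_rows
        if r["region"] == region and abs(r["year"] - year) <= 1
    ]


def confidence_label(cluster_size):
    """Hypothetical thresholds; the real cut-offs are not documented here."""
    if cluster_size >= 10:
        return "high"
    if cluster_size >= 5:
        return "medium"
    return "low"


def impute(lbnl_prices, disclosed_prices, min_comparables=5, n_sigma=2.0):
    """Median lookup with the disclosed-comparables override."""
    if not lbnl_prices:
        return None  # nothing to impute from
    price, source = median(lbnl_prices), "lbnl"
    if len(disclosed_prices) >= min_comparables:
        mu, sigma = mean(disclosed_prices), stdev(disclosed_prices)
        if abs(price - mu) > n_sigma * sigma:
            price, source = median(disclosed_prices), "disclosed-comparables"
    return {
        "price": price,
        "confidence": confidence_label(len(lbnl_prices)),
        "source": source,
    }
```

For a PJM-like case with an LBNL cluster of [85, 90, 95] and disclosed comparables of [38, 39, 40, 41, 42], the LBNL median of 90 sits far more than 2σ from the disclosed mean of 40, so the sketch returns the disclosed median instead.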

Coverage: 52.9% of total PPAs (482/912) have an imputed price. Combined with the 34.4% disclosed, total pricing visibility is 87.3%. Imputed prices are always labeled separately from disclosed prices on the website.

Utility green-tariff customer rosters (special pattern)

Three utility programs run mass-enrollment renewable subscription products where most subscribers never get their own press release:

DTE MIGreenPower (Michigan, MISO) — Ford 650 MW and Stellantis 400 MW made the news; the rest (Henry Ford Health 87 MW, Detroit Diesel 45 MW, 7-Eleven 14.5 MW, etc.) only surface in DTE's annual Sustainability Report and Michigan PSC dockets.
Xcel Renewable*Connect / Solar*Connect (MN, CO, WI, NM) — PUC Annual Performance Reports redact individual customer names; opaque outside the largest customers' own sustainability disclosures.
TVA Green Invest + Generation Flexibility (TN, MS, AL, KY) — Local Power Companies (KUB, NES, MTE, MLGW, Huntsville Utilities) re-sell TVA capacity; customer enrollments surface in TVA Annual ESG Disclosures and individual LPC press.

We sweep these via dedicated research agents. MIGreenPower entries that report annual MWh only are converted to MW at a 22% capacity factor (~1,930 MWh/year per MW, i.e. 0.22 × 8,760 hours, for MISO/Michigan irradiance). Where a primary source mentions program participation but doesn't publish a firm MW or MWh figure, the entry is rejected.
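The MWh-to-MW conversion is plain arithmetic, sketched here with the capacity factor as a parameter. The function name is hypothetical. With 8,760 hours in a year, a 22% capacity factor corresponds to about 1,927 MWh per MW-year.

```python
HOURS_PER_YEAR = 8760


def mwh_to_mw(annual_mwh, capacity_factor=0.22):
    """Convert reported annual energy (MWh) to nameplate capacity (MW)."""
    if not 0 < capacity_factor <= 1:
        raise ValueError("capacity_factor must be in (0, 1]")
    return annual_mwh / (capacity_factor * HOURS_PER_YEAR)
```

For example, a subscriber reporting 19,272 MWh per year would be entered at roughly 10 MW under the 22% assumption.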

Quality safeguards

Provenance tag on every entry — never blend disclosed, imputed, ferc-eqr, lbnl, nyserda, etc., in any aggregation without flagging.
Outlier rejection on imputation — internal disclosed-comparable override prevents LBNL's small-DG-skewed regional medians from polluting utility-scale imputations. NYSERDA REC-only strikes are explicitly excluded from the comparables pool.
Two-pass fact-check audit on agent-sourced batches — Phase 5 (May 2026) ran a 25-entry random sample audit, surfaced a 20% HIGH-severity rate in the 2020-2021-gapfill batch, then ran a full second-pass audit on the remaining 25 finding another 24% HIGH. Severity ladder:
- HIGH → revert (URL doesn't support claim, MW off >50%, dead URL, fabricated entry, wrong year). 11 reverts applied.
- MEDIUM → in-place correction or annotation. 4 corrections + 7 annotations applied.
- LOW → silent fix (typos, capitalization).
- CLEAN → no action.
Permanent rejection-list registry — src/data/longtail-rejected.json is an append-only list of rules consulted before adding any candidate; matches are silently dropped. Prevents the same fabricated/wrong-source deal from being re-added on every research re-run. Currently seeded with 13 Phase-5 fact-check failures.
Build verify gate — next build must pass clean before any commit; type errors and malformed JSON kill the run.
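The rejection-list check in the safeguards above could work along these lines. The rule schema is an assumption: each rule is taken to hold a "match" object of field/value pairs plus a free-text "reason", which may not reflect the real layout of src/data/longtail-rejected.json.

```python
import json


def load_rules(path="src/data/longtail-rejected.json"):
    """Read the append-only rule list (schema assumed, see lead-in)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def is_rejected(candidate, rules):
    """A rule hits when every field it names equals the candidate's value."""
    for rule in rules:
        fields = rule.get("match", {})
        # An empty rule would match everything, so it is ignored.
        if fields and all(candidate.get(k) == v for k, v in fields.items()):
            return True
    return False


def filter_candidates(candidates, rules):
    """Silently drop candidates that hit any rejection rule."""
    return [c for c in candidates if not is_rejected(c, rules)]
```

Running this filter before any insert is what keeps a previously reverted deal from reappearing on the next research run.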

Coverage by year vs. BNEF

BNEF's annual Corporate Energy Market Outlook reports US corporate PPA volume by signing year. Their figures cover corporate-direct VPPAs and physical PPAs; ours cover corporate + utility-side procurement + state procurement (NYSERDA) + utility green-tariff customer rosters, so we exceed BNEF in some years.

Year	DealFlow GW	BNEF GW	Ratio	Notes
2017	5.94	2.3	exceeds	LBNL utility-side adds
2018	8.38	8.5	99%	well-covered
2019	13.47	13.6	99%	closed in Phase 5
2020	13.05	11.2	exceeds	utility-side adds
2021	24.75	18.7	exceeds	utility-side adds
2022	12.87	20.6	62%	BNEF paid-data territory
2023	20.09	25.0	80%	improved long-tail
2024	31.92	26.5	exceeds	mid-2024 hyperscaler & DTE/TVA
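The Ratio column in the table above can be reproduced with a few lines. This is a sketch using the table's own figures; the function name and the rule that anything above 100% is shown as "exceeds" are inferred from the table rather than taken from the pipeline.

```python
# (year, DealFlow GW, BNEF GW) as listed in the table.
COVERAGE = [
    (2017, 5.94, 2.3),
    (2018, 8.38, 8.5),
    (2019, 13.47, 13.6),
    (2020, 13.05, 11.2),
    (2021, 24.75, 18.7),
    (2022, 12.87, 20.6),
    (2023, 20.09, 25.0),
    (2024, 31.92, 26.5),
]


def coverage_label(dealflow_gw, bnef_gw):
    """Percent of BNEF volume captured, or 'exceeds' when above 100%."""
    ratio = dealflow_gw / bnef_gw
    return "exceeds" if ratio > 1 else f"{ratio:.0%}"


labels = {year: coverage_label(df, bnef) for year, df, bnef in COVERAGE}
```

The two years well below BNEF are 2022 (62%) and 2023 (80%), which is where the long-tail gap discussed next is concentrated.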

Where DealFlow is below BNEF (2022, 2023), the residual gap is dominated by private deals that don't generate public press — sub-50 MW industrial / retail / regional-corporate VPPAs where the buyer didn't issue a release and the deal didn't surface in any developer 10-K, sustainability report, EPA GPP entry, or PUC filing. Honest estimate of the remaining unfindable gap: ~5-8 GW cumulative across 2022-2023.

What we explicitly DON'T claim

Not exhaustive. Honest BNEF triangulation gap (cumulative 2023–2025) is ~10 GW of long-tail <50 MW corporate VPPAs that don't surface in any free public source.
Imputed prices aren't signed-contract accurate. They're regional medians — correct in expectation but specific deals can deviate ±15-30% based on COD timing, escalator structure, REC bundling, and counterparty credit.
Not 100% data integrity. Quality varies by source layer. FERC EQR / NYSERDA / LBNL entries (~458/912 PPAs) are pulled from structured government datasets, so their integrity is high. Agent-researched press entries are uneven: the 2020-2021-gapfill batch ran ~22% HIGH-severity issues in the fact-check audit (11 of 50 entries, now fully resolved). Critical fields (counterparty, year, state, technology) are >97% accurate post-audit.

What we DO claim

Every disclosed price is sourced and traceable. Every entry has a sourceUrl pointing to either a press release, FERC filing, LBNL dataset row, or SEC filing. We don't fabricate prices.
The pricing distribution is consistent with LBNL. Our regional medians (where ≥5 disclosed comparables exist) match LBNL's 2025 Utility-Scale Solar report ranges within ±5%.
Imputed prices have explicit confidence + range + source. Every imputed entry carries priceImputedConfidence, priceImputedRange, priceImputedSource, and priceImputedMatchKey so the basis is fully auditable.
The pipeline is reproducible. Every script is in git and can be re-run from scratch with identical results given the same upstream snapshots.
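The auditability claim for imputed entries lends itself to a simple completeness check. This is a sketch: the four field names come from the list above, while the function name and the sample values in the usage note are hypothetical.

```python
REQUIRED_IMPUTATION_FIELDS = (
    "priceImputedConfidence",
    "priceImputedRange",
    "priceImputedSource",
    "priceImputedMatchKey",
)


def missing_audit_fields(entry):
    """List the imputation audit fields that are absent or empty."""
    return [
        name
        for name in REQUIRED_IMPUTATION_FIELDS
        if entry.get(name) in (None, "")
    ]
```

An imputed entry that carries, say, a confidence of "medium", a range, a source, and a match key returns an empty list; any entry that returns a non-empty list cannot be fully audited.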

Source bibliography

FERC Electric Quarterly Reports (EQR)
PUDL FERC EQR — Catalyst Cooperative
LBNL Utility-Scale Solar (CC-BY 4.0)
LBNL Land-Based Wind Market Report
EPA Green Power Partnership
SEC EDGAR Full-Text Search
NYSERDA Open NY Large-Scale Renewables
DTE MIGreenPower — Sustainability Reports + Michigan PSC dockets
Xcel Renewable*Connect — ESG Reports + state PUC filings
TVA Green Invest — ESG Disclosures + Local Power Company filings
