Most large financial institutions have stress testing programs that appear complete. Scenario design, macro-linkage calibration, and capital projection frameworks all meet DFAST, CCAR, and IFRS 9 expectations on paper.
The methodology is sound. The credit data feeding those scenarios is often not.
Traditional rating agencies cover only about 10–15% of entities in a typical bank portfolio, primarily large public issuers. The remaining exposures—including private companies, middle-market borrowers, and unrated subsidiaries—carry no external ratings. This reflects a broader structural gap: fewer than 15% of companies globally have a public credit rating, leaving most corporate lending exposures without an independent benchmark (BIS; OECD).
For these borrowers, stressed probability of default (PD) assumptions rely entirely on internal calibration. Even well-designed scenarios cannot compensate for the absence of independent reference points.
As supervisors intensify scrutiny of model inputs, the question institutions face is no longer whether a stress testing framework exists, but whether the credit assumptions inside it can withstand regulatory review.
Four structural challenges consistently emerge:
- Coverage gaps: Most bank portfolios are dominated by unrated borrowers with no external benchmark.
- Model validation pressure: Supervisors increasingly require independent benchmarking of PD assumptions.
- Low-default portfolios: Corporate lending portfolios often experience too few defaults for statistical backtesting to validate risk models.
- Forward-looking calibration: Stress testing requires projecting borrower risk beyond historical experience.
These challenges explain why institutions with sophisticated stress testing frameworks still struggle to produce credit assumptions that supervisors consider fully defensible.
This article examines why the gap persists, why regulators are increasingly challenging it, and how institutions can close it.
What Regulators Are Actually Asking About
Supervisors increasingly expect independent data validation of the credit assumptions that feed stress test results. In the context of CCAR, these PDs are point-in-time (PIT) measures calibrated to stressed macroeconomic scenarios, rather than through-the-cycle (TTC) estimates. External benchmarks such as consensus PDs do not replace PIT modeling, but provide an important validation reference point for those assumptions. Relying solely on internal model outputs, without corroborating external evidence, attracts scrutiny during examinations.
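The PIT-versus-TTC distinction can be made concrete. A common way to condition a through-the-cycle PD on a stressed macro state is the Vasicek one-factor transformation; the sketch below is an illustration of that standard formula, not the calibration approach of any particular bank or supervisor, and the correlation and shock values are assumptions chosen for the example.

```python
from statistics import NormalDist

N = NormalDist()  # standard normal distribution

def stressed_pit_pd(ttc_pd: float, rho: float, z_stress: float) -> float:
    """Map a through-the-cycle PD to a point-in-time PD conditional on a
    systematic factor realization z_stress (Vasicek one-factor model).
    Negative z_stress represents an adverse macroeconomic state."""
    numerator = N.inv_cdf(ttc_pd) - rho ** 0.5 * z_stress
    return N.cdf(numerator / (1.0 - rho) ** 0.5)

# A 1% TTC PD under a two-sigma adverse systematic shock,
# with an assumed asset correlation of 0.20
pd_stressed = stressed_pit_pd(0.01, rho=0.20, z_stress=-2.0)
```

Under these assumed inputs the conditional PD rises to several times the TTC level, which is exactly the kind of scenario-sensitive input that external benchmarks are used to sanity-check.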
Stress testing expectations continue to expand. Regulators increasingly expect institutions to demonstrate not only capital resilience but also the credibility of the data feeding their models. As stress testing evolves from a compliance exercise into a supervisory diagnostic tool, scrutiny of credit assumptions is likely to intensify.
This expectation runs through every major framework institutions operate under.
Four Frameworks, One Expectation
Independent regulatory regimes converge on the same requirement: external corroboration of internal credit assessments
SR 11-7: The Three-Element Validation Standard
SR 11-7, the Federal Reserve and OCC’s supervisory guidance on model risk management, establishes a three-element validation framework: conceptual soundness, ongoing monitoring including benchmarking, and outcomes analysis. The benchmarking element specifically calls for comparison of internal credit estimates against external data sources, not just backtesting against a bank’s own loss history.
When PD inputs lack external validation, the issue is not simply incomplete data. It becomes a model risk concern. Under SR 11-7, supervisors expect institutions to demonstrate that model inputs are independently benchmarked where possible. Without external evidence, stressed PD assumptions may fail validation even when the model architecture itself is sound.
SR 11-7 is guidance rather than binding regulation, but examiners evaluate institutions against it, and the consequences of falling short are concrete.
The ECB’s Targeted Review of Internal Models identified over 5,000 findings across participating institutions, resulting in a €275 billion increase in RWA and a 70 basis point average CET1 decline. Banks whose IRB models lacked sufficient external validation faced model rejection and were forced to revert to standardized approaches with substantially higher capital requirements.
Basel IRB, the Output Floor, and IRB Nexus
Under the Basel IRB framework, banks using internal PD estimates must demonstrate to supervisors that those estimates are sound and appropriate. The Basel Committee’s validation studies identify benchmarking against external sources as a valuable complement to backtesting, particularly where statistical tests alone lack the power to confirm whether internal rating systems are performing adequately.
Many corporate lending portfolios qualify as low-default portfolios (LDPs), where observed defaults are too rare to support robust statistical validation; internal loss histories simply do not contain enough defaults to backtest PD estimates. Basel guidance therefore highlights external benchmarking as a key complement to internal validation.
Basel IV’s 72.5% output floor raises the stakes directly: it caps the capital benefit banks can derive from internal models, making PD calibration accuracy consequential for capital efficiency. Estimates that are too conservative waste capital. Estimates that are too optimistic invite supervisory pushback.
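The output floor mechanism itself is simple to state. A minimal sketch of the calculation, using illustrative RWA numbers rather than any real bank's figures:

```python
def floored_rwa(irb_rwa: float, sa_rwa: float, floor: float = 0.725) -> float:
    """Basel IV output floor: RWA under internal models cannot fall below
    72.5% of the RWA computed under the standardized approach."""
    return max(irb_rwa, floor * sa_rwa)

# Internal models reduce capital requirements only down to the floor
rwa_binding = floored_rwa(irb_rwa=60.0, sa_rwa=100.0)   # floor binds
rwa_unbound = floored_rwa(irb_rwa=80.0, sa_rwa=100.0)   # IRB estimate stands
```

Once the floor binds, further reductions in modeled PDs yield no capital benefit, which is why calibration accuracy rather than conservatism alone drives capital efficiency.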
For institutions managing low- and no-default portfolios, the benchmarking requirement is especially difficult to satisfy through backtesting alone, because observed defaults are simply too rare to generate statistically robust results. This is the precise problem that Credit Benchmark and Oliver Wyman developed IRB Nexus to address.
Cem Dedeaga, Partner at Oliver Wyman, described it as having “the potential to be transformative for banks looking to significantly bolster their historical credit analytics data feeding their capital models,” a recognition that consensus data from institutions with actual lending exposure provides what backtesting cannot.
IFRS 9: Forward-Looking ECL Requirements
IFRS 9 requires banks to estimate expected credit losses using forward-looking information, including multiple scenarios, and to update those estimates at each reporting date. For unrated borrowers, that means defending PD assumptions at every reporting cycle without agency ratings to anchor them.
Depending on the transition matrix and probability of default model selected, the impairment calculation on a single loan can range from $0.5M to $2.5M. That is an audit exposure, not a modeling footnote.
PD assumptions influence multiple regulatory outputs simultaneously. They affect expected credit loss provisioning under IFRS 9 or CECL, risk-weighted assets under IRB frameworks, and capital depletion projections under stress scenarios. Errors in PD calibration, therefore, propagate across multiple regulatory calculations.
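To see how PD errors propagate, consider the basic scenario-weighted ECL structure that IFRS 9 requires. The weights, PDs, LGD, and exposure below are illustrative assumptions, not figures from the article:

```python
def expected_credit_loss(scenarios, lgd, ead):
    """Scenario-weighted ECL in the IFRS 9 style:
    ECL = sum_i weight_i * PD_i * LGD * EAD."""
    total_weight = sum(w for w, _ in scenarios)
    assert abs(total_weight - 1.0) < 1e-9, "scenario weights must sum to 1"
    return sum(w * pd for w, pd in scenarios) * lgd * ead

# Illustrative (weight, 12-month PD) pairs: upside, base, downside
scenarios = [(0.30, 0.010), (0.50, 0.020), (0.20, 0.060)]
ecl = expected_credit_loss(scenarios, lgd=0.45, ead=10_000_000)
```

Because the same PDs also feed RWA and stress projections, a miscalibrated downside PD moves provisions, capital, and stress losses simultaneously.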
Federal Reserve Stress Test: Counterparty Default Component
The Federal Reserve’s 2025 stress test framework requires banks with substantial trading or custodial operations to model losses from the unexpected default of their largest counterparty. Many of those counterparties, including hedge funds, asset managers, and trading firms, are unrated.
Credit Benchmark’s Financial Counterpart Monitor data puts a number on the problem: across roughly 10,600 financial entities tracked on the platform, 92% of asset managers and 92% of sovereign wealth funds carry no rating from the traditional agencies.
A Coverage Gap That Keeps Widening
Private credit’s rapid expansion has widened the gap between where external ratings exist and where institutions actually carry risk. S&P Global data shows the number of private credit borrowers more than doubled from approximately 1,200 in 2021 to over 2,800 by 2023. Meanwhile, Credit Benchmark’s analysis of global default risk trends shows corporate default risks have risen 40% over the past decade, a deterioration unfolding largely in segments where external ratings do not exist.
Traditional rating agencies are structured around large public issuers that pay for ratings to access bond markets. Private companies, middle-market borrowers, hedge funds, and private equity-backed firms don’t issue public debt, so they remain outside the agencies’ core coverage model. Agency coverage is expanding into private credit, but far more slowly than institutional exposure to these segments is growing.
For stress testing, this creates a compounding problem. Banks already face uncertainty in their baseline PD estimates for unrated entities because there is no external benchmark to verify whether internal ratings are accurate, conservative, or too optimistic. Stress testing then asks institutions to project how those same borrowers perform under adverse conditions, amplifying that uncertainty at exactly the point where precision matters most.
How Baseline Uncertainty Compounds Under Stress
Stress testing amplifies the PD estimation problem that already exists in normal conditions
[Figure: external ratings cover roughly 10-15% of a typical portfolio; the remaining 85-90% is unrated, in both the baseline and stressed views.]
The ECB’s 2025 stress test methodology illustrates the practical consequence. For entities not covered by Moody’s, ECB stress simulations use sector-level averages with statistical randomisation, because entity-level data for unrated counterparties does not exist at the scale required.
Why the Usual Workarounds Fall Short
Most institutions recognise the coverage gap and attempt to address it. Three common approaches dominate, but none reliably resolve the validation problem.
Internal Historical Loss Data
Internal loss data reflects a bank’s own credit experience from prior cycles. It is backward-looking by design and bounded by conditions the portfolio has already encountered. Stress scenarios are meant to capture conditions outside that experience, which is precisely why SR 11-7 calls for external benchmarking as a complement to internal validation.
The calibration problem this creates is real. As the CCO at a major US bank with approximately $150B in assets put it when evaluating their approach: without external benchmarks, “the bank couldn’t determine whether its credit views were too conservative, too aggressive, or appropriately calibrated relative to market consensus.” Backtesting against internal loss data cannot answer that question.
Sector Level Proxies
Sector-level proxies appear to eliminate the need for entity-level data, but applying a sector average probability of default (PD) to individual borrowers can mask the deterioration that stress testing is designed to detect. Stress tests operate at the level of individual obligors, translating macroeconomic shocks into borrower-level changes in default risk.
Without entity-level benchmarks, scenario calibration relies entirely on internal assumptions. Credit quality at a single company can deteriorate sharply while the sector average remains stable, creating a misleading picture of resilience at the portfolio level. Small differences in PD calibration can therefore produce materially different projections of losses and capital depletion.
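The masking effect is easy to demonstrate with a toy portfolio. In this hypothetical example (all figures assumed), one borrower's PD deteriorates sixfold while the sector average barely moves:

```python
# Hypothetical sector of 20 borrowers, each at a 1.0% PD at baseline
n = 20
baseline = [0.010] * n
# One borrower deteriorates sixfold; the other 19 are unchanged
current = [0.010] * (n - 1) + [0.060]

sector_avg_before = sum(baseline) / n   # 1.00%
sector_avg_after = sum(current) / n     # 1.25%

# Entity-level risk rose 6x, but a sector-proxy PD would move only
# from 1.00% to 1.25%, hiding the single-name deterioration
entity_multiple = current[-1] / baseline[-1]
```

A stress test calibrated on the sector proxy would apply a near-unchanged PD to the deteriorating borrower, precisely the failure mode supervisors probe.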
Supervisors increasingly challenge proxy-based calibration when it substitutes for entity-level evidence rather than supplementing it.
Traditional Rating Agency Coverage
Agencies are economically incentivized to concentrate on large public issuers. Private companies, middle-market borrowers, and fund counterparties do not issue public debt, so they remain unrated. Institutions relying solely on agency coverage are effectively monitoring only 10-15% of their actual portfolio exposure.
Timing compounds the problem. Quarterly or semi-annual review cycles mean agency ratings often reflect information that markets have already priced in. By the time a downgrade is issued, the risk has typically materialised, and losses may have already occurred.
For stress testing, where the value lies in anticipating deterioration before it shows up in reported losses, stale credit views defeat the purpose.
Three Standards for Defensible External Validation
“Are the PD inputs you use for stress tests benchmarked against independent views from institutions with actual exposure to the same borrowers?”
That’s the question supervisors consistently raise. To answer it credibly, external validation data needs to meet three standards.
Entity-level granularity: Sector averages do not satisfy supervisory expectations. Examiners want to see how individual borrower risk compares to external benchmarks under stress, not whether a portfolio’s average PD aligns with an industry figure.
Timeliness: Quarterly-lagged data does not reflect stress scenario conditions as they unfold. Validation is most credible when it draws on credit views that update frequently enough to capture deterioration in near real time.
Unrated coverage: External validation that only covers rated entities misses the point. It must extend to the unrated borrowers that make up the majority of most portfolios.
Consensus credit data, built by aggregating views from banks with direct lending exposure, meets all three.
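In practice, model validation teams often operationalize external benchmarking as a deviation screen. The sketch below is a hypothetical validation rule, not Credit Benchmark's methodology; the entity names, PDs, and twofold tolerance are all assumptions for illustration:

```python
def flag_pd_outliers(internal_pds, consensus_pds, tolerance=2.0):
    """Flag entities whose internal PD deviates from an external consensus
    PD by more than `tolerance`-fold in either direction, or that lack an
    external benchmark entirely (hypothetical validation screen)."""
    flags = {}
    for entity, pd_internal in internal_pds.items():
        pd_external = consensus_pds.get(entity)
        if pd_external is None:
            flags[entity] = "no external benchmark"
        elif pd_internal > tolerance * pd_external:
            flags[entity] = "more conservative than consensus"
        elif pd_internal * tolerance < pd_external:
            flags[entity] = "more optimistic than consensus"
    return flags

flags = flag_pd_outliers(
    internal_pds={"CorpA": 0.010, "CorpB": 0.050, "CorpC": 0.001},
    consensus_pds={"CorpA": 0.012, "CorpB": 0.010, "CorpC": 0.010},
)
```

A screen like this gives examiners exactly what they ask for: documented, entity-level evidence of where internal views sit relative to the market consensus.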
Common Workarounds Fail the Standards Supervisors Apply
Three approaches institutions rely on, tested against three requirements for defensible external validation
| Approach | Entity-Level Granularity | Timeliness | Unrated Coverage |
|---|---|---|---|
| Internal Historical Loss Data | ◐ Partial: reflects own portfolio only | ✗ Fails: backward-looking by design | ◐ Partial: only covers own borrowers |
| Sector-Level Proxies | ✗ Fails: masks entity-specific deterioration | ◐ Partial: depends on source; often lagged | ✓ Pass: sector averages cover all segments |
| Agency Rating Coverage | ✓ Pass: issuer-level assessments | ✗ Fails: quarterly/semi-annual cycles | ✗ Fails: 10-15% coverage of typical portfolio |
| Consensus Credit Data | ✓ Pass: 120,000+ entities at LEI level | ✓ Pass: weekly updates from 40+ banks | ✓ Pass: 90%+ of entities are unrated |
No single workaround meets all three standards. Each addresses a piece of the validation problem while leaving critical gaps elsewhere.
How Consensus Credit Data Fills the Gap
Credit Benchmark aggregates anonymized credit assessments from 40+ global banks, nearly half of them GSIBs, into weekly consensus ratings and PD estimates. The methodology provides entity-level views on 120,000+ entities across 160 countries, the vast majority of which carry no rating from S&P, Moody’s, or Fitch.
Consensus data aggregates independent credit assessments from institutions with direct lending exposure to the same borrowers. By combining multiple internal credit views, the approach reduces single-model bias and provides an empirical view of how market participants collectively perceive credit risk as it evolves.
The discriminatory power of that consensus is independently validated. A 2025 analysis covering ten years of data shows Credit Benchmark’s 1-year Gini coefficient at 0.88, compared to S&P’s 0.91 across their overlapping rated universe.
A peer-reviewed study in the 2024 Review of Accounting Studies confirmed that consensus data improves default prediction accuracy relative to traditional agency ratings. These are the kinds of statistical benchmarks that model risk management teams and external auditors can evaluate directly.
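The Gini coefficient cited above is a standard discriminatory-power statistic that validation teams can compute themselves. A minimal sketch, using the equivalent accuracy-ratio formulation (2*AUC - 1) on toy data, not on Credit Benchmark's or S&P's actual samples:

```python
def gini_coefficient(pds, defaults):
    """Accuracy ratio (Gini) of a PD ranking: 2*AUC - 1, where AUC is the
    probability that a randomly chosen defaulter was assigned a higher PD
    than a randomly chosen survivor (ties count half)."""
    defaulters = [p for p, d in zip(pds, defaults) if d]
    survivors = [p for p, d in zip(pds, defaults) if not d]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in defaulters for b in survivors)
    auc = wins / (len(defaulters) * len(survivors))
    return 2.0 * auc - 1.0

# Perfect ranking: both defaulters carried the two highest PDs
g = gini_coefficient([0.002, 0.010, 0.080, 0.150], [0, 0, 1, 1])
```

A Gini of 1.0 means perfect rank-ordering and 0.0 means no discriminatory power, which is why the 0.88 versus 0.91 comparison is a meaningful head-to-head statistic.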
For stress testing, three capabilities are most relevant:
Weekly PD curves for peer benchmarking: Weekly updates enable institutions to compare internal stress assumptions against the views of peer banks with actual lending exposure to the same borrowers. That cadence captures credit deterioration 6-8 months ahead of agency downgrades, giving risk teams the early warning signals supervisors look for.
IFRS 9 impairment is assessed at reporting dates, not continuously. However, more frequent updates provide forward-looking signals between reporting periods, supporting earlier identification of deterioration and strengthening the validation of PD assumptions used in those quarterly calculations.
Empirical migration data for stressed ECL calculations: Credit Benchmark’s migration probabilities support stressed ECL calculations under IFRS 9 by providing empirical transition data across credit categories under adverse conditions. Over 350 industry and geography-specific transition matrices are available, built from 10-year monthly histories across 118,000+ borrowers.
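Mechanically, a transition matrix is applied by multiplying the portfolio's rating distribution by the matrix each period. The sketch below uses an invented 3-state matrix purely for illustration; it is not one of Credit Benchmark's published matrices:

```python
def migrate(distribution, matrix):
    """Apply a one-period rating transition matrix to a portfolio
    distribution: new_j = sum_i dist_i * M[i][j]."""
    states = range(len(matrix[0]))
    return [sum(d * row[j] for d, row in zip(distribution, matrix))
            for j in states]

# Illustrative 3-state stressed matrix (assumed figures):
# states are investment grade, high yield, default (absorbing)
stressed_matrix = [
    [0.90, 0.09, 0.01],
    [0.05, 0.85, 0.10],
    [0.00, 0.00, 1.00],
]
portfolio = [0.70, 0.30, 0.00]   # 70% investment grade, 30% high yield
after = migrate(portfolio, stressed_matrix)
one_period_default_rate = after[2]   # mass that migrated into default
```

Swapping in an empirical stressed matrix for a given industry and geography is what turns this mechanical step into a defensible ECL input.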
Coverage of the exposures that matter most: The consensus universe extends to the unrated entities where supervisory scrutiny is most intense and where other data sources go quiet.
Credit Benchmark does not replace internal models or agency ratings. Instead, it provides the independent external benchmark for the large segment of global credit exposures those sources do not cover.
Stress Testing Unrated Counterparties in Practice: The CDCC
The Canadian Derivatives Clearing Corporation (CDCC) ran directly into the unrated exposure problem.
As a central counterparty, CDCC must stress-test clearing members’ creditworthiness under adverse scenarios. The 2018 Nasdaq Commodities default, which resulted in a $130M loss and triggered a default waterfall, is a reminder of what happens when a CCP encounters a deteriorating counterparty without adequate forward-looking credit intelligence. For CDCC, traditional agency data covered only a fraction of its 30+ clearing members, and the majority were unrated.
To close the gap, CDCC integrated Credit Benchmark’s consensus ratings and PD data for continuous monitoring and stress testing of clearing members without external credit ratings. Vladimir Levtsun, Acting Director of Financial Resilience Risk at CDCC, described what changed: “Credit Benchmark’s data has contributed to directly strengthening our ability to manage counterparty risk and enhance internal reporting, leading to more confident, proactive risk decisions.”
With entity-level credit views on counterparties previously assessed only through internal models, CDCC gained the independent validation that supervisors expect. Weekly consensus updates allow margin requirement adjustments before financial stress materialises rather than in response to it.
From Data Gap to Regulatory Defensibility
Supervisors expect external validation for stressed PD assumptions. Without it, even rigorous scenarios and well-built models can attract examination findings.
The first step is a coverage assessment to determine how much of your portfolio Credit Benchmark already covers before committing to a full engagement. Understanding the overlap between your unrated exposures and Credit Benchmark’s consensus universe is the fastest way to quantify the validation gap you can close.
Request a portfolio coverage assessment today.