A Complete Guide to Credit Risk Modelling (2024)

This article explains the basic concepts and methodologies of credit risk modelling and why it matters to financial institutions.

In the credit risk domain, statistics and machine learning play an important role in solving problems related to credit risk. As a result, predictive modelers and data scientists have become increasingly important.

What is Credit Risk?

In simple words, it is the risk of a borrower not repaying a loan, a credit card balance or any other type of credit. Sometimes customers pay some installments of a loan but don't repay the full amount, which includes the principal plus interest.

For example, you took a personal loan of USD 100,000 for 10 years at a 9% interest rate. You paid a few initial installments to the bank but stopped paying afterwards. The remaining unpaid installments are worth USD 30,000. This is a loss to the bank.

Do you remember, or are you aware of, the 2008 global recession? In the US, customers with low creditworthiness were given home loans that were risky because of their high likelihood of default. To compensate for the risk, banks charged high interest rates. Banks then sold these loans to investors as Collateralized Debt Obligations (CDOs), which were considered low-risk from 2004 to 2007. As defaults increased, banks seized (foreclosed) properties. This burst the real estate bubble and caused a sharp decline in home prices. It led to a global recession, as many financial institutions had invested in these funds.


What is Credit Risk Modelling?

Credit risk modelling refers to data-driven risk models that calculate the chance that a borrower defaults on a loan (or credit card). They also estimate how much the borrower owes at the time of default and how much of that outstanding amount the lender would lose. In other words, we need to build probability of default, loss given default and exposure at default models as per the regulatory Basel norms.

Basel Regulations

A committee was set up in 1974 by the central bank governors of the G10 countries. Its purpose is to ensure that banks hold enough capital to give back depositors’ funds. Its members meet regularly at the Bank for International Settlements (BIS) in Basel, Switzerland, to discuss banking supervisory matters. The committee was expanded in 2009 to 27 countries.

Basel I

The Basel I accord is the first official pact, introduced in 1988. It focused on credit risk and introduced the idea of the capital adequacy ratio, which is also known as the Capital to Risk Assets Ratio. It is the ratio of a bank's capital to its risk-weighted assets. Banks needed to maintain a ratio of at least 8%, which means capital should be at least 8 percent of the risk-weighted assets. Capital is an aggregation of Tier 1 and Tier 2 capital.

  1. Tier 1 capital : Primary funding source of the bank. It includes shareholders' equity and retained earnings
  2. Tier 2 capital : Subordinated loans, revaluation reserves, undisclosed reserves and general provisions

In Basel I, fixed risk weights were set based on the type of exposure: 50% for mortgages and 100% for non-mortgage exposures (such as credit cards, overdrafts, auto loans and personal finance). See the example below.

  • Mortgage : $5,000
  • Risk Weight : 50%
  • Risk Weighted Assets : $2,500 (Mortgage * Risk Weight)
  • Minimum Capital Required : $200 (8% * Risk Weighted Assets)
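
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name `minimum_capital` and its parameters are made up here, and the 8% ratio is the Basel I minimum quoted above.

```python
def minimum_capital(exposure, risk_weight, capital_ratio=0.08):
    """Return (risk-weighted assets, minimum capital) for a single exposure."""
    rwa = exposure * risk_weight      # risk-weighted assets
    return rwa, rwa * capital_ratio   # 8% minimum ratio by default


# Mortgage example from the text: $5,000 at a 50% risk weight
rwa, capital = minimum_capital(exposure=5_000, risk_weight=0.50)
print(f"RWA = ${rwa:,.0f}, minimum capital = ${capital:,.0f}")
# prints: RWA = $2,500, minimum capital = $200
```

The same function applies to any fixed risk weight, for example 100% for a credit card exposure.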

Basel II

The Basel II accord was introduced in June 2004 to address the limitations of Basel I. For example, Basel I focused only on credit risk, whereas Basel II covers not only credit risk but also operational and market risk. Operational risk includes fraud and system failures. Market risk includes equity, currency and commodity risk.

In Basel II, there are three ways to estimate credit risk:

  • Standardized Approach
  • Foundation Internal Rating Based (IRB) approach
  • Advanced Internal Rating Based (IRB) Approach

Standardized Approach

For corporate exposures, banks rely on ratings from certified credit rating agencies (CRAs) such as S&P and Moody's to quantify the capital required for credit risk. The risk weight is 20% for highly rated exposures and goes up to 150% for low rated exposures. For retail, the risk weight is 35% for mortgage exposures and 75% for non-mortgage exposures (no rating by credit rating agencies is required for retail).

  • Corporate Exposure : $500,000
  • Credit Assessment : AAA
  • Risk Weight : 20%
  • Risk Weighted Assets : $100,000
  • Minimum Capital Required : $8,000

Internal Ratings Based (IRB) Approach

It has four credit risk components :

  • Probability of Default (PD)
  • Exposure at Default (EAD)
  • Loss given Default (LGD)
  • Effective Maturity (M)

Probability of Default (PD)

Probability of default is the likelihood that a borrower will default on debt (credit card, mortgage or non-mortgage loan) over a one-year period. In simple words, it is the expected probability that a customer fails to repay the loan. It is expressed as a percentage between 0% and 100%. The higher the probability, the higher the chance of default.

Exposure at Default (EAD)

It is the amount we expect to be outstanding in the case of default, i.e. the amount that the borrower owes the bank at the time of default.

Loss given Default (LGD)

It is the share of the outstanding amount that we expect to lose, expressed as a proportion of the total exposure when the borrower defaults. It is calculated as (1 - Recovery Rate).

LGD = (EAD - PV(recovery) + PV(cost)) / EAD

PV(recovery) = Present value of recoveries, discounted to the time of default.
PV(cost) = Present value of recovery costs, discounted to the time of default. Costs reduce the net amount recovered, so they add to the loss.

Someone takes a $100,000 home loan from a bank to purchase a flat. At the time of default, the loan has an outstanding balance of $70,000. The bank forecloses on the flat and sells it for $60,000. EAD is $70,000. LGD is ($70,000 - $60,000) / $70,000, i.e. 14.3%.
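
As a rough sketch, the flat example can be reproduced in Python using LGD = 1 - recovery rate, with the recovery taken net of costs. The function names, the 5% discount rate, the one-year delay and the $2,000 cost in the second case are illustrative assumptions, not figures from the example above.

```python
def present_value(amount, annual_rate, years):
    """Discount a cash flow received `years` after default back to the default date."""
    return amount / (1 + annual_rate) ** years


def loss_given_default(ead, pv_recovery, pv_cost=0.0):
    """LGD = 1 - net recovery rate, where net recovery = recovery minus costs."""
    net_recovery = pv_recovery - pv_cost
    return 1 - net_recovery / ead


# Flat example from the text: no discounting, no costs
lgd = loss_given_default(ead=70_000, pv_recovery=60_000)
print(f"LGD = {lgd:.1%}")  # prints: LGD = 14.3%

# Hypothetical variant: proceeds arrive 1 year after default,
# 5% discount rate, $2,000 of legal costs (all assumed values)
lgd_discounted = loss_given_default(
    ead=70_000,
    pv_recovery=present_value(60_000, annual_rate=0.05, years=1),
    pv_cost=present_value(2_000, annual_rate=0.05, years=1),
)
print(f"LGD with discounting and costs = {lgd_discounted:.1%}")
```

Discounting and costs both lower the net recovery, so the second LGD comes out higher than the simple 14.3%.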

Expected Loss

Expected Loss is calculated as (PD * LGD * EAD).

  • Probability of Default : 2%
  • Exposure at Default : $20,000
  • Loss Given Default : 20%
  • Expected Loss : $80
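
A minimal sketch of the expected loss formula in Python follows. The first loan uses the figures from the example; the second loan is a made-up exposure added only to show how loan-level expected losses add up to a portfolio figure.

```python
def expected_loss(pd, lgd, ead):
    """Expected loss = PD * LGD * EAD."""
    return pd * lgd * ead


loans = [
    {"pd": 0.02, "lgd": 0.20, "ead": 20_000},  # example from the text
    {"pd": 0.05, "lgd": 0.45, "ead": 10_000},  # hypothetical second loan
]

el_first = expected_loss(**loans[0])
portfolio_el = sum(expected_loss(**loan) for loan in loans)

print(f"Expected loss, first loan = ${el_first:,.2f}")   # prints: $80.00
print(f"Expected loss, portfolio  = ${portfolio_el:,.2f}")  # prints: $305.00
```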

Foundation and Advanced IRB Approach

There are two types of Internal Rating Based (IRB) approaches which are Foundation IRB and Advanced IRB.

Foundation IRB
PD is estimated internally by the bank while LGD and EAD are prescribed by regulator.

Advanced IRB
PD, LGD, and EAD can be estimated internally by the bank itself.

Effective Maturity (M)

It is the maturity assumed for the exposure. For Foundation IRB, a fixed effective maturity of 2.5 years is used, which reflects standard bank practice (the exception is repo-style transactions, where it is 6 months). For Advanced IRB, M is the greater of 1 year or the effective maturity of the specific instrument.

Basel III

The Basel III accord was scheduled to be implemented from March 2019. In view of the coronavirus pandemic, implementation was postponed to January 1, 2023.

Basel III incorporates several risk measures to counter issues that were identified and highlighted in the 2008 financial crisis. It emphasizes revised capital standards (such as leverage ratios), stress testing and tangible equity capital, which is the component with the greatest loss-absorbing capacity.

The concept of building internal models and using external ratings for estimating PD, LGD and EAD remains the same as in Basel II. However, Basel III changes the minimum capital requirements, expressed as a share of risk-weighted assets (RWA):

  • Common Tier 1 capital ratio (shareholders’ equity + retained earnings) : 2% * RWA under Basel II, 4.5% * RWA under Basel III
  • Tier 1 capital ratio : 4% * RWA under Basel II, 6% * RWA under Basel III
  • Tier 2 capital ratio : 4% * RWA under Basel II, 2% * RWA under Basel III
  • Capital conservation buffer (common equity) : none under Basel II, 2.5% * RWA under Basel III

Does Basel IV exist?

The Basel Committee introduced "Basel III: Finalizing post-crisis reforms" in 2017, an extension of Basel III. In the US, it's termed Basel III Endgame. In the UK, it is called Basel 3.1 and some refer to it as Basel IV. But officially there are only 3 Basel Accords and it is being considered as a part of Basel III only.

The EU regulatory authority has set January 2025 as the implementation date, while both the UK and US regulatory authorities aim to implement the changes by July 2025.


IFRS 9

IFRS 9 is an International Financial Reporting Standard dealing with accounting for financial instruments. It replaces IAS 39 Financial Instruments, which was based on the incurred loss model, whereas IFRS 9 uses the expected loss model, which also covers future losses.

In IFRS 9, the idea is to recognize a 12-month loss allowance at initial recognition and a lifetime loss allowance when there is a significant increase in credit risk.

IFRS 9 vs Basel III

Probability of Default Modeling

In this section, we cover the steps and methods related to PD modeling.

Define Dependent Variable

It is a binary variable with values 1 and 0, where 1 refers to bad customers and 0 refers to good customers.

Bad Customers: Customers who defaulted on payment. 'Default' means that one or more of the following scenarios has taken place.

  • Payment is more than 90 days past due. In some countries, it is 120 or 180 days.
  • Borrower has filed for bankruptcy
  • Loan is partially or fully written off

Indeterminates or rollovers: These customers fall into one of these 2 categories :

  • Payment was 30 to 60 days past due but was paid after that. They are regular late payers.
  • Inactive accounts

All the other customers are good customers. Indeterminates should not be included in model development, as they would reduce the model's ability to discriminate between good and bad. It is important to note that we do include these customers at the time of scoring.

We consider 12 months as the performance window to flag defaults, which means that a customer who defaults at any time in the next 12 months is flagged as 'Bad'.
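
The definitions above can be sketched as a small labelling routine. The function name, the field names and the exact cut-offs (90 or more days for bad, 30 to 89 days for indeterminate) are assumptions for illustration; each institution sets its own thresholds.

```python
def classify_account(max_days_past_due, bankrupt=False, written_off=False, inactive=False):
    """Label an account using its worst status over the 12-month performance window."""
    if max_days_past_due >= 90 or bankrupt or written_off:
        return "bad"
    if inactive or max_days_past_due >= 30:
        return "indeterminate"
    return "good"


# Hypothetical accounts observed over the performance window
accounts = [
    {"id": "A1", "max_days_past_due": 0},
    {"id": "A2", "max_days_past_due": 45},
    {"id": "A3", "max_days_past_due": 120},
    {"id": "A4", "max_days_past_due": 10, "bankrupt": True},
    {"id": "A5", "max_days_past_due": 0, "inactive": True},
]

development_sample = []
for acct in accounts:
    flags = {key: value for key, value in acct.items() if key != "id"}
    label = classify_account(**flags)
    if label == "indeterminate":
        continue  # left out of model development, but still scored later
    development_sample.append((acct["id"], 1 if label == "bad" else 0))

print(development_sample)  # prints: [('A1', 0), ('A3', 1), ('A4', 1)]
```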

Methodologies for Estimating PD

There are two main methodologies for estimating Probability of Default.

  1. Judgmental Method
  2. Statistical Method

Judgmental Method

It relies on the knowledge of experienced credit professionals. It is generally based on the five Cs of the applicant and loan.

  • Character : Check the credit history of the borrower. If there is no credit history, the bank can ask for referees whom it can contact to learn about the reputation of the borrower.
  • Capital : Calculate the difference between the borrower’s assets (e.g., car, house, etc.) and liabilities (e.g., renting expenses, etc.)
  • Collateral : Value of the collateral (security) provided in case the borrower fails to repay
  • Capacity : Assess the borrower’s ability to pay the principal plus interest by checking job status, income etc.
  • Conditions : Internal and external factors (e.g. economic recession, war, natural calamities etc.)

Judgmental methods have largely given way to statistical methods, which are more popular these days. But they are still widely used when historical data is not available (especially for new credit products).

Statistical Method

In today's world, nobody has time to wait 1-2 months to learn the status of a loan. Many borrowers also apply for loans through the bank's website. Hence banks need to make real-time credit decisions to remain competitive in the digital world. The advantage of a statistical method is that it produces a mathematical equation, which gives an automated and faster way of making credit decisions.

This method is also less exposed to bias and to dishonest or fraudulent conduct by a loan approval officer or manager.

It also comes with higher accuracy, as statistical and machine learning models consider hundreds of data points to identify defaulters.

Data Sources for PD Modeling

  • Demographic Data : Applicant's age, income, employment status, marital status, no. of years at current address, no. of years at job, postal code
  • Existing Relationship : Tenure, number of products, payment performance, previous claims
  • Credit Bureau Variables : Default or Delinquency history, Bureau score, Amount of credits, Inquiries etc.

Steps of PD Modeling

  • Data Preparation
  • Variable Selection
  • Model Development
  • Model Validation
  • Calibration
  • Independent Validation
  • Supervisory Approval
  • Model Implementation : Roll out to users
  • Periodic Monitoring
  • Post Implementation Validation : Backtesting and Benchmarking
  • Model Refinement (if any issue)

Statistical Techniques used for Model Development

  • Logistic Regression is the most widely used technique for estimating PD
  • Survival Analysis is generally used to compute lifetime PD (required for IFRS 9)
  • Random Forest
  • Gradient Boosting
  • Markov chain Modeling
  • Neural Network

Model Performance in PD Model

There are 2 main levels of performance testing -

  1. Discrimination : Ability to differentiate between good (non-defaulters) and bad (defaulters) customers
  2. Calibration : Check whether the actual default rate is close to predicted PD values

Statistical Tests for Model Performance

  • Discrimination : Area under Curve, Gini coefficient, KS Statistic
  • Calibration : Hosmer and Lemeshow Test, Binomial Test
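
To make the discrimination measures concrete, here is a small pure-Python sketch on made-up data. It computes AUC by comparing every bad/good pair, derives Gini as 2 * AUC - 1, and takes KS as the largest gap between the cumulative distributions of goods and bads. The toy labels and predicted PDs are invented for illustration, and the pairwise approach is only practical for small samples.

```python
def auc(y_true, pd_hat):
    """Share of (bad, good) pairs where the bad has the higher predicted PD (ties count half)."""
    bads = [p for y, p in zip(y_true, pd_hat) if y == 1]
    goods = [p for y, p in zip(y_true, pd_hat) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bads for g in goods)
    return wins / (len(bads) * len(goods))


def ks_statistic(y_true, pd_hat):
    """Maximum gap between cumulative shares of goods and bads at or below each PD value."""
    n_bad = sum(y_true)
    n_good = len(y_true) - n_bad
    ks = 0.0
    for threshold in sorted(set(pd_hat)):
        cum_bad = sum(1 for y, p in zip(y_true, pd_hat) if y == 1 and p <= threshold) / n_bad
        cum_good = sum(1 for y, p in zip(y_true, pd_hat) if y == 0 and p <= threshold) / n_good
        ks = max(ks, abs(cum_good - cum_bad))
    return ks


# Toy data: 1 = bad (defaulter), 0 = good
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
pd_hat = [0.02, 0.05, 0.08, 0.10, 0.12, 0.20, 0.30, 0.45]

auc_value = auc(y_true, pd_hat)
gini = 2 * auc_value - 1
ks = ks_statistic(y_true, pd_hat)
print(f"AUC = {auc_value:.3f}, Gini = {gini:.3f}, KS = {ks:.3f}")
# prints: AUC = 0.933, Gini = 0.867, KS = 0.800
```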


Rating Philosophy

It refers to the time horizon over which ratings measure credit risk and how much they are influenced by cyclical effects.

Point in time (PIT) PD

  • It evaluates the chances of default at that point in time. It considers both current macro-economic factors and risk attributes of borrower.
  • Since it captures current macro-economic factors, PIT PD moves up as macro-economic conditions deteriorate and moves down as they improve.
  • It focuses on reporting date
  • IFRS 9 requires PDs to be Point in time

Through the cycle (TTC) PD

  • It predicts average default rate over an economic cycle and ignores short run changes to a customer's PD and closely resembles long-term average default rate.
  • Grade assigned is not dependent on current macro-economic factors
  • It focuses on long-run average PD
  • Basel III requires PDs to be Through the cycle

In general, a hybrid model (considering both PIT and TTC) is used.

Credit Scoring and Scorecard

A Probability of Default model is used to score each customer to assess his/her likelihood of default. When you go to a bank for a loan, they check your credit score. This credit score can be built internally by the bank, or the bank can use the score of a credit bureau.

Credit bureaus collect individuals' credit information from various banks and sell it in the form of a credit report. They also release credit scores. In the US, the FICO score is a very popular credit score ranging between 300 and 850. In India, the CIBIL score is used for the same purpose and lies between 300 and 900.

Types of Scorecards

1. Application Scorecard : It applies to new (first time) customers applying for a loan or credit card. It estimates the probability of default at the time the applicant applies for the loan. See the example below for how it works.


Suppose the cutoff for granting a loan = 350

Profile of a New Customer

  • Age : 30
  • Gender : Male
  • Salary : 15000

Total Points = (100 + 85 + 120) = 305
Decision : Refuse Loan
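
The scoring logic in this example can be sketched as a lookup of points per attribute followed by a comparison with the cutoff. Only the cutoff (350) and the three point values for this applicant (100, 85 and 120) come from the example; assigning them to age, gender and salary in that order, as well as every other bin boundary and point value below, is an assumption made for illustration.

```python
CUTOFF = 350

# (exclusive upper bound, points); bins other than the applicant's are hypothetical
AGE_BINS = [(25, 70), (40, 100), (float("inf"), 130)]
SALARY_BINS = [(10_000, 90), (20_000, 120), (float("inf"), 150)]
GENDER_POINTS = {"Male": 85, "Female": 95}  # "Female" value is hypothetical


def points_from_bins(value, bins):
    """Return the points of the first bin whose upper bound exceeds the value."""
    for upper_bound, points in bins:
        if value < upper_bound:
            return points


def score_applicant(age, gender, salary):
    """Add up the points of each attribute to get the total score."""
    return (
        points_from_bins(age, AGE_BINS)
        + GENDER_POINTS[gender]
        + points_from_bins(salary, SALARY_BINS)
    )


total = score_applicant(age=30, gender="Male", salary=15_000)
decision = "Grant Loan" if total >= CUTOFF else "Refuse Loan"
print(total, decision)  # prints: 305 Refuse Loan
```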

Data required for application scorecard

We use the customer's application or demographic data along with credit bureau data. There is no observation window for historical data as these are new customers. The definition of Bad is the same, i.e. 90+ days past due. The performance window is generally 12 to 24 months from account opening.

An application scorecard is mainly used for the following tasks:

  • To determine whether or not to approve a customer for a loan.
  • To assist in 'due diligence'. For example, an applicant scoring very high can be approved outright, and one scoring very low can be declined outright, without asking for further information.

2. Behavior Scorecard : It applies to existing customers to assess whether a customer will default on loan payments. The performance window is generally 6 to 18 months.

A behavior scorecard is mainly used for the following tasks:

  • To set credit limit i.e. increase or decrease credit limit
  • Debt provisioning and profit scoring.
  • Renewals

Difference between Application and Behavior Scorecard

An application scorecard is applied to new customers (generally with less than 1 year on the books), whereas a behavior scorecard is applied to existing customers (more than 1 year). For an application scorecard, we don't require well-calibrated default probabilities, but calibrated default probabilities are required for a behavior scorecard as per Basel norms. The two scorecards also differ in how they are used, as explained in their respective sections above.

Collections Scoring

It predicts the probability that a loan already late for a given number of days will be late for another given number of days. These models are typically built for performance windows of one month.

Desertion Scoring

It predicts the probability a borrower will apply for a new loan once the current loan is paid off.

Important Terminologies related to Credit Risk

Stressed PD vs. Unstressed PD

Stressed PD: A stressed PD depends on the risk attributes of the borrower but is not highly affected by macroeconomic factors, as adverse economic conditions are already factored into it.

Unstressed PD: An unstressed PD depends on both current macroeconomic conditions and the risk attributes of the borrower. It moves up or down depending on the economic conditions.

Downturn LGD and EAD

Under Basel II and III, financial institutions need to estimate downturn LGD and EAD, where 'downturn' means adverse economic conditions. This is required because LGD and EAD can be affected by downturn conditions. To do so, we select the month with the highest default rate, take a window of two consecutive quarters (6 months) on both sides of that point as the downturn period, and then take the maximum EAD and LGD within that period as the downturn estimates.

Conditional PD

It is the probability of default during the second year given that the borrower does not default during the first year. To calculate the conditional PD, we need the probability of not defaulting by the end of year 1 (P0) and the unconditional probability of defaulting during the second year (P1).

If P0 = 0.5 and P1 = 0.1, then the conditional PD, i.e. Prob(default | survival), is 0.1/0.5 = 20%.
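
A short sketch of this calculation follows. The function and argument names are made up for illustration; the numbers are the P0 and P1 values from the example.

```python
def conditional_pd(survival_prob, unconditional_pd):
    """PD in a period given survival up to its start: P1 / P0."""
    if not 0 < survival_prob <= 1:
        raise ValueError("survival probability must be in (0, 1]")
    return unconditional_pd / survival_prob


cond_pd = conditional_pd(survival_prob=0.5, unconditional_pd=0.1)
print(f"Conditional PD = {cond_pd:.0%}")  # prints: Conditional PD = 20%
```

Read the other way round, the unconditional second-year PD is the survival probability times the conditional PD (0.5 * 20% = 10%).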

Lifetime PD vs 12 month PD

As per IFRS 9, we require two types of PDs for calculating expected credit losses (ECL).

  • 12-month PDs for stage 1 assets - Chances of default within the next 12 months
  • Lifetime PDs for stage 2 and 3 assets - Chances of default over the remaining life of the financial instrument.

Suppose the 12-month PD is 3%, which means the survival rate is 97% (1 - PD). The 2nd and 3rd year conditional PDs are 4% and 5%.

  1. 1st year cumulative survival rate (CSR) is same as first year survival rate (SR).
  2. 2nd year cumulative survival rate = 1st year CSR * SR of 2nd year = 97% * 96% = 93%
  3. 3rd year cumulative survival rate = 2nd year CSR * SR of 3rd year = 93% * 95% = 88%. Lifetime PD = 1 - 88% = 12%
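
The three steps above can be sketched as a running product of yearly survival rates. The function name is illustrative. Note that the text rounds at each step (93%, 88%, 12%), while the code keeps full precision, so the unrounded lifetime PD comes out at about 11.5%.

```python
def cumulative_survival(conditional_pds):
    """Return the cumulative survival rate at the end of each year."""
    rates = []
    survival = 1.0
    for pd in conditional_pds:
        survival *= 1 - pd  # survive this year given survival so far
        rates.append(survival)
    return rates


# 12-month PD of 3%, then conditional PDs of 4% and 5% (from the example)
csr = cumulative_survival([0.03, 0.04, 0.05])
lifetime_pd = 1 - csr[-1]

for year, rate in enumerate(csr, start=1):
    print(f"Year {year}: cumulative survival = {rate:.2%}")
print(f"Lifetime PD = {lifetime_pd:.2%}")
# prints:
# Year 1: cumulative survival = 97.00%
# Year 2: cumulative survival = 93.12%
# Year 3: cumulative survival = 88.46%
# Lifetime PD = 11.54%
```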

Macroeconomic factors to consider to estimate ECL

Estimating Expected Credit Loss (ECL) is crucial for banks and other financial institutions to manage the risk of lending money. To do this well, they must think about different macroeconomic factors that can affect how likely people are to repay their loans. Here are some important macroeconomic factors to consider when estimating ECL:

  • GDP: The overall economic growth of a country affects borrowers' ability to repay loans and influences credit risk.
  • Unemployment Rate: High unemployment rates can lead to reduced income and higher credit default rates, impacting credit quality.
  • Index of Industrial Production: The performance of industries can impact the creditworthiness of borrowers in specific sectors.
  • Import and Export: Global economic conditions and trade trends influence businesses' performance, affecting credit risk.
  • Interest Rate: Changes in interest rates can affect borrowers' ability to service their debt, impacting credit losses.
  • Inflation Rate: High inflation rates can weaken borrowers' purchasing power and lead to higher credit risk.
  • House Price Index: Real estate market conditions can affect the credit quality of loans related to property.
  • Exchange Rate: For institutions dealing with multiple currencies, exchange rate fluctuations can influence credit risk.

Stress Testing

In simple terms, stress testing is like giving a financial institution (such as a bank) a really tough test to see if it can handle difficult situations. Instead of just looking at regular situations, stress tests make them imagine extreme and rare problems, like a big economic crisis or unexpected disasters. By doing this, we can figure out how strong and prepared the institution is to handle these tough times and make sure it can stay stable even in the worst-case scenarios. For example, how a 5% increase in the unemployment rate affects the performance of a bank.

Types of Stress Testing

There are three types of stress testing.

  1. Scenario Analysis : Banks use scenario analysis to imagine different future situations and see how they might affect their financial health. It helps them prepare for risks and make better decisions.
  2. Reverse Stress Testing : In reverse stress testing, banks start with a negative outcome and figure out what could cause it. It helps them identify vulnerabilities and improve risk management.
  3. Sensitivity Analysis : Sensitivity analysis involves testing different factors to see how they impact the bank's performance. It helps banks understand their exposure to risks and adjust their strategies accordingly.
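
As a very simplified sketch of sensitivity analysis, the snippet below shocks a single factor (PD) and recomputes the expected loss while holding LGD and EAD fixed. The base figures reuse the expected loss example from earlier in this article; the multipliers are hypothetical shocks, not regulatory scenarios, and a real stress test would link them to macroeconomic variables.

```python
def expected_loss(pd, lgd, ead):
    """Expected loss = PD * LGD * EAD."""
    return pd * lgd * ead


base = {"pd": 0.02, "lgd": 0.20, "ead": 20_000}
pd_multipliers = [1.0, 1.5, 2.0]  # hypothetical shocks applied to PD

results = {}
for multiplier in pd_multipliers:
    stressed_pd = min(base["pd"] * multiplier, 1.0)  # a probability cannot exceed 100%
    results[multiplier] = expected_loss(stressed_pd, base["lgd"], base["ead"])
    print(f"PD x{multiplier:.1f}: expected loss = ${results[multiplier]:,.2f}")
# prints:
# PD x1.0: expected loss = $80.00
# PD x1.5: expected loss = $120.00
# PD x2.0: expected loss = $160.00
```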

Software used in risk analytics

Let's split this section into two parts -

1. Data Extraction
Most of the data is stored in relational databases (SQL Server, Teradata). Analysts need expert-level knowledge of SQL to extract and manipulate data. Data is not saved in a single SQL table or database, so to extract the relevant fields you need to select from multiple tables and join them on matching key(s). During this process, you also need to apply business rules (such as excluding certain types of customers or accounts). Transaction tables generally sit in a mainframe environment, so basic knowledge of mainframe and UNIX helps. However, mainframe and UNIX are not primary skill sets that banks look for in a risk analyst (they are good to have); developers are generally hired for this work.
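
The join-and-filter pattern described above can be illustrated with Python's built-in sqlite3 module and an in-memory database. The table names, column names and the "exclude staff accounts" rule are hypothetical; a bank's actual schema and business rules will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, age INTEGER, is_staff INTEGER);
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, customer_id INTEGER,
                       balance REAL, days_past_due INTEGER);
INSERT INTO customers VALUES (1, 34, 0), (2, 51, 1), (3, 27, 0);
INSERT INTO accounts VALUES (10, 1, 1200.0, 0), (11, 2, 560.0, 35), (12, 3, 8900.0, 95);
""")

# Join accounts to customers on the matching key and apply a business rule
# (here: exclude staff accounts, a made-up rule for illustration)
rows = conn.execute("""
    SELECT a.account_id, c.age, a.balance, a.days_past_due
    FROM accounts AS a
    JOIN customers AS c ON c.customer_id = a.customer_id
    WHERE c.is_staff = 0
    ORDER BY a.account_id
""").fetchall()
conn.close()

print(rows)  # prints: [(10, 34, 1200.0, 0), (12, 27, 8900.0, 95)]
```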

2. Model Building
SAS is the most widely used software in risk analytics. Despite the huge popularity of R and Python these days, more than 90% of banks and other financial institutions still use SAS. Banks have also started exploring R and Python, and are building (or have already built) syntax libraries (repositories) in R and Python for credit risk projects.

SAS can be easily integrated with relational databases and mainframes. Many companies execute both the data extraction and model building steps in the SAS environment alone.

End Note

Hope you now have a fair idea of how predictive modeling is used in the credit risk domain and what the key credit risk parameters are. In risk analytics, domain knowledge is more important than technical or statistical knowledge. Hope this article helped you fill that gap. Please provide your feedback in the comment box below.


Is credit risk modelling hard?

Credit risk models need to be regularly validated and backtested to ensure they accurately predict credit risk. However, this can be challenging in practice, particularly when the models are based on complex credit risk algorithms or data sources.

How does a PD model work?

The primary objective of credit risk modelling is to estimate two critical parameters: the probability of default (PD) and the potential loss given default (LGD). PD represents the likelihood of a borrower defaulting within a specific timeframe, while LGD measures the expected loss if a default occurs.

What are the 5 C's of credit?

Each lender has its own method for analyzing a borrower's creditworthiness. Most lenders use the five Cs—character, capacity, capital, collateral, and conditions—when analyzing individual or business credit applications.

Is credit risk analyst a stressful job?

Being a credit analyst can be a stressful job. You often must decide whether a person or a company can make a purchase, and at what interest rate, which is a significant responsibility.

What is the hardest financial model?

Leveraged Buyout (LBO) Model

An LBO is often one of the most detailed and challenging of all types of financial models, as the many layers of financing create circular references and require cash flow waterfalls.

What is the formula for PD to score?

PD = (A – B * Score)^C

In this formula: PD: Probability of Default. Score: The credit score assigned to the borrower based on their characteristics. A, B, and C: Parameters estimated during the model calibration process.

How to build credit risk models?

Building credit risk models typically entails four steps: gathering and preprocessing data, modelling of probability of default (PD), Loss Given Default (LGD) and Exposure at Default (EAD), evaluating the credit risk models built and then the deployment step to put them into production.

How is PD calculated?

Default probabilities may be estimated from a historical data base of actual defaults using modern techniques like logistic regression. Default probabilities may also be estimated from the observable prices of credit default swaps, bonds, and options on common stock.

What habit lowers your credit score?

Making late payments, even a single day late, can significantly affect your credit. This becomes especially true if you make a habit of paying late. Some lenders or credit card companies will charge you a fee for being a single day late and could cut you off from making further purchases on the account.

What is a good credit score?

Although ranges vary depending on the credit scoring model, generally credit scores from 580 to 669 are considered fair; 670 to 739 are considered good; 740 to 799 are considered very good; and 800 and up are considered excellent.

What is a FICO credit score?

FICO credit scores are a method of quantifying and evaluating an individual's creditworthiness. FICO scores are used in 90% of mortgage application decisions in the United States. Scores range from 300 to 850, with scores in the 670 to 739 range considered to be “good” credit scores.

What is the credit triangle formula?

You can use the "credit triangle" which states that the (annualised) credit spread S equals the annualised probability of default p times the loss given default LGD which equals par minus the expected recovery amount R, i.e. S=p(1−R).

What is PD in credit risk?

Default probability, or probability of default (PD), is the likelihood that a borrower will fail to pay back a certain debt. For businesses, probability of default is reflected in the company's credit ratings. For individuals, a credit score is one gauge of default risk.

What is EAD in credit risk?

Exposure at default (EAD) is the total value a bank is exposed to when a loan defaults. Using the internal ratings-based (IRB) approach, financial institutions calculate their risk.

Is a financial modelling course difficult?

The field of finance can be complicated to understand, and financial modeling is considered one of the most challenging tasks in this field.

What are the disadvantages of credit risk modelling?

Credit risk modeling faces several challenges and limitations, including: Data quality and availability: The accuracy and completeness of the data used in the models are crucial for their reliability. Inadequate or inconsistent data can lead to incorrect predictions and misinformed credit decisions.

How to become a credit risk modeller?

Steps to Become a Credit Risk Modeller

Earn a Bachelor's Degree: Start with a degree in Finance, Statistics, Mathematics, or a related field. This will provide the foundational knowledge necessary for the role. Gain Practical Experience: Start in roles such as a Credit Analyst or Risk Analyst.

How hard is it to become a risk analyst?

Risk analysts typically hold bachelor's degrees in finance, economics, accounting, business or mathematics. Some pursue graduate study, and many earn CRA or CFA certifications. Along with formal qualifications, these professionals need good numeracy and strong communication, analysis and decision-making skills.

Article information

Author: Ray Christiansen
