The variables that affect the interest rate of a loan
The interest rate of a loan is a major concern to both the borrower and the lender. The interest charged on loans is a source of revenue to the lender. Therefore, the lender will prefer to maximize revenue. On the other hand, the amount of interest charged is a cost to the borrower. Thus, the interest rate on a loan is a key variable to both parties. On the side of the lender, the major concern is how to estimate the rate of interest for each loan application. The other concerns are what variables to include during estimation and how to identify the variables.
Objective of the study
The paper seeks to carry out a study to establish the factors that have an association with the interest rate of a loan and thus can be used to determine the rate of interest. Several statistical tools will be used in the study. Some of the tools are descriptive statistics, correlation analysis, regression analysis and evaluation of the regression equation.
Statement of the problem
It is perceived that the interest rate of a loan depends heavily on the risk level of the applicant. The level of risk is commonly measured by the applicant's income and creditworthiness. Several measures are used to gauge creditworthiness, the most common being the FICO score and the credit report. Thus, an applicant with a favorable periodic income and a satisfactory credit rating carries less risk and is likely to pay a lower interest rate.
Hypothesis
- Null hypothesis (H0): The interest rate of the loan depends on the level of risk of the individual.
- Alternative hypothesis (H1): The interest rate of the loan does not depend on the level of risk of the individual.
Data collection
The analysis is based on secondary data obtained from the website of Lending Club (1). The site provides details of loan applicants such as the loan amount applied for, the funded amount, the term of the loan, the annual income of the individual, the number of inquiries in the last six months, and the average FICO score. The data was downloaded on 14th October 2013. It contains seven variables (the interest rate and the six variables listed above) for 30 people selected randomly.
(Source of the data – Lending Club 1)
Analysis and results
Descriptive statistics
The table presented below shows a summary of descriptive statistics for the various variables.
The table shows that annual income, loan amount and funded amount are widely dispersed around their means. The median income in the data is $47,750 while the median loan amount is $7,000. The modal interest rate is 11.86%. Finally, the modal FICO score is 737.
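The summary measures discussed above (mean, median, mode and standard deviation) can be computed as in the minimal sketch below. The interest rates in `rates` are hypothetical values chosen for illustration and are not the Lending Club sample; the helper name `describe` is also an assumption.

```python
import statistics

# Hypothetical interest rates (in percent) for five applicants.
# These are illustrative values, not the Lending Club sample.
rates = [11.86, 7.90, 11.86, 15.31, 9.76]


def describe(values):
    """Return the mean, median, mode and sample standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) standard deviation
    }


summary = describe(rates)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A large standard deviation relative to the mean is what the text describes as wide dispersion around the mean.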
Correlation coefficient
The correlation coefficient measures the strength and direction of the linear association between two variables (Verbeek 20). The results indicate that the interest rate has a weak relationship with most of the other variables. There is a weak positive relationship between the interest rate and the loan amount (1.44%), the term of the loan (17.93%) and the number of inquiries (14.48%). Further, there is a weak negative relationship between the interest rate and both the funded amount (10.73%) and annual income (27.24%). Finally, there is a strong negative relationship between the interest rate and the FICO score (85.24%).
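As a rough illustration of how such a coefficient is obtained, the sketch below computes Pearson's correlation coefficient from first principles. The function name `pearson_r` and the FICO scores and interest rates are hypothetical; they are not the study's data and are chosen only to show a strong negative association.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: higher FICO scores paired with lower interest rates.
fico = [660, 690, 720, 750, 780]
rate = [0.17, 0.14, 0.12, 0.09, 0.07]

r = pearson_r(fico, rate)
print(f"r = {r:.4f}")
```

A value of r near -1 or +1 indicates a strong linear relationship, while a value near 0 indicates a weak one.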
Multiple linear regression model
In the analysis, the dependent variable is the interest rate while the independent variables are the loan amount applied for, the funded amount, the term of the loan, the annual income of the individual, the number of inquiries in the last six months, and the average FICO score. The regression equation takes the form shown below.
Y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + b5X5 + b6X6
Y = Interest rate
X1 = Loan amount
X2 = Funded amount
X3 = Term of the loan
X4 = Annual income
X5 = Number of inquiries in the past 6 months
X6 = FICO score
The theoretical expectations are that b0 can take any value, that b1, b3 and b5 are positive (> 0), and that b2, b4 and b6 are negative (< 0). From the regression results, the estimated regression line is Y = 0.636566224 + 2.06984E-06X1 – 2.37562E-06X2 + 0.000642208X3 – 5.64485E-09X4 + 0.003503609X5 – 0.000765147X6. The signs of all the estimated coefficients agree with the theoretical expectations. The intercept of 0.6366 is the interest rate the model predicts when all the explanatory variables equal zero; it has no practical interpretation on its own because no applicant has a FICO score of zero. A positive coefficient implies that if the related variable increases by one unit, with the other variables held constant, the interest rate increases by the value of the coefficient. A negative coefficient implies that if the related variable increases by one unit, with the other variables held constant, the interest rate decreases by the absolute value of the coefficient.
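To make the interpretation of the coefficients concrete, the sketch below plugs an applicant's details into the estimated regression line. The coefficients are those reported above. The applicant is hypothetical, and the units are assumptions not stated in the study: the term is taken to be in months, and the interest rate is taken to be expressed as a fraction (0.10 = 10%).

```python
# Coefficients as reported in the estimated regression line.
INTERCEPT = 0.636566224
COEFFICIENTS = {
    "loan_amount": 2.06984e-06,     # X1
    "funded_amount": -2.37562e-06,  # X2
    "term": 0.000642208,            # X3 (assumed to be in months)
    "annual_income": -5.64485e-09,  # X4
    "inquiries": 0.003503609,       # X5
    "fico": -0.000765147,           # X6
}


def predict_rate(applicant):
    """Predicted interest rate (assumed to be a fraction) for one applicant."""
    return INTERCEPT + sum(
        coef * applicant[name] for name, coef in COEFFICIENTS.items()
    )


# Hypothetical applicant, for illustration only.
applicant = {
    "loan_amount": 7000,
    "funded_amount": 7000,
    "term": 36,
    "annual_income": 47750,
    "inquiries": 1,
    "fico": 737,
}

rate = predict_rate(applicant)
print(f"Predicted interest rate: {rate:.2%}")

# Effect of a one-point increase in the FICO score, other variables held constant.
higher_fico = dict(applicant, fico=applicant["fico"] + 1)
change = predict_rate(higher_fico) - rate
print(f"Change per FICO point: {change:.6f}")
```

The second calculation shows the one-unit interpretation directly: raising the FICO score by one point changes the predicted rate by the FICO coefficient.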
Evaluation of regression model
t-test
A t-test is used to evaluate the statistical significance of each explanatory variable (Vinod 33). A two-tailed t-test is carried out at the 5% significance level.
Null hypothesis: H0: bi = 0
Alternative hypothesis: H1: bi ≠ 0
The null hypothesis implies that the variable is not a significant determinant of the interest rate. The alternative hypothesis implies that the variable is a significant determinant of the interest rate.
Test of significance for the linear model
The table presented below summarizes the results of the t-tests.
Based on the results above, the calculated t-value exceeds the tabulated t-value in absolute terms for two variables (term of the loan and FICO score). This implies that these two variables are statistically significant determinants of the interest rate at the 5% significance level. The rest of the variables are not statistically significant since their calculated t-values are smaller in absolute terms than the tabulated value. Therefore, for those variables the null hypothesis (bi = 0) will not be rejected at the 5% significance level.
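The decision rule used in the t-tests can be sketched as follows. The coefficient and standard error pairs are hypothetical, since the study's standard errors are not reproduced here. The degrees of freedom (30 observations minus 6 explanatory variables minus the intercept, giving 23) and the critical value of 2.069 read from a standard t-table are assumptions about how the tabulated value was obtained.

```python
N_OBS = 30
N_PREDICTORS = 6
DEGREES_OF_FREEDOM = N_OBS - N_PREDICTORS - 1  # 23, assuming an intercept

# Two-tailed 5% critical value for 23 degrees of freedom,
# taken from a standard t-table (an assumption, not from the study).
T_CRITICAL = 2.069


def t_statistic(coefficient, standard_error):
    """Calculated t-value for the null hypothesis that the coefficient is 0."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / standard_error


def is_significant(coefficient, standard_error, t_critical=T_CRITICAL):
    """Reject H0: bi = 0 when |t| exceeds the tabulated value."""
    return abs(t_statistic(coefficient, standard_error)) > t_critical


# Hypothetical (coefficient, standard error) pairs, for illustration only.
examples = {
    "variable_a": (-0.0008, 0.0001),  # t = -8.0
    "variable_b": (0.002, 0.004),     # t = 0.5
}

for name, (coef, se) in examples.items():
    verdict = "significant" if is_significant(coef, se) else "not significant"
    print(f"{name}: t = {t_statistic(coef, se):.2f}, {verdict}")
```

The comparison uses the absolute value of t because the test is two-tailed, so a large negative t-value is as strong evidence against the null hypothesis as a large positive one.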
F-test of the regression model
The overall significance of the regression model can be evaluated using an F-test (Verbeek 20). The test will be carried out at the 5% significance level.
- Null hypothesis H0: b1 = b2 = b3 = b4 = b5 = b6 = 0
- Alternative hypothesis H1: bj ≠ 0, for at least one value of j
The null hypothesis implies that the overall regression line is not significant. The alternative hypothesis implies that the overall regression line is significant. The computed F-value (16.26047) is greater than the tabulated F-value (2.5277). Thus, the null hypothesis will be rejected. This implies that the overall linear regression line is significant and can be used in further analysis and predictions.
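As a consistency check, the F-statistic of a linear regression with an intercept can be recovered from R-square as F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below applies this identity to the study's reported R-square of 80.92%, with n = 30 observations and k = 6 explanatory variables. The function name is an assumption, and the small gap from the reported 16.26047 is expected because R-square is used here in rounded form.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F-statistic of a regression with an intercept, from R-square."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / (n_obs - n_predictors - 1)
    return explained / unexplained


F_TABULATED = 2.5277  # tabulated value reported in the study

f_value = f_statistic(r_squared=0.8092, n_obs=30, n_predictors=6)
reject_null = f_value > F_TABULATED

print(f"F computed from R-square: {f_value:.4f}")
print(f"Reject H0 at the 5% level: {reject_null}")
```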
R-square value
The value of R-square is 80.92% while the value of the adjusted R-square is 75.94%. The high values indicate a strong regression line because the explanatory variables explain 80.92% of the variation in the explained variable. Only 19.08% of the variation in the dependent variable is left unexplained by the independent variables.
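The adjusted R-square penalizes R-square for the number of explanatory variables, using adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below applies this standard formula to the reported R-square with n = 30 and k = 6; the function name is an assumption.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-square for a regression with an intercept."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


adj = adjusted_r_squared(r_squared=0.8092, n_obs=30, n_predictors=6)
unexplained = 1 - 0.8092

print(f"Adjusted R-square: {adj:.2%}")
print(f"Unexplained variation: {unexplained:.2%}")
```

The result agrees with the reported adjusted R-square of 75.94%, which suggests the reported figures are internally consistent.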
Conclusion and discussion
The study focused on establishing the factors that have an association with the interest rate of a loan. From the analysis, it is established that the overall regression line is significant. Also, the variables in the regression line explain a large percentage of the variation in the explained variable. However, it is also observed that only two variables (term of the loan and FICO score) are statistically significant in the estimation of the interest rate. Therefore, the null hypothesis of the research will not be rejected. The implication of the results for the population is that the risk level (commonly measured using the FICO score) is a significant determinant of the interest rate. The results of the sample can be extrapolated to the population. Further, to improve the study, an analyst should include several additional measures of the risk level of the individual, such as the debt-to-income ratio and the number of open credit lines. This might increase the number of statistically significant explanatory variables and the value of R-square. Finally, an example of bias that might exist in the analysis is omitted-variable bias.
Works Cited
Lending Club. Lending Club Statistics. 2013. Web.
Mankiw, Gregory. Principles of Economics. USA: Cengage Learning, 2011. Print.
Verbeek, Marno. A Guide to Modern Econometrics. England: John Wiley & Sons, 2008. Print.
Vinod, Hrishikesh. Hands on Intermediate Econometrics Using R: Templates for Extending Dozens of Practical Examples. Hackensack, NJ: World Scientific Publishers, 2008. Print.