

Conclusion
Cracking the Code
We applaud Lending Club (LC) for publishing its loan application variables so that third parties can review its lending practices, but we believe it should also share the source code of the proprietary algorithm that calculates the loan grade. Since that algorithm is not public, our team has tried to replicate the model that LC uses to calculate your loan grade and interest rate. We tested several techniques that are commonly used in the financial industry (linear regression, logistic regression, decision trees, and random forests) and trained these models to mimic the LC algorithm as closely as possible. While our models do not reproduce the LC model perfectly, we were able to identify the variables that have the biggest impact on your loan grade and interest rate. Understanding how these variables in your personal financial records relate to the interest rate you will receive can help you improve your loan grade.
Tips & Tricks!
01
Time is Money
Asking for a shorter loan term, 36 months instead of 60 months, significantly lowers your interest rate! Instead of one 60-month loan, we'd suggest two successive 36-month loans, if your circumstances can tolerate the risk that your second application might be refused.
02
Purpose Preference
The purpose you specify when you apply for a Lending Club loan is the most important variable affecting your interest rate. The nine possible answers together account for 39% of your loan grade.
Favorable Purposes
credit card, debt consolidation, home improvement, car, and major purchase
Unfavorable Purposes
small business, moving, other, and house
03
FICO Still Matters (a bit)
Although it isn't the most important driver of your interest rate, as most people might expect, your FICO score still accounts for 8.4% of your grade. Keep your FICO high!
04
Economics Anonymous
Surprisingly, our models indicated that having your income independently verified was generally associated with worse risk grades.
05
Don't Worry 'bout It
The state you live in and the length of your employment don't matter much. So that's two fewer things to worry about.
Results Elaborated
Using a linear regression model with Lasso regularization to mimic Lending Club's proprietary model, we found that only 41 of the 92 variables in our model contribute to your loan grade at all, and only 21 have coefficients that account for more than 1% of the loan grade. These variables are listed in the table below in order of importance. Variables in the green rows are the ones for which you want to increase the value or answer 'yes' to get a lower interest rate. Variables in the red rows are the ones for which you want to decrease the value or answer 'no' to get a lower interest rate.
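As a rough illustration of how counts like "41 of 92" and "21 above 1%" can be derived from a fitted Lasso model, the sketch below turns a set of coefficients into percentage shares. It is not our actual pipeline. It assumes the inputs were standardized so that coefficient magnitudes are comparable, that a variable's share is its absolute coefficient divided by the sum of all absolute coefficients, and that the target is encoded so larger values mean a worse grade. The variable names and coefficient values are made up for the example.

```python
def coefficient_shares(coefs):
    """Return each variable's share of the total absolute coefficient mass."""
    total = sum(abs(c) for c in coefs.values())
    if total == 0:
        return {name: 0.0 for name in coefs}
    return {name: abs(c) / total for name, c in coefs.items()}


def summarize(coefs, threshold=0.01):
    """Split variables into 'active' (non-zero) and 'major' (share > threshold)."""
    shares = coefficient_shares(coefs)
    # Lasso drives many coefficients to exactly zero; those variables drop out.
    active = [name for name, share in shares.items() if share > 0]
    major = sorted(
        (name for name in active if shares[name] > threshold),
        key=lambda name: shares[name],
        reverse=True,
    )
    return shares, active, major


def row_color(coef):
    """Assumption: larger target = worse grade, so a negative coefficient helps."""
    if coef == 0:
        return "none"
    return "green" if coef < 0 else "red"


# Hypothetical coefficients, for illustration only (not our fitted values).
example_coefs = {
    "term_60_months": 0.50,
    "fico_score": -0.30,
    "purpose_small_business": 0.195,
    "annual_income": -0.005,
    "addr_state_CA": 0.0,
}

shares, active, major = summarize(example_coefs)
for name in major:
    print(f"{name}: {shares[name]:.1%} ({row_color(example_coefs[name])})")
```

With these made-up numbers, four of the five variables are active, and three of them clear the 1% threshold.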

Future Work
We've exposed the Lending Club risk rubric and given you tips and tricks to improve your application. The next step for this website will be to embed a model that predicts your risk grade and interest rate. That way you can see the likely outcome of your application before you submit it, giving you time and feedback to improve it.
While our models were good enough to reveal the components of Lending Club's risk analysis, building true predictive functionality will require further optimization. One model we would like to test is ordinal regression, specifically the OrdinalRidge model from the mord library (https://pythonhosted.org/mord/). Such a model is likely to perform better than regular logistic regression with a categorical response variable, because it preserves the ordering of the subgrade categories.
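To show why the ordering matters, here is a minimal sketch in plain Python. It does not use mord or fit any model. It assumes subgrades run from A1 to G5 and maps them to integer ranks, then compares two sets of made-up predictions. The helper names and the sample predictions are hypothetical.

```python
GRADES = "ABCDEFG"
# Assumed subgrade scale: A1, A2, ..., A5, B1, ..., G5 (35 levels).
SUBGRADES = [f"{grade}{step}" for grade in GRADES for step in range(1, 6)]
RANK = {subgrade: i for i, subgrade in enumerate(SUBGRADES)}


def misclassification_rate(y_true, y_pred):
    """Categorical view: a prediction is either exactly right or wrong."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def mean_absolute_rank_error(y_true, y_pred):
    """Ordinal view: count how many subgrade steps each prediction is off by."""
    steps = sum(abs(RANK[t] - RANK[p]) for t, p in zip(y_true, y_pred))
    return steps / len(y_true)


# Made-up labels and predictions, for illustration only.
actual = ["B2", "C4", "A1", "D3"]
near_miss = ["B3", "C3", "A2", "D2"]  # each off by one subgrade
far_miss = ["F1", "A1", "E5", "G5"]   # each off by many subgrades

for label, predicted in [("near miss", near_miss), ("far miss", far_miss)]:
    print(
        label,
        misclassification_rate(actual, predicted),
        mean_absolute_rank_error(actual, predicted),
    )
```

Both prediction sets are wrong on every loan, so a purely categorical measure scores them the same. The rank-based error separates them clearly, and an ordinal model can exploit that distinction.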