Using machine learning to predict LendingClub loan defaults

This spring I took some time to test out scikit-learn (often shortened to sklearn), a free, open-source machine learning library, recycling a project from a Data Analytics class in my MBA program.

The basic challenge: Given a bunch of loan application data from thousands of loans, can you predict which loans will default and which will not?

My code is on GitHub here:


In Excel, figuring out which columns were important and which were not was a long and painful process. Once I had a basic script in Python, I was able to use a decision tree classifier to determine that for me, and with greater accuracy. You can learn more about that here:

In 39 lines of (amateur) code you can go from some basic cleaned .csv data to visual tree graphs. It should take less than a minute to run. Powerful stuff, and what I love about coding data analytics solutions like this is that you can reuse them for many types of classification and regression problems.
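To give a feel for the shape of such a script, here is a minimal sketch. It is not the code from the repo: the column names (int_rate, dti, defaulted) and the tiny inline CSV are made up for illustration, and in practice you would point pd.read_csv at your own cleaned file instead.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Tiny inline CSV standing in for a cleaned loan file.
# The columns and values are hypothetical, not real LendingClub data.
csv_text = """int_rate,dti,defaulted
6.0,12,0
7.5,30,0
8.9,8,0
10.2,25,0
11.4,18,0
12.8,33,0
16.1,15,1
17.3,28,1
18.6,9,1
20.0,22,1
21.7,35,1
23.5,14,1
"""

# With a real file this would be pd.read_csv("your_cleaned_loans.csv").
loans = pd.read_csv(io.StringIO(csv_text))

X = loans.drop(columns="defaulted")  # features
y = loans["defaulted"]               # target: 1 = default, 0 = paid

# Hold out a quarter of the rows to check the model on unseen loans.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print("held-out accuracy:", accuracy)

# Export the fitted tree as Graphviz DOT text. Turning this into an
# image requires the separate Graphviz tool (e.g. `dot -Tpng`).
dot_source = export_graphviz(
    clf,
    out_file=None,
    feature_names=list(X.columns),
    class_names=["paid", "default"],
    filled=True,
)
print(dot_source)
```

The same skeleton carries over to other classification problems: swap the CSV and the target column and the rest stays largely the same.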

You can get the data here:

Feel free to install Python along with pandas and scikit-learn and give it a shot yourself! I am sure there are things I could improve, but it picked up on the proper drivers without any prompting from me. Scikit-learn has a lot of interesting features that I am only starting to wrap my head around.
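One way to see which columns a fitted tree leaned on is its feature_importances_ attribute. The sketch below uses synthetic data rather than the LendingClub file, with hypothetical column names and a toy rule (loans above an 18% interest rate default) baked in, so you can check that the tree recovers the driver on its own.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for cleaned loan data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
loans = pd.DataFrame({
    "int_rate": rng.uniform(5, 25, n),
    "annual_inc": rng.uniform(20_000, 150_000, n),
    "dti": rng.uniform(0, 40, n),
})

# Assumed toy rule for illustration only: high-interest loans default.
loans["defaulted"] = (loans["int_rate"] > 18).astype(int)

X = loans.drop(columns="defaulted")
y = loans["defaulted"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)

# Importances sum to 1; a higher value means the tree relied on
# that column more when splitting.
importances = pd.Series(
    clf.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

Real loan data is far noisier than this toy rule, so expect importance to be spread across several columns rather than concentrated in one.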




Author: secondhandstocks

This blog got its start when a Marine buddy and I came back from Afghanistan with more money than knowledge and heedlessly tossed our hats into the stock market ring. A few months later, I discovered the classic book The Intelligent Investor by Benjamin Graham and ravenously devoured my first introduction to value investing. That framework, with some generous additions by Seth Klarman and Joel Greenblatt, among others, guides my investment philosophy. I spent five years working in the intelligence field, first in the Marine Corps and then for a government agency. I speak Arabic and Pashto, have programming and analysis experience, and enjoy investing in technology companies as a hobby. I also spent a year on Wall Street working on a #1-ranked Institutional Investor team before deciding that the sell side was not for me.
