Using machine learning to predict LendingClub loan defaults

This spring I took some time to try out scikit-learn (sklearn), the free, open-source machine learning library, by recycling a project from a Data Analytics class in my MBA program.

The basic challenge: Given a bunch of loan application data from thousands of loans, can you predict which loans will default and which will not?
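Framed as code, the challenge is a binary classification problem: each loan gets a label of 1 if it went bad and 0 if it did not. Here is a minimal sketch of that labeling step with pandas. The tiny table, the column names (`loan_amnt`, `annual_inc`, `loan_status`) and the status strings are assumptions for illustration, not taken from my script or guaranteed to match the real download.

```python
import pandas as pd

# Hypothetical miniature loan table; column names and status strings
# are assumptions for illustration only.
loans = pd.DataFrame({
    "loan_amnt": [5000, 12000, 8000, 20000],
    "annual_inc": [40000, 85000, 52000, 61000],
    "loan_status": ["Fully Paid", "Charged Off", "Fully Paid", "Default"],
})

# Treat charged-off and defaulted loans as the positive class (1 = bad loan).
bad_statuses = {"Charged Off", "Default"}
loans["defaulted"] = loans["loan_status"].isin(bad_statuses).astype(int)

X = loans[["loan_amnt", "annual_inc"]]  # candidate predictor columns
y = loans["defaulted"]                  # binary target to predict

print(y.tolist())
```

Which statuses count as a "default" is a judgment call, so the `bad_statuses` set is the part you would most likely want to adjust.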

My code is on GitHub: https://github.com/dforrestwilson/albums/blob/master/lendingclub.py

 

In Excel, figuring out which columns were important and which were not was a long and painful process. Once I had a basic script in Python, I was able to use a decision tree classifier to determine that for me, and with greater accuracy. You can learn more about decision trees here: http://scikit-learn.org/stable/modules/tree.html
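To give a rough idea of how a decision tree can rank columns for you, here is a small sketch using scikit-learn's `DecisionTreeClassifier` and its `feature_importances_` attribute. The data is synthetic and the column names (`interest_rate`, `annual_income`, `noise`) are made up for the example; this is not my actual script, and the label is deliberately built so that only one column matters.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for loan data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "interest_rate": rng.uniform(5, 25, n),
    "annual_income": rng.uniform(20_000, 150_000, n),
    "noise": rng.normal(size=n),
})

# Assumption for the demo: default depends only on the interest rate.
defaulted = (data["interest_rate"] > 15).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(data, defaulted)

# One importance score per column, summing to 1; higher means more useful.
importances = pd.Series(
    clf.feature_importances_, index=data.columns
).sort_values(ascending=False)
print(importances)
```

On real loan data the scores will be spread across several columns rather than concentrated in one, but the ranking is read the same way.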

In 39 lines of (amateur) code you can go from some basic cleaned .csv data to visual tree graphs. It should take less than a minute to run. Powerful stuff, and what I love about coding data analytics solutions like this is that you can reuse them for many types of classification and regression problems.
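For the tree graph step, one option in scikit-learn is `export_graphviz`, which turns a fitted tree into Graphviz DOT text. The sketch below uses six invented loans and hypothetical feature and class names, so take it as an illustration of the idea rather than a copy of what my script does.

```python
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Six made-up loans: [interest_rate, annual_income]; values are invented.
X = [
    [7.5, 60000], [22.0, 35000], [9.9, 90000],
    [18.5, 42000], [6.0, 120000], [24.0, 28000],
]
y = [0, 1, 0, 1, 0, 1]  # 1 = defaulted, 0 = paid back

clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# With out_file=None the DOT source is returned as a string
# instead of being written to disk.
dot = export_graphviz(
    clf,
    out_file=None,
    feature_names=["interest_rate", "annual_income"],  # hypothetical names
    class_names=["paid", "default"],                   # hypothetical labels
    filled=True,
    rounded=True,
)
print(dot)
```

If you save that text to a file such as `loan_tree.dot` and have Graphviz installed, `dot -Tpng loan_tree.dot -o loan_tree.png` should render it as an image.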

You can get the data here: https://www.lendingclub.com/info/download-data.action

Feel free to install Python along with pandas and scikit-learn and give it a shot yourself! I am sure there are things I could improve, but the model picked up on the proper drivers without any prompting from me. Scikit-learn has a lot of interesting features that I am only starting to wrap my head around.

 

 


Author: secondhandstocks

The genesis of this blog was when a Marine buddy and I came back from Afghanistan with more money than knowledge, and heedlessly tossed our hats into the stock market ring. A few months later, I remember discovering the classic book The Intelligent Investor by Benjamin Graham, and ravenously devouring my first introduction to value investing. That framework - with some generous additions by Seth Klarman and Joel Greenblatt, among others - guides my investment philosophy. I spent five years working in the intelligence field, first in the Marine Corps and then for a government agency. I speak Arabic and Pashto, have programming and analysis experience, and enjoy investing in technology companies as a hobby. I also spent a year on Wall Street working on a #1-ranked Institutional Investor team, before deciding that the sell side was not for me.
