Nearest Neighbors and Feature Engineering

    Loan Data

    Ready to put your coding skills to the test? Join us for our Workspace Competition!
    For more information, visit datacamp.com/workspacecompetition

    Context

    This dataset (source) contains records for almost 10,000 borrowers who took out loans, some of which have been paid back while others are still in progress. It was extracted from lendingclub.com, an organization that connects borrowers with investors. We've included a few suggested questions at the end of this template to help you get started.

    Load packages

    library(skimr)
    library(tidyverse)

    Load your Data

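    # Read the gzipped CSV of loan records
    loans <- readr::read_csv('data/loans.csv.gz')
    # Summarise every column, dropping the percentile and completeness columns for a compact overview
    skim(loans) %>%
      select(-(numeric.p0:numeric.p100)) %>%
      select(-complete_rate)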

    Understand your data

    Variable | Class | Description
    credit_policy | numeric | 1 if the customer meets the credit underwriting criteria; 0 otherwise.
    purpose | character | The purpose of the loan.
    int_rate | numeric | The interest rate of the loan (more risky borrowers are assigned higher interest rates).
    installment | numeric | The monthly installments owed by the borrower if the loan is funded.
    log_annual_inc | numeric | The natural log of the self-reported annual income of the borrower.
    dti | numeric | The debt-to-income ratio of the borrower (amount of debt divided by annual income).
    fico | numeric | The FICO credit score of the borrower.
    days_with_cr_line | numeric | The number of days the borrower has had a credit line.
    revol_bal | numeric | The borrower's revolving balance (amount unpaid at the end of the credit card billing cycle).
    revol_util | numeric | The borrower's revolving line utilization rate (the amount of the credit line used relative to total credit available).
    inq_last_6mths | numeric | The borrower's number of inquiries by creditors in the last 6 months.
    delinq_2yrs | numeric | The number of times the borrower had been 30+ days past due on a payment in the past 2 years.
    pub_rec | numeric | The borrower's number of derogatory public records.
    not_fully_paid | numeric | 1 if the loan is not fully paid; 0 otherwise.
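
    The column definitions above suggest a few simple derived features. The following is a minimal sketch, not part of the original template: the helper name add_derived_features and the new column names are hypothetical, and the installment-to-income ratio assumes that installment and annual income are expressed in the same currency unit.

```r
library(dplyr)

# Hypothetical helper (not part of the template): derive a few features
# directly from the column definitions in the table above.
add_derived_features <- function(loans) {
  loans %>%
    mutate(
      # log_annual_inc is a natural log, so exp() recovers the income itself
      annual_inc = exp(log_annual_inc),
      # Convert the length of the credit history from days to approximate years
      years_with_cr_line = days_with_cr_line / 365.25,
      # Assumption: share of monthly income taken up by the monthly installment
      installment_to_income = installment / (annual_inc / 12)
    )
}

# Usage, assuming `loans` was loaded as in "Load your Data":
# loans_fe <- add_derived_features(loans)
```

    Working on the original income scale can make ratios such as installment-to-income easier to interpret, while the logged column may remain the better input for models that are sensitive to skew.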

    Now you can start exploring this dataset, with the chance to win incredible prizes! Can't think of where to start? Try your hand at these suggestions:

    • Extract useful insights and visualize them in the most interesting way possible.
    • Find out how long it takes for users to pay back their loan.
    • Build a model that can predict the probability a user will be able to pay back their loan within a certain period.
    • Find out what kind of people take a loan for what purposes.
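
    For the modelling suggestion above, one possible starting point is a logistic regression on the not_fully_paid flag. This is only a sketch under stated assumptions: the helper names fit_repayment_model and predict_repayment are hypothetical, the choice of predictors (fico, int_rate, dti) is purely illustrative, and the sketch ignores the "within a certain period" part of the question because the variable table lists no repayment-time column.

```r
# Hypothetical helper: logistic regression for the probability of full repayment.
# The predictor choice (fico, int_rate, dti) is an assumption for illustration.
fit_repayment_model <- function(loans) {
  # Recode the outcome so that 1 means the loan was fully paid
  loans$fully_paid <- 1 - loans$not_fully_paid
  glm(fully_paid ~ fico + int_rate + dti, data = loans, family = binomial())
}

# Predicted probability of full repayment for each row of `new_loans`
predict_repayment <- function(model, new_loans) {
  unname(predict(model, newdata = new_loans, type = "response"))
}

# Usage, assuming `loans` was loaded as in "Load your Data":
# model <- fit_repayment_model(loans)
# head(predict_repayment(model, loans))
```

    Evaluating such a model on held-out rows rather than on the training data gives a more honest picture of how well the predicted probabilities generalise.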

    Judging Criteria

    Category | Weightage | Details
    Analysis | 30%
    • Documentation on the goal and what was included in the analysis
    • How the question was approached
    • Visualisation tools and techniques utilized
    Results | 30%
    • How the derived results relate to the problem chosen
    • The ability to trigger potential further analysis
    Creativity | 40%
    • How "out of the box" the conducted analysis is
    • Whether the publication is properly motivated and adds value