Unsupervised Learning in Python

    👋 Welcome to your new workspace! Here, you can experiment with the data you used in Unsupervised Learning in Python and practice your newly learned skills with a challenge.

    Below is a code cell that imports the course packages and loads in the course datasets as pandas DataFrames.

    🏃 To execute the code, click inside the cell to select it and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell and automatically switch to the next cell.

    # Import the course packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import sklearn
    import scipy.stats 
    
    # Import the course datasets as DataFrames
    grains = pd.read_csv('datasets/grains.csv')
    fish = pd.read_csv('datasets/fish.csv', header=None)
    wine = pd.read_csv('datasets/wine.csv')
    eurovision = pd.read_csv('datasets/eurovision-2016.csv')
    stocks = pd.read_csv('datasets/company-stock-movements-2010-2015-incl.csv', index_col=0)
    digits = pd.read_csv('datasets/lcd-digits.csv', header=None)
    
    # Preview the first DataFrame
    grains

    Challenge Yourself

    Don't know where to start? Add code to the code cell below to try the following challenge:

    You work for an agricultural research center. Your manager wants you to group seed varieties based on different measurements contained in the grains DataFrame. They also want to know how your clustering solution compares to the seed types listed in the dataset (the variety_number and variety columns).

    Try to use all of the relevant techniques you learned in Unsupervised Learning in Python!

    Reminder: To execute the code you add to a cell, click inside the cell to select it and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell and automatically switch to the next cell.

    # Use this cell (and add others as needed) to cluster the grains data!
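
    One possible starting point for the challenge, offered as a minimal sketch rather than the course's reference solution: standardize the measurement columns, fit k-means, and cross-tabulate the cluster labels against the variety column. The helper names inertia_by_k and cluster_and_compare are hypothetical, the default of three clusters is an assumption to check against the inertia values, and the sketch assumes every column other than variety_number and variety is a numeric measurement.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def inertia_by_k(features, k_values=range(1, 7), random_state=0):
    """Return {k: inertia} for k-means fitted on standardized features."""
    scaled = StandardScaler().fit_transform(features)
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=random_state)
        .fit(scaled)
        .inertia_
        for k in k_values
    }


def cluster_and_compare(
    df,
    label_cols=("variety_number", "variety"),
    n_clusters=3,  # assumption: confirm with inertia_by_k before relying on it
    random_state=0,
):
    """Cluster the measurement columns and cross-tabulate against the last label column."""
    # Drop the known labels so they cannot leak into the clustering.
    features = df.drop(columns=list(label_cols)).select_dtypes("number")
    pipeline = make_pipeline(
        StandardScaler(),
        KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state),
    )
    cluster_labels = pipeline.fit_predict(features)
    true_col = label_cols[-1]
    return pd.crosstab(
        cluster_labels,
        df[true_col].to_numpy(),
        rownames=["cluster"],
        colnames=[true_col],
    )


# In the workspace, after the loading cell has run:
# print(inertia_by_k(grains.drop(columns=["variety_number", "variety"])))
# print(cluster_and_compare(grains))
```

    The cross-tabulation has one row per cluster and one column per variety. A clustering that matches the listed seed types well puts most of each variety's count in a single row.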
    

    Continue to Explore

    Feeling confident about your skills? Continue to Machine Learning with Tree-Based Models in Python, or check out the other Machine Learning Scientist with Python Career Track courses to learn other advanced machine learning techniques.

    If you're interested in exploring the remaining course datasets, you can refer to the DataFrames and potential problems below:

    • fish: Each row represents an individual fish. Standardize the features and cluster the fish by their measurements. You can then compare your cluster labels with the actual fish species (first column).
    • wine: There are three class_labels in this dataset. Transform the features to get the most accurate clustering.
    • eurovision: Perform hierarchical clustering of the voting countries using complete linkage and plot the resulting dendrogram.
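
    For the eurovision task, a hedged sketch of complete-linkage hierarchical clustering with SciPy follows. It assumes the votes have already been arranged into a numeric table with one row per voting country. Building that table depends on the column names in eurovision, which are not listed here, so that step is left out. The function name plot_complete_linkage is hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage


def plot_complete_linkage(samples, labels, ax=None):
    """Cluster the rows of `samples` with complete linkage and draw a dendrogram.

    `samples` is a 2-D numeric array (one row per item to cluster) and
    `labels` holds one display name per row. Returns the linkage matrix.
    """
    samples = np.asarray(samples, dtype=float)
    # Complete linkage: the distance between two clusters is the largest
    # pairwise (Euclidean, by default) distance between their members.
    mergings = linkage(samples, method="complete")
    if ax is None:
        _, ax = plt.subplots(figsize=(12, 5))
    dendrogram(
        mergings,
        labels=list(labels),
        leaf_rotation=90,
        leaf_font_size=8,
        ax=ax,
    )
    return mergings


# In the workspace, with a hypothetical `votes` table indexed by voting country:
# plot_complete_linkage(votes.to_numpy(), votes.index)
# plt.show()
```

    The height at which two branches join is the complete-linkage distance between those clusters, so countries that merge low in the dendrogram voted most alike.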