Spotify Music Data
  • AI Chat
  • Code
  • Report
  • Beta
    Spinner

    Spotify Music Data

    This dataset consists of ~600 songs that were in the top songs of the year from 2010 to 2019 (as measured by Billboard). You can explore interesting song data pulled from Spotify such as the beats per minute, amount of spoken words, loudness, and energy of every song.

    Not sure where to begin? Scroll to the bottom to find challenges!

    from scipy.stats import norm
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    %matplotlib inline
    
    music =  pd.read_csv("spotify_top_music.csv", index_col=0)
    music.head()

    Data dictionary

    VariableExplanation
    0titleThe title of the song
    1artistThe artist of the song
    2top genreThe genre of the song
    3yearThe year the song was in the Billboard
    4bpmBeats per minute: the tempo of the song
    5nrgyThe energy of the song: higher values mean more energetic (fast, loud)
    6dnceThe danceability of the song: higher values mean it's easier to dance to
    7dBDecibel: the loudness of the song
    8liveLiveness: likeliness the song was recorded with a live audience
    9valValence: higher values mean a more positive sound (happy, cheerful)
    10durThe duration of the song
    11acousThe acousticness of the song: likeliness the song is acoustic
    12spchSpeechines: higher values mean more spoken words
    13popPopularity: higher values mean more popular

    Source of dataset.

    Don't know where to start?

    Challenges are brief tasks designed to help you practice specific skills:

    • πŸ—ΊοΈ Explore: Which artists and genres are the most popular?
    • πŸ“Š Visualize: Visualize the numeric values as a time-series by year. Can you spot any changes over the years?
    • πŸ”Ž Analyze: Train and build a classifier to predict a song's genre based on columns 3 to 13.

    Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

    Your friend, who is an aspiring musician, wants to make a hit song and has asked you to use your data skills to help her. You have decided to analyze what makes a top song, keeping in mind changes over the years. What concrete recommendations can you give her before she writes lyrics, makes beats, and records the song? She's open to any genre!

    You will need to prepare a report that is accessible to a broad audience. It will need to outline your motivation, analysis steps, findings, and conclusions.


    ✍️ If you have an idea for an interesting Scenario or Challenge, or have feedback on our existing ones, let us know! You can submit feedback by pressing the question mark in the top right corner of the screen and selecting "Give Feedback". Include the phrase "Content Feedback" to help us flag it in our system.

    Read rows and columns

    music.shape

    The type of the data

    music.dtypes

    If there any missing value

    music.isna().any().sum()
    # There any duplicate values
    music.duplicated()
    music.head(5)
    

    Count your values use groupby

    artist_pop = music.groupby(['title', 'artist', 'top genre'])['pop'].count()
    artist_pop
    β€Œ
    β€Œ
    β€Œ