Skip to content
[Solution] Analyzing Students' Mental Health in SQL
  • AI Chat
  • Code
  • Report
  • Spinner

    Analyzing Students' Mental Health in SQL

    In this live code-along, you'll perform exploratory data analysis on a dataset around mental health of domestic and international students. You'll perform SQL querying to look at how social connectedness and cultural issues affect mental health. Finally, you'll visualize the results of your analysis using the Python Plotly package.

    The Data

    This survey was conducted in 2018 at an international Japanese university and the associated study was published in 2019. It was approved by several ethical and regulatory boards.

    The study found that international students have a higher risk of mental health difficulties compared to the general population, and that social connectedness and acculturative stress are predictive of depression.

    Social connectedness: measure of belonging to a social group or network.

    Acculturative stress: stress associated with learning about and intergrating into a new culture.

    See paper for more info, including data description.

    Link to the data.

    Inspect the Data

    Our data is in one table that includes all of the survey data. There are 50 fields and, according to the paper, 268 records. Each row is a student.

    You can check the schema on the left.

    1. Check if the data has 268 records.
    Unknown integration
    DataFrameavailable as
    df
    variable
    -- Count the number of records in the table
    SELECT COUNT(*) AS total_records
    FROM students;
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    1. Inspect the dataset to see what the fields look like.
    Unknown integration
    DataFrameavailable as
    df
    variable
    -- Inspect the data and limit the output to 5 records
    SELECT *
    FROM students
    LIMIT 5;
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    1. How many international and domestic students are in the data set?
    Unknown integration
    DataFrameavailable as
    df
    variable
    -- Count the number of international and domestic students
    SELECT inter_dom, COUNT(inter_dom) AS count_inter_dom
    FROM students
    GROUP BY inter_dom;
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    1. Look into the 18 unassigned rows to understand what they could be.
    Unknown integration
    DataFrameavailable as
    df
    variable
    -- Query the data to see all records where inter_dom is neither 'Dom' nor 'Inter'
    SELECT *
    FROM students
    WHERE inter_dom NOT LIKE 'D%' AND inter_dom NOT LIKE 'I%';
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    1. Where are the international students from?
    Unknown integration
    DataFrameavailable as
    df
    variable
    -- See what Region international students are from
    SELECT "Region", COUNT(inter_dom) AS count_inter_dom
    FROM students
    WHERE inter_dom = 'Inter'
    GROUP BY "Region";
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    Understanding the scores

    1. Find the minimum, maximum, and average of each of the diagnostic tests (PHQ-9, SCS, ASISS). This information is in the paper, but it's good practice to look this up yourself during analysis.