Competition - hospital readmissions

    Reducing hospital readmissions

    📖 Background

    You work for a consulting company helping a hospital group better understand patient readmissions. The hospital gave you access to ten years of information on patients readmitted to the hospital after being discharged. The doctors want you to assess if initial diagnoses, number of procedures, or other variables could help them better understand the probability of readmission.

    They want to focus follow-up calls and attention on those patients with a higher probability of readmission.

    💾 The data

    You have access to ten years of patient information (source):

    Information in the file
    • "age" - age bracket of the patient
    • "time_in_hospital" - days (from 1 to 14)
    • "n_procedures" - number of procedures performed during the hospital stay
    • "n_lab_procedures" - number of laboratory procedures performed during the hospital stay
    • "n_medications" - number of medications administered during the hospital stay
    • "n_outpatient" - number of outpatient visits in the year before a hospital stay
    • "n_inpatient" - number of inpatient visits in the year before the hospital stay
    • "n_emergency" - number of visits to the emergency room in the year before the hospital stay
    • "medical_specialty" - the specialty of the admitting physician
    • "diag_1" - primary diagnosis (Circulatory, Respiratory, Digestive, etc.)
    • "diag_2" - secondary diagnosis
    • "diag_3" - additional secondary diagnosis
    • "glucose_test" - whether the glucose serum came out as high (> 200), normal, or not performed
    • "A1Ctest" - whether the A1C level of the patient came out as high (> 7%), normal, or not performed
    • "change" - whether there was a change in the diabetes medication ('yes' or 'no')
    • "diabetes_med" - whether a diabetes medication was prescribed ('yes' or 'no')
    • "readmitted" - if the patient was readmitted at the hospital ('yes' or 'no')

    Acknowledgments: Beata Strack, Jonathan P. DeShazo, Chris Gennings, Juan L. Olmo, Sebastian Ventura, Krzysztof J. Cios, and John N. Clore, "Impact of HbA1c Measurement on Hospital Readmission Rates: Analysis of 70,000 Clinical Database Patient Records," BioMed Research International, vol. 2014, Article ID 781670, 11 pages, 2014.

    import pandas as pd
    df = pd.read_csv('data/hospital_readmissions.csv')
    df.head()

    💪 Competition challenge

    Create a report that covers the following:

    1. What is the most common primary diagnosis by age group?
    2. Some doctors believe diabetes might play a central role in readmission. Explore the effect of a diabetes diagnosis on readmission rates.
    3. On what groups of patients should the hospital focus their follow-up efforts to better monitor patients with a high probability of readmission?

    RECOMMENDATIONS:

    • The top three primary diagnoses across patient age groups are Circulatory, Respiratory, and Digestive.

    • Diabetes does not play a significant role in readmission, and it has low power in predicting the readmission rate. Patients were grouped into diabetes (diabetes as the primary diagnosis) and non-diabetes (no record of diabetes in the primary or secondary diagnoses). Diabetes patients have a slightly higher probability of readmission than the other groups, and no notable change was seen in other features such as diabetes medication or change in medication.

    • The features that can help the hospital identify patients with a high probability of readmission are the number of inpatient visits (n_inpatient), whether diabetes medication was prescribed, age, number of emergency visits, primary diagnosis, number of outpatient visits, and number of procedures.

    1. What is the most common primary diagnosis by age group?

    import seaborn as sns
    import matplotlib.pyplot as plt

    # The 'seaborn-darkgrid' style name was removed in newer matplotlib releases,
    # so the style is set through seaborn instead.
    sns.set_style('darkgrid')
    sns.set_palette('Accent')

    # Number of patients in each age bracket
    df['age'].value_counts().plot(kind='bar')
    plt.title('NUMBER OF PATIENTS BY AGE GROUP')
    plt.xlabel('Age group')
    plt.ylabel('Number of patients')
    plt.show()

    # Primary diagnosis counts per age bracket, excluding rows with a missing diagnosis
    order = ['Circulatory', 'Respiratory', 'Diabetes', 'Digestive', 'Injury', 'Musculoskeletal', 'Other']
    plt.figure(figsize=(8, 5))
    sns.countplot(data=df[df['diag_1'] != 'Missing'], x='age', hue='diag_1', hue_order=order)
    plt.title('PRIMARY DIAGNOSIS BY AGE GROUP')
    plt.xlabel('Age group')
    plt.legend(framealpha=0.2, fontsize='small')
    plt.show()

    Insights:

    From the primary diagnosis visual, four diagnoses are common across age groups; the top three are Circulatory (the most common), then Respiratory, then Digestive.

    "Other" in the visuals can be ignored since it consists of the less frequent diseases.
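
    The ranking read off the chart can also be checked numerically. The following is a minimal sketch, not the report's own code: the helper name top_diagnosis_by_age and the toy rows are made up for illustration, and it assumes the "age" and "diag_1" columns described in the data section, with 'Missing' marking an unrecorded diagnosis.

```python
import pandas as pd


def top_diagnosis_by_age(df: pd.DataFrame, exclude=("Missing",)) -> pd.Series:
    """Return the most frequent primary diagnosis for each age bracket."""
    # Drop rows whose primary diagnosis is in the excluded labels
    known = df[~df["diag_1"].isin(list(exclude))]
    # Rows: age bracket, columns: diagnosis, values: patient counts
    counts = pd.crosstab(known["age"], known["diag_1"])
    # idxmax(axis=1) picks the diagnosis with the largest count in each row
    return counts.idxmax(axis=1)


# Toy rows, invented for illustration only (not taken from the hospital file)
toy = pd.DataFrame({
    "age": ["[40-50)", "[40-50)", "[40-50)", "[70-80)", "[70-80)", "[70-80)"],
    "diag_1": ["Other", "Other", "Circulatory",
               "Circulatory", "Circulatory", "Missing"],
})

with_other = top_diagnosis_by_age(toy)
without_other = top_diagnosis_by_age(toy, exclude=("Missing", "Other"))
print(with_other)
print(without_other)
```

    Passing exclude=("Missing", "Other") mirrors the choice above to ignore the "Other" category; on the real data the same call would be top_diagnosis_by_age(df).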

    2. Exploring the effect of diabetes on readmission rate.

    To explore the effect of diabetes on the readmission rate, we classify the patients into three groups:

    DIABETES PATIENTS: Patients with diabetes recorded as the primary diagnosis.

    NON-DIABETES PATIENTS: Patients with no record of diabetes in the primary, secondary, or additional secondary diagnosis.

    ANY-DIABETES PATIENTS: Patients with diabetes recorded in the primary, secondary, or additional secondary diagnosis.
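
    The original cell for this grouping is not shown, so the following is only a sketch of how the three groups could be built. It assumes the diagnosis columns use the label 'Diabetes' and that "readmitted" holds 'yes' or 'no'; the function name and the toy rows are hypothetical. Note that the primary-diagnosis group is a subset of the any-diabetes group, so the groups overlap.

```python
import pandas as pd

DIAG_COLS = ["diag_1", "diag_2", "diag_3"]


def readmission_rate_by_diabetes_group(df: pd.DataFrame) -> pd.Series:
    """Share of readmitted patients in each diabetes group (groups overlap)."""
    # True wherever a diagnosis column is recorded as 'Diabetes'
    is_diabetes = df[DIAG_COLS].eq("Diabetes")
    any_diabetes = is_diabetes.any(axis=1)
    masks = {
        "diabetes (primary)": is_diabetes["diag_1"],
        "any diabetes": any_diabetes,
        "non-diabetes": ~any_diabetes,
    }
    readmitted = df["readmitted"].eq("yes")
    # The mean of a boolean Series is the fraction of True values
    return pd.Series({name: readmitted[mask].mean() for name, mask in masks.items()})


# Toy rows, invented for illustration only (not taken from the hospital file)
toy = pd.DataFrame({
    "diag_1": ["Diabetes", "Diabetes", "Circulatory", "Respiratory", "Circulatory", "Digestive"],
    "diag_2": ["Other", "Circulatory", "Diabetes", "Other", "Other", "Other"],
    "diag_3": ["Other", "Other", "Other", "Diabetes", "Other", "Respiratory"],
    "readmitted": ["yes", "no", "yes", "yes", "no", "yes"],
})

rates = readmission_rate_by_diabetes_group(toy)
print(rates)
```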


    Patients with diabetes as the primary diagnosis have a larger proportion of readmissions. Patients with diagnoses other than diabetes also have a large proportion of readmissions, so there is a need to look into the factors that contribute to readmission for non-diabetes patients as well, such as primary diagnosis, diabetes medication, change in medication, and others.
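
    One simple way to start looking at such factors is to compare the readmission rate across the values of a single column. This is a hedged sketch rather than the analysis used in the report: readmission_rate_by is a hypothetical helper, the toy rows are invented, and it assumes "readmitted" holds 'yes' or 'no' as described in the data section.

```python
import pandas as pd


def readmission_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """Fraction of readmitted patients for each value of `column`."""
    readmitted = df["readmitted"].eq("yes")
    # Group the boolean flag by the chosen column and average it per group
    return readmitted.groupby(df[column]).mean()


# Toy rows, invented for illustration only (not taken from the hospital file)
toy = pd.DataFrame({
    "diabetes_med": ["yes", "yes", "yes", "no", "no"],
    "readmitted": ["yes", "yes", "no", "no", "yes"],
})

med_rates = readmission_rate_by(toy, "diabetes_med")
print(med_rates)
```

    The same call works for the other candidate columns, for example readmission_rate_by(df, "change") or readmission_rate_by(df, "n_inpatient").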
