Avenues Trust Group Performance Analysis - Part One
Project Brief
The following brief was provider by Group Director of Operations Dan Gower Smith:
Scope: To have CQC data in a format e.g. graphs that we can share with Trustees.
Avenues charities:
- Avenues South East
- Avenues London
- Avenues East
- Avenues South
- Autism Hampshire
For each charity they have “registered locations” e.g. care homes or personal care. Regardless of the type of provision they have an “overall rating” and “domain ratings”.
For Each charity, I would like a graph that show the “domain ratings” and “overall ratings”.
Then an Avenues Group Summary that reports on the “overall ratings”. If we could go back to 2017 that would be fab and include “archived” services that would be useful but I don’t know how that will look.
Once we know that this is possible it would be good to compare the Group summary to our competitors. I can provide 4 or 5 links.
Latest Ratings Analysis
In the emailed brief, weblinks for the CQC pages related to each charity were provided. These links contained the Provider IDs, which have been listed below, both individually and as a group, so that the data can be filtered as necessary.
In a follow-up email, MR Gower Smith provided a list of the competitor charities.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Provider IDs for Avenues Trust charities
# individual providers
avenues_south_east_id = '1-102643044'
avenues_london_id = '1-119235881'
avenues_east_id = '1-101692736'
avenues_south_id = '1-101652247'
autism_hampshire_id = '1-101613059'
# all Avenues Trust providers
all_avenues_ids = [avenues_south_east_id, avenues_london_id, avenues_east_id, avenues_south_id, autism_hampshire_id]
# Provider IDs for Avenues Trust's competitor charities
# Certitude Providers
southside_partnership_id = '1-101652731'
support_for_living_id = '1-101648408'
# Dimensions
dimensions_somerset_sev_id = '1-3241092145'
dimensions_uk_id = '1-349152381'
# Others
active_prospects_id = '1-101650500'
fitzroy_support_id = '1-102642772'
choice_support_id = '1-15126086868'
# all competitors
all_competitor_ids = [southside_partnership_id, support_for_living_id, dimensions_somerset_sev_id,
dimensions_uk_id, active_prospects_id, fitzroy_support_id, choice_support_id]
# all charities of interest
all_provider_ids = all_avenues_ids + all_competitor_ids
Data Collection & Cleaning
The CQC website provides a downloadable care directory with ratings files in Excel format. There are monthly updates for each year since 2015, each containing details of all locations, along with their domain and overall ratings. Each Excel file contained two data sheets, one for providers and the other locations.
The first part of the analysis involves only the most recent ratings for all providers. The latest Excel file was downloaded and the locations sheet converted into a CSV file.
# import the data
feb_ratings = pd.read_csv('cqc_2024_feb.csv')
feb_ratings.info()
Many of the columns in the DataFrame are unnecessary for the analysis. We can see above that there are no null values in the columns being kept.
cols_to_keep = ['Location ID',
'Location Name',
'Provider ID',
'Provider Name',
'Domain',
'Latest Rating',
'Inherited Rating (Y/N)',
'Publication Date']
feb_ratings = feb_ratings[cols_to_keep]
feb_ratings.head()
We have two sets of data within the document that we are interested in: Avenues Providers and their competitors providers. We can create two new dataframes with the respective providers only.
avenues_ratings = feb_ratings[feb_ratings['Provider ID'].isin(all_avenues_ids)]
comp_ratings = feb_ratings[feb_ratings['Provider ID'].isin(all_competitor_ids)]
display(avenues_ratings.head())
display(comp_ratings.head())
Data Analysis
Avenues Trust Providers
The first part of the brief is to graph the performance of the Avenues Trust charities, showing the ratings for each domain, as well as the overall rating.
Since a several different graphs are required with the same labels and parameters, functions will be written to plot the graphs, taking the dataframe of interest and the title as arguements.
# set the order for the domains as shown on the CQC website
domain_order = ['Overall', 'Safe', 'Effective', 'Caring', 'Responsive', 'Well-led']
# set the order for the ratings from best to worst
rating_order = ['Outstanding', 'Good', 'Requires improvement', 'Inadequate']
palette = ['green', 'yellowgreen', 'orange', 'red']
# assign the columns to be used to variables to avoid errors in typing string values repeatedly
x = 'Domain'
hue = 'Latest Rating'
col = 'Provider Name'
y = '% of Locations'
# set style
plt.style.use('fivethirtyeight')
def single_count_plot_all_domains(df, title):
sns.countplot(data=df, x=x, hue=hue, order=domain_order, hue_order=rating_order, palette=palette)
plt.ylabel('Number of Locations')
plt.title(title, fontweight='bold')
plt.xticks(rotation=45)
plt.xlabel('')
plt.legend(bbox_to_anchor=(1, 0.6))
single_count_plot_all_domains(avenues_ratings, 'All Avenues Trust Locations')
def multiple_count_plots_all_domains(df, cols, title):
g = sns.FacetGrid(data=df, col=col, col_wrap=cols, height=4, sharex=False, margin_titles=True)
g.map(sns.countplot, x, hue=hue, data=df, order=domain_order, hue_order=rating_order, palette=palette)
g.set_xticklabels(rotation=45, fontsize=10)
g.set_titles("{col_name}")
g.fig.suptitle(title, fontsize=20, fontweight='bold')
g.set_xlabels('')
g.set_ylabels('No. Locations', fontsize=12)
plt.tight_layout()
for ax in g.axes:
for container in ax.containers:
plt.setp(container, width=0.2)
plt.legend(bbox_to_anchor=(1.1, 0.3), loc='lower left', borderaxespad=0.)
multiple_count_plots_all_domains(avenues_ratings, 3, 'Avenues Trust Providers Ratings Count')