Samuel Verevis’ workbooks















Why staff are leaving, and why you should care

knitr::opts_chunk$set(echo = FALSE, message = FALSE)

##
## Install the packages used below (only needed once)
##

  install.packages(c("naniar", "styler", "Metrics", "ranger", "ggridges", "reactable", "wesanderson", "reactablefmtr"))

## Load packages
##
  library(tidyverse)
  library(naniar)
  library(styler)
  library(broom)
  library(Metrics)
  library(rsample)
  library(ranger)
  library(ggridges)
  library(viridis)
  library(reactable)
  library(wesanderson)
  library(reactablefmtr)

## Load data

  df <- readr::read_csv('./data/employee_churn_data.csv')
  
## Set the default plot theme
  theme_set(theme_minimal(base_family = "Lato") +
    theme(
      text = element_text(color = "gray12"),

      # Customize the text in the title, subtitle, and caption
      plot.title = element_text(face = "bold", size = 14, hjust = 0.05),
      plot.subtitle = element_text(size = 10, hjust = 0.05),
      plot.caption = element_text(size = 10, hjust = .5),

      # Make the background white and remove grid lines
      panel.background = element_rect(fill = "white", color = "white"),
      panel.grid = element_blank(),
      panel.grid.major.x = element_blank()
    ))
 

Key recommendations (tl;dr)

Armed with this knowledge, the company can (and should) do more to intervene and reduce turnover to below 20% across departments. This could be done by:

  • Increasing the share of staff promoted, proportionately across departments,
  • Increasing the number of bonuses issued as a way of recognizing strong performance reviews,
  • Introducing other benefits that recognize hard work, such as extra days off or paid development opportunities,
  • Targeting interventions at employees who have only been at the company for a short time.

The high costs of staff turnover

The company's high attrition rate, left unchecked, may lead to lower productivity, inefficiencies, and ultimately higher costs...

A high attrition rate can lead to lower company productivity, inefficiencies, and loss of institutional knowledge. Some departures are inevitable and typically outside the company's influence, but minimizing avoidable departures will lead to better outcomes for the company and its staff.

The overall attrition rate for this company is 29.2%: 2,784 of the 9,540 employees left over the past 24 months. This high turnover is likely leaving large skill gaps and putting further strain on remaining staff, so more needs to be done to prevent staff from leaving.
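
The headline rate can be checked directly from the two counts quoted above. This is a minimal sketch in base R; `n_left` and `n_total` are illustrative names, not objects from the workbook.

```r
# Counts quoted in the text: leavers and total headcount
n_left  <- 2784
n_total <- 9540

# Overall attrition rate as a percentage, rounded to one decimal place
attrition_rate <- round(100 * n_left / n_total, 1)
print(attrition_rate)  # 29.2
```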

...The majority of the staff that leave have been working for the company for between 5 and 8 years...

About 91% of the staff that left had been working for the company for between 5 and 8 years. Given their seniority, these employees likely have deep institutional knowledge and skills, which makes them difficult and costly to replace and means their successors will take time to train.

However, relative attrition rates across tenure cohorts appear highest for employees in the early stages of their careers, with 66% of staff in the first-2-years cohort leaving. The rate decreases with tenure, but spikes again for those who have been at the company for 8 years.
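
The two paragraphs above use different denominators: the share of all leavers that falls in a tenure band, and the attrition rate within each tenure cohort. The sketch below shows the distinction in base R. The `toy` data frame is hypothetical and is not the workbook's dataset; only the column names `tenure` and `left` follow the workbook.

```r
# Toy data (hypothetical values, NOT the workbook's dataset)
toy <- data.frame(
  tenure = c(2, 2, 2, 5, 5, 5, 5, 5, 5, 5),
  left   = c("yes", "yes", "no", "yes", "yes", "yes", "no", "no", "no", "no")
)

# Counts of stayers ("no") and leavers ("yes") per tenure cohort
counts <- table(toy$tenure, toy$left)

# Share of ALL leavers that each cohort accounts for (each column sums to 1)
share_of_leavers <- prop.table(counts, margin = 2)[, "yes"]

# Attrition rate WITHIN each cohort (each row sums to 1)
rate_within_cohort <- prop.table(counts, margin = 1)[, "yes"]

print(round(share_of_leavers, 2))
print(round(rate_within_cohort, 2))
```

In this toy example the 5-year cohort accounts for most leavers (60%), yet the 2-year cohort has the higher within-cohort rate (67% against 43%). That is the same pattern the text describes, where a cohort can dominate in absolute terms without having the highest relative rate.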

## TODO: convert the counts to % of total staff so they are on the same scale as the review numbers?
##
## Number of staff by tenure, split by whether they left
  ggplot(df, aes(tenure, fill = left)) +
    geom_bar(position = "dodge2", alpha = 0.7) +
    scale_fill_manual(values = c("#6C5B7B", "#C06C84")) +
    theme(
      # Remove axis titles and ticks
      axis.title = element_blank(),
      axis.ticks = element_blank(),
      # Use gray text for the tenure labels
      axis.text.x = element_text(color = "gray12", size = 12),
      # Move the legend to the bottom
      legend.position = "bottom"
    )

## Share of staff who left within each tenure cohort
  ggplot(df, aes(tenure, fill = left)) +
    geom_bar(position = "fill", alpha = 0.7) +
    scale_fill_manual(values = wes_palette("GrandBudapest1")) +
    theme(
      # Remove axis titles and ticks
      axis.title = element_blank(),
      axis.ticks = element_blank(),
      # Use gray text for the tenure labels
      axis.text.x = element_text(color = "gray12", size = 12),
      # Move the legend to the bottom
      legend.position = "bottom"
    )
      

...With little variation in attrition rates between departments, suggesting the issue is systemic in nature

Attrition rates across the company's 10 departments vary only slightly from the overall rate of 29%. IT, Logistics and Retail have the highest attrition rate at 31%, followed by Marketing at 30%. The remaining departments are below 30%, with Finance having the lowest turnover rate at 27%. In absolute terms, Sales, Retail, Engineering and Operations had the largest number of staff leaving (1,881 in total). Overall, the difference in turnover across departments is marginal, which suggests that the reasons for leaving are systemic rather than department-specific.


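# Department share of total staff (perc_dep) joined to its attrition rate (attr)
dep_left <- df %>%
  group_by(department) %>%
  summarise(n = n()) %>% ungroup() %>%
  mutate(perc_dep = n / sum(n) * 100) %>%
  left_join(
    df %>% group_by(department, left) %>% summarise(n = n()) %>%
      # Still grouped by department, so sum(n) is the department headcount
      mutate(attr = n / sum(n) * 100) %>%
      filter(left == "yes") %>% select(-n, -left),
    by = "department"
  )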

ggplot(dep_left) +
  geom_hline(
    aes(yintercept = y), 
    data.frame(y = c(0:3) * 10),
    color = "lightgrey"
  ) + 
    geom_col(
      aes(x = reorder(department, perc_dep),
          y = perc_dep, fill = perc_dep),
          position = "dodge2",show.legend = TRUE, alpha = .9) +
        coord_polar() +
  geom_point(
    aes(x = reorder(department, perc_dep), y = attr), size = 3, color = "gray12") +
  geom_segment(
              aes(x = reorder(department, perc_dep), y = 0, 
               xend =  reorder(department, perc_dep),
               yend = 30), linetype = "dashed", color = "grey12") +
  scale_y_continuous(
    limits = c(-10, 40),
    expand = c(0, 0),
    breaks = c(0, 10,20,30)
  ) + 
  # Fill gradient and legend title for each department's share of staff
  scale_fill_gradientn(
    "Percent of staff by department",
     colours = c( "#6C5B7B","#C06C84","#F67280","#F8B195")
  ) +
  # Make the guide for the fill discrete
  guides(
    fill = guide_colorsteps(
      barwidth = 15, barheight = .5, title.position = "top", title.hjust = .5
    )
  ) +
  theme(
    # Remove axis titles, ticks, and the radial axis text
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    axis.text.y = element_blank(),
    # Use gray text for the department names
    axis.text.x = element_text(color = "gray12", size = 12),
    # Move the legend to the bottom
    legend.position = "bottom"
  ) +
  annotate(
    x = 1.1, 
    y = 15,
    label = "Attrition rate",
    geom = "text",
    angle = 72,
    color = "gray12",
    size = 2.5,
    family = "Bell MT"
  ) +
  annotate(
    x = 11.1, 
    y = 11, 
    label = "10%", 
    geom = "text", 
    color = "gray12", 
    family = "Bell MT"
  ) +
  annotate(
    x = 11.1, 
    y = 21, 
    label = "20%", 
    geom = "text", 
    color = "gray12", 
    family = "Bell MT"
  ) +
  annotate(
    x = 11.1, 
    y =31, 
    label = "30%", 
    geom = "text", 
    color = "gray12", 
    family = "Bell MT"
  ) 
   

Staff that exit typically have higher performance reviews coupled with lower satisfaction ratings, compared to those who stayed...

Typically, job satisfaction, performance reviews, and salary are important features that may point to the reasons why an employee decides to leave their current job. Comparing the distributions of these features for employees who left against those who stayed reveals important factors for understanding attrition.

Firstly, a trade-off exists between performance review scores and satisfaction ratings. That is, employees with higher performance scores also tend to have lower satisfaction ratings (see Figure 3). This pattern is even more pronounced for employees who have left, which suggests that as they work harder to attain higher performance reviews, they put themselves at a higher risk of burnout and leaving. The relationship also holds across departments.
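
One way to put a number on this trade-off is to compare the slope of satisfaction on review between leavers and stayers. This is a minimal sketch in base R, assuming both scores are on a 0 to 1 scale. The `toy` data frame is hypothetical and chosen only to show the mechanics, so its slopes say nothing about the real dataset.

```r
# Toy data (hypothetical values, NOT the workbook's dataset)
toy <- data.frame(
  review       = c(0.50, 0.60, 0.70, 0.80, 0.55, 0.65, 0.75, 0.85),
  satisfaction = c(0.70, 0.65, 0.60, 0.55, 0.80, 0.60, 0.40, 0.20),
  left         = rep(c("no", "yes"), each = 4)
)

# Slope of a linear fit of satisfaction on review, separately per group.
# A more negative slope means a stronger trade-off.
slope_by_group <- sapply(
  split(toy, toy$left),
  function(g) unname(coef(lm(satisfaction ~ review, data = g))["review"])
)

print(round(slope_by_group, 2))
```

Applied to the real data, the same call would be run on `df` instead of `toy`. A steeper negative slope for `left == "yes"` would correspond to the "more pronounced" pattern described above.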


# "#F67280","#F8B195"

 # Review vs satisfaction by department, colored by whether the employee left
 ggplot(df,
          aes(x = review, y = satisfaction, color = left)) +
          geom_point(alpha = 0.2) +
          scale_color_manual(values = c("#F67280","#F8B195")) +
          facet_wrap(~department) +
          # Linear fit per group; x, y and color are inherited from ggplot()
          geom_smooth(method = "lm", se = FALSE)
         

...As well as working longer days, and receiving relatively fewer promotions or bonuses...

Similarly, employees who left worked longer days than those who stayed, averaging more than 8 hours a day (assuming 23 working days in a month; see Figure 4). Despite working longer hours, those who left received proportionately fewer bonuses or promotions (22%) than those who stayed (25%).


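# Approximate daily hours (monthly hours / 23 working days) and flag staff
# who received a promotion or a bonus
df <- df %>% 
        mutate(avg_hrs_day = avg_hrs_month / 23,
               reward = ifelse(promoted == 1 | bonus == 1, "Promotion/Bonus", "No promotion/bonus"))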

ggplot(df,
       aes(avg_hrs_day, fill = left)) +
      geom_density(alpha = 0.4) +
      scale_fill_manual(values = c("#6C5B7B","#C06C84")) +
      xlab("Average working hours a day") 
      


ggplot(df,
       aes(left,  fill = reward)) +
       geom_bar(position = "dodge2", alpha = 0.8) +
      scale_fill_manual(values = wes_palette("GrandBudapest2")[3:4]) +
  ylab("Number of staff")



...As staff who left tend to trade off happiness for hard work to get higher performance reviews than their peers, with little benefit.

A clear connection arises between attaining higher performance review scores (a composite score > 68), not being promoted, and not receiving a bonus. Employees who experienced all three make up about 1,250 (45%) of the employees who left. The median review score for this group is 69%, compared to the average of 65% and to 63% for those who didn't leave.

While not enough in isolation, these factors in combination are likely driving the exit rates of staff across departments: staff who trade off happiness for hard work and higher performance reviews tend to receive fewer promotions and bonuses relative to their peers.
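
The combination described above (high review, no promotion, no bonus, among leavers) can be counted with a single logical flag. This is a minimal sketch in base R, assuming `review` is stored on a 0 to 1 scale so that the threshold is 0.68. The `toy` data frame and the `high_unrewarded` column are hypothetical and not part of the workbook.

```r
# Toy data (hypothetical values, NOT the workbook's dataset)
toy <- data.frame(
  review   = c(0.72, 0.70, 0.60, 0.75, 0.69, 0.55),
  promoted = c(0, 0, 0, 1, 0, 0),
  bonus    = c(0, 1, 0, 0, 0, 0),
  left     = c("yes", "yes", "yes", "no", "yes", "no")
)

# Flag the combination: high review, no promotion, no bonus
toy$high_unrewarded <- toy$review > 0.68 & toy$promoted == 0 & toy$bonus == 0

# Restrict to leavers and compute the size and share of the flagged group
leavers       <- toy[toy$left == "yes", ]
n_flagged     <- sum(leavers$high_unrewarded)
share_flagged <- n_flagged / nrow(leavers)

# Median review score for flagged leavers versus staff who stayed
median_flagged <- median(leavers$review[leavers$high_unrewarded])
median_stayed  <- median(toy$review[toy$left == "no"])

print(c(n_flagged = n_flagged, share_flagged = share_flagged))
print(c(median_flagged = median_flagged, median_stayed = median_stayed))
```

Run on `df` instead of `toy`, `n_flagged` and `share_flagged` would correspond to the count and percentage quoted in the text.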


ggplot(df, aes(x = review, fill = left)) +
  geom_histogram(alpha = 0.6) + 
  facet_wrap( ~ reward) +
  scale_fill_manual(values = wes_palette("GrandBudapest2")[2:3]) +
  ylab("Number of staff") 
    
    # median(subset(df, left == 'yes' & bonus == 0 & promoted == 0)$review)
    # mean(subset(df, left == 'yes' & bonus == 0 & promoted == 0)$review)
    # 
    # median(subset(df, left == 'no' & bonus == 1 & promoted == 1)$review)
    # mean(subset(df, left == 'no' & bonus == 1 & promoted == 1)$review)
## number promoted by salary bands
# 
# 
# df <- df %>% mutate(high_rev = ifelse(df$review > 0.68, 1, 0))
# 
# op10 <- summary_df(df, high_rev, bonus, left)
# 
# op10 <- op10 %>%
#           filter(left == 'yes', bonus == 0, high_rev == 1)


