Employee Network Analysis with NetworkX

    How can the company improve collaboration?

    📖 Background

    You work in the analytics department of a multinational company, and the head of HR wants your help mapping out the company's employee network using message data.

    They plan to use the network map to understand interdepartmental dynamics better and explore how the company shares information. The ultimate goal of this project is to think of ways to improve collaboration throughout the company.

    💾 The data

    The company has six months of information on inter-employee communication. For privacy reasons, only the sender, receiver, timestamp, and length of each message are available.

    The messages table has information on the sender, receiver, time, and length of each message:
    • "sender" - represents the employee id of the employee sending the message.
    • "receiver" - represents the employee id of the employee receiving the message.
    • "timestamp" - the date of the message.
    • "message_length" - the length in words of the message.
    The employees table has information on each employee:
    • "id" - represents the employee id of the employee.
    • "department" - is the department within the company.
    • "location" - is the country where the employee lives.
    • "age" - is the age of the employee.

    Acknowledgments: Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.

    💪 Competition challenge

    Create a report that covers the following:

    1. Which departments are the most/least active?
    2. Which employee has the most connections?
    3. Identify the most influential departments and employees.
    4. Using the network analysis, in which departments would you recommend the HR team focus to boost collaboration?

    📚 Import libraries and dataframes

    # Load standard libraries
    import pandas as pd
    import numpy as np
    
    # Load visualization library
    import matplotlib.pyplot as plt
    import matplotlib.patches as mpatches
    
    # Load network library
    import networkx as nx
    
    # Load DataFrames
    employees = pd.read_csv('data/employees.csv')
    messages = pd.read_csv('data/messages.csv', parse_dates=['timestamp'])
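
Once the two tables are loaded, the message log can be turned into a directed graph in which each employee is a node and each sender-to-receiver pair is an edge. The block below is a minimal sketch of that idea, not the report's own method: it assumes the column names described above, and `toy_messages`, `n_messages`, and `total_words` are hypothetical names used with made-up rows so the example runs without the CSV files.

```python
import pandas as pd
import networkx as nx

# Hypothetical toy rows standing in for data/messages.csv (not real data)
toy_messages = pd.DataFrame({
    "sender": [1, 1, 2, 3],
    "receiver": [2, 2, 3, 1],
    "message_length": [10, 20, 5, 7],
})

# Collapse repeated sender -> receiver pairs into a single weighted edge
edges = (
    toy_messages
    .groupby(["sender", "receiver"], as_index=False)
    .agg(
        n_messages=("message_length", "size"),   # how many messages were sent
        total_words=("message_length", "sum"),   # how many words in total
    )
)

# Build a directed graph; the aggregated columns become edge attributes
G = nx.from_pandas_edgelist(
    edges,
    source="sender",
    target="receiver",
    edge_attr=["n_messages", "total_words"],
    create_using=nx.DiGraph,
)

print(G.number_of_nodes(), G.number_of_edges())
```

Aggregating before building the graph keeps one edge per pair, which is what a plain `DiGraph` supports; a `MultiDiGraph` would be the alternative if every individual message needed to stay a separate edge.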

    👨‍🔬 Exploratory analysis

    employees dataframe

    employees.tail()

    The dataframe has 664 rows and 4 columns.

    employees.shape

    There are no null values in the dataframe.

    employees.isnull().sum()

    • There are 6 departments in the company's employee network.
      • The sales department has the most employees, with 161.
      • The marketing department has the fewest employees, with only 52.

    employees.value_counts(subset='department')
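
The largest and smallest departments can also be read off the counts programmatically rather than by eye. This is a small sketch under stated assumptions: `toy_employees` is a hypothetical stand-in with invented rows, so its numbers do not match the real dataframe.

```python
import pandas as pd

# Hypothetical toy rows standing in for data/employees.csv (not real data)
toy_employees = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "department": ["Sales", "Sales", "Sales", "IT", "IT", "Marketing"],
})

# Count employees per department, sorted from largest to smallest
dept_counts = toy_employees["department"].value_counts()

n_departments = dept_counts.size   # number of distinct departments
largest = dept_counts.idxmax()     # department with the most employees
smallest = dept_counts.idxmin()    # department with the fewest employees

print(n_departments, largest, smallest)
```

Running the same three lines on the real `employees` dataframe would give the department totals summarized above.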