Collaboration is Key 🔑
💾 The data
Messages has information on the sender, receiver, and time.
- "sender" - represents the employee id of the employee sending the message.
- "receiver" - represents the employee id of the employee receiving the message.
- "timestamp" - the date of the message.
- "message_length" - the length in words of the message.
Employees has information on each employee.
- "id" - represents the employee id of the employee.
- "department" - is the department within the company.
- "location" - is the country where the employee lives.
- "age" - is the age of the employee.
Acknowledgments: Pietro Panzarasa, Tore Opsahl, and Kathleen M. Carley. "Patterns and dynamics of users' behavior and interaction: Network analysis of an online community." Journal of the American Society for Information Science and Technology 60.5 (2009): 911-932.
(I am using BigQuery to run my SQL queries)
Collaboration is a huge factor in the acheivements of a company. Many departments must work together to produce results. So, how can this multinatainal company improve their collaboration?
After evaluating all of the data, I beleive that the HR team should focus on three departments: IT, Engineering, and Marketing, to boost collaboration.
What department is sending and receiving the most messages?📤📥
SELECT e.department, COUNT(m.sender) as messages_sent, x.messages_received FROM collaboration.messages as m JOIN collaboration.employees as e ON m.sender = e.id JOIN (SELECT e.department,COUNT(m.receiver) as messages_received FROM collaboration.employees as e JOIN collaboration.messages as m ON e.id = m.receiver GROUP BY e.department ) as x ON x.department = e.department GROUP BY e.department, x.messages_received ORDER BY messages_sent DESC;
messages_sent_and_received <- readr::read_csv('messagessent_received.csv', show_col_types = FALSE) messages_sent_and_received
messages_sent_and_received <- readr::read_csv('messagessent_received.csv', show_col_types = FALSE) library(ggplot2) ggplot(messages_sent_and_received, aes(messages_sent, messages_received, color= department)) + geom_point(size=4) +coord_fixed(ratio = 1.25, xlim = NULL, ylim = NULL, expand = TRUE, clip = "on") + labs(y = "Messages Received", x = "Messages Sent") + ggtitle("Messages Sent and Received by Department")
messages_sent_and_received <- readr::read_csv('messagessent_received.csv', show_col_types = FALSE) library(ggplot2) library(reshape2) df.long<-melt(messages_sent_and_received) level_order <- c('Sales', 'Operations', 'Admin', 'IT', 'Engineering', 'Marketing') ggplot(df.long,aes(x= factor(department, level= level_order),value,fill=variable))+ geom_bar(stat="identity",position="dodge") + scale_fill_manual(values=c("royalblue2", "firebrick1")) +labs(y = "Messages Sent and Received", x = "Department") + ggtitle("Messages Sent and Received by Department")
Comparing messages sent and received by department we can see that Sales, Admin, and Operations are sending more messages than they receive while Engineering, IT, and Marketing all receive more messages than they send. 📊
What is the percentage of employees in each department compared to the percentage of messages sent from each department? 🤔
SELECT e.department, ROUND((COUNT(e.id)/664)*100,2) AS percent_employees_in_department, x.percent_messages_sent FROM collaboration.employees as e JOIN (SELECT e.department, ROUND((COUNT(m.sender)/3512)*100,2) as percent_messages_sent FROM collaboration.employees as e JOIN collaboration.messages as m ON e.id = m.sender GROUP BY e.department ORDER BY percent_messages_sent DESC) as x ON e.department = x.department GROUP BY e.department, x.percent_messages_sent ORDER BY percent_employees_in_department DESC;
percent_of_whole <- readr::read_csv('percent_of_whole.csv', show_col_types = FALSE) percent_of_whole
y <- c(24.25, 21.08,20.18, 15.05, 11.6, 7.83) label <- c("Sales", "Admin", "Operations", "Engineering", "IT", "Marketing" ) colrs <- c("yellow2","firebrick2","lawngreen", "deepskyblue1","chocolate1","darkorchid1") pie(y, label, main = "Percentage of Employees in Department", col = colrs) x <- c(44.16, 24.4,28.84, .074, 1.4, 0.46) labels <- c("Sales", "Admin", "Operations", "Engineering", "IT", "Marketing" ) pie(x, labels, main = "Percentage of Messages Sent", col = colrs)
From this we can see that the three departments that make up roughly 65% of the company are sending nearly 98% of all messages.
Meaning that the three departments making up 35% of the company are sending little more than 2% of messages. This shows that Engineering, IT, and Marketing, would benefit the most from HR intervention to boost collaboration. 💡
Which employees have the most connections. 🧵
SELECT DISTINCT m.sender, COUNT(DISTINCT m.receiver) AS messages_received, e.department FROM collaboration.messages as m JOIN collaboration.employees as e ON m.sender = e.id LEFT JOIN collaboration.employees as receiver_dep ON m.receiver = receiver_dep.id WHERE receiver_dep.department <> e.department GROUP BY m.sender, e.department ORDER BY messages_received DESC;