Elena Morozova















Analyzing Crime in LA

🌇🚔 Background

Los Angeles, California 😎. The City of Angels. Tinseltown. The Entertainment Capital of the World! Known for its warm weather, palm trees, sprawling coastline, and Hollywood, along with producing some of the most iconic films and songs!

However, as with any highly populated city, it isn't always glamorous, and there can be a large volume of crime. That's where you can help!

You have been asked to support the Los Angeles Police Department (LAPD) by analyzing their crime data to identify patterns in criminal behavior. They plan to use your insights to allocate resources effectively to tackle various crimes in different areas.

You are free to use any methodologies that you like in order to produce your insights.

The Data

They have provided you with a single dataset to use. A summary and a preview are provided below.

The data is publicly available here.

👮‍♀️ crimes.csv

Column | Description
'DR_NO' | Division of Records Number: official file number made up of a 2-digit year, area ID, and 5 digits.
'Date Rptd' | Date reported - MM/DD/YYYY.
'DATE OCC' | Date of occurrence - MM/DD/YYYY.
'TIME OCC' | Time of occurrence in 24-hour military time.
'AREA' | The LAPD has 21 Community Police Stations referred to as Geographic Areas within the department. These Geographic Areas are sequentially numbered from 1-21.
'AREA NAME' | The 21 Geographic Areas or Patrol Divisions are also given a name designation that references a landmark or the surrounding community that it is responsible for. For example, 77th Street Division is located at the intersection of South Broadway and 77th Street, serving neighborhoods in South Los Angeles.
'Rpt Dist No' | A four-digit code that represents a sub-area within a Geographic Area. All crime records reference the "RD" that it occurred in for statistical comparisons. Find LAPD Reporting Districts on the LA City GeoHub at http://geohub.lacity.org/datasets/c4f83909b81d4786aa8ba8a74ab
'Crm Cd' | Crime code for the offence committed.
'Crm Cd Desc' | Definition of the crime.
'Vict Age' | Victim's age (years).
'Vict Sex' | Victim's sex: F: Female, M: Male, X: Unknown.
'Vict Descent' | Victim's descent:
  • A - Other Asian
  • B - Black
  • C - Chinese
  • D - Cambodian
  • F - Filipino
  • G - Guamanian
  • H - Hispanic/Latin/Mexican
  • I - American Indian/Alaskan Native
  • J - Japanese
  • K - Korean
  • L - Laotian
  • O - Other
  • P - Pacific Islander
  • S - Samoan
  • U - Hawaiian
  • V - Vietnamese
  • W - White
  • X - Unknown
  • Z - Asian Indian
'Premis Cd' | Code for the type of structure, vehicle, or location where the crime took place.
'Premis Desc' | Definition of the 'Premis Cd'.
'Weapon Used Cd' | The type of weapon used in the crime.
'Weapon Desc' | Description of the weapon used (if applicable).
'Status Desc' | Crime status.
'Crm Cd 1' | Indicates the crime committed. Crime Code 1 is the primary and most serious one. Crime Codes 2, 3, and 4 are respectively less serious offenses. Lower crime class numbers are more serious.
'Crm Cd 2' | May contain a code for an additional crime, less serious than Crime Code 1.
'Crm Cd 3' | May contain a code for an additional crime, less serious than Crime Code 1.
'Crm Cd 4' | May contain a code for an additional crime, less serious than Crime Code 1.
'LOCATION' | Street address of the crime.
'Cross Street' | Cross street of the rounded address.
'LAT' | Latitude of the crime location.
'LON' | Longitude of the crime location.
import pandas as pd
crimes = pd.read_csv("data/crimes.csv")
crimes.head()
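
The date and time columns described above usually need parsing before any time-based analysis. The following is a minimal sketch, assuming the dates really are plain MM/DD/YYYY strings as the column description states and that 'TIME OCC' is stored as an integer (so leading zeros are lost). The `sample` frame and the `hour` and `report_delay_days` columns are hypothetical names used only for illustration.

```python
import pandas as pd

# Tiny hypothetical sample mimicking three columns of crimes.csv.
sample = pd.DataFrame({
    "Date Rptd": ["01/08/2020", "03/15/2021"],
    "DATE OCC": ["01/08/2020", "03/14/2021"],
    "TIME OCC": [2230, 45],  # 24-hour HHMM; 45 means 00:45
})

# Parse the MM/DD/YYYY strings into datetimes (format is an assumption).
for col in ["Date Rptd", "DATE OCC"]:
    sample[col] = pd.to_datetime(sample[col], format="%m/%d/%Y")

# Zero-pad the time to four digits, then keep the hour part.
time_str = sample["TIME OCC"].astype(str).str.zfill(4)
sample["hour"] = time_str.str[:2].astype(int)

# Days between the occurrence and the report.
sample["report_delay_days"] = (sample["Date Rptd"] - sample["DATE OCC"]).dt.days

print(sample)
```

If the real file carries a time component in the date strings, the `format` argument would need to be adjusted accordingly.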

💪 The Challenge

  • Use your skills to produce insights about crimes in Los Angeles.
  • Examples could include examining how crime varies by area, crime type, victim age, time of day, and victim descent.
  • You could build machine learning models to predict criminal activities, such as when a crime may occur, what type of crime, or where, based on features in the dataset.
  • You may also wish to visualize the distribution of crimes on a map.
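
One way to examine crime by victim age, as suggested in the list above, is to bin ages into groups with `pd.cut`. This is only a sketch: the bin edges and every label except the 20-39 "young adult" range are assumptions made for illustration, and the `ages` series is made-up data.

```python
import pandas as pd

# Hypothetical victim ages standing in for the 'Vict Age' column.
ages = pd.Series([5, 17, 20, 34, 39, 40, 61, 85], name="Vict Age")

# Right-closed bins: (0, 12], (12, 19], (19, 39], (39, 59], (59, 120].
bins = [0, 12, 19, 39, 59, 120]
labels = ["child", "teen", "young adult", "middle-aged", "senior"]

# include_lowest=True makes the first bin include its left edge (age 0).
age_group = pd.cut(ages, bins=bins, labels=labels, include_lowest=True)

# Count victims per group, keeping the groups in age order.
counts = age_group.value_counts().sort_index()
print(counts)
```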

Note:

To ensure the best user experience, we currently discourage using Folium and Bokeh in Workspace notebooks.

✍️ Judging criteria

This competition is for helping to understand how competitions work. This competition will not be judged.

✅ Checklist before publishing

  • Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
  • Remove redundant cells like the judging criteria, so the workbook is focused on your work.
  • Check that all the cells run without error.

⌛️ Time is ticking. Good luck!

crimes.info()

Summary

Preprocessing:

  • Rename the columns to make them more understandable.
  • Split the id into year, area, and unique number and compare each part with the corresponding column. Delete rows with inconsistent dates.
  • Delete rows with an inconsistent 'age' column.
  • Delete rows with inconsistent geo data.
  • Correct the 'sex' labels so that there are 3 categories.

Grouping data:

  • descent
  • status data
  • weapon data
  • crimes data (type of crime)
  • premises data
  • age of victim

General Analysis:

  • A weapon was used in one third of the crimes. The most common weapon category was 'abuse'.
  • The most common crime types were Theft, Violence, and Damage.
  • Most crimes were committed in public areas, in the open air.
  • Most victims were young adults (20-39 years), male, and of Hispanic origin.

Areas (area number in parentheses):

  • Top 3 areas by number of crimes: Central (1), 77th Street (12), Pacific (14)
  • Top 3 areas by number of thefts: Pacific (14), Central (1), West LA (8)
  • Top 3 areas by number of violence crimes: 77th Street (12), Central (1), Southeast (18)
  • Top 3 areas by number of damage crimes: Central (1), 77th Street (12), Southwest (3)
  • Top 3 areas by number of violations: Harbor (5), Southwest (3), N Hollywood (15)
  • Top 3 areas by number of abuse crimes: 77th Street (12), Southeast (18), Southwest (3)
  • Top 3 areas by number of other crimes: 77th Street (12), Central (1), Southeast (18)

Crimes:

  • Top 3 crimes: VEHICLE - STOLEN, BATTERY - SIMPLE ASSAULT, THEFT OF IDENTITY
  • Top 3 crimes in the theft group: VEHICLE - STOLEN, THEFT OF IDENTITY, BURGLARY FROM VEHICLE
  • Top 3 crimes in the violence group: BATTERY - SIMPLE ASSAULT, ASSAULT WITH DEADLY WEAPON - AGGRAVATED ASSAULT, INTIMATE PARTNER - AGGRAVATED ASSAULT
  • Top 3 crimes in the damage group: VANDALISM - FELONY, VANDALISM - MISDEAMEANOR, TELEPHONE PROPERTY - DAMAGE
  • 19.38% of all crimes were closed: 7.3% of theft crimes, 43.88% of violence crimes, and 16.33% of damage crimes.
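
The id-splitting step from the preprocessing list can be sketched as follows. It assumes 'DR_NO' is a nine-digit number laid out as 2-digit year, 2-digit area ID, and 5-digit sequence number, which is one reading of the column description. Comparing the id year with the year of 'DATE OCC' is also an assumption; the check actually used in this notebook is not shown in this section. The `sample` frame and all `id_*` and `*_ok` names are hypothetical.

```python
import pandas as pd

# Hypothetical rows; the third one has an area that disagrees with its id.
sample = pd.DataFrame({
    "DR_NO": [200106753, 211404512, 221200031],
    "AREA": [1, 14, 3],
    "DATE OCC": ["01/08/2020", "06/20/2021", "02/11/2022"],
})

# Treat the id as a fixed-width, zero-padded string: YY AA NNNNN.
dr = sample["DR_NO"].astype(str).str.zfill(9)
sample["id_year"] = 2000 + dr.str[:2].astype(int)
sample["id_area"] = dr.str[2:4].astype(int)
sample["id_number"] = dr.str[4:].astype(int)

# Compare the parts with the corresponding columns.
occ_year = pd.to_datetime(sample["DATE OCC"], format="%m/%d/%Y").dt.year
sample["area_ok"] = sample["id_area"] == sample["AREA"]
sample["year_ok"] = sample["id_year"] == occ_year

# Keep only rows where both checks pass.
consistent = sample[sample["area_ok"] & sample["year_ok"]]
print(consistent[["DR_NO", "id_year", "id_area", "id_number"]])
```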

Report of area 14 PACIFIC in group THEFT.

  • Area 14 PACIFIC has 75 unique districts.
  • There are 15886 crimes in group THEFT, which ranks 1st among all areas in LA for this crime group.
  • The top 3 problem districts are 1494, 1488, and 1469.
  • Main crimes in district 1494: THEFT-GRAND; THEFT PLAIN - PETTY; VEHICLE - STOLEN. Peak year: 2022; peak month: January; peak weekday: Monday; peak hours: 12:00, 10:00, 13:00.
  • Main crimes in district 1488: EMBEZZLEMENT, GRAND THEFT; VEHICLE - STOLEN; THEFT FROM MOTOR VEHICLE - PETTY. Peak year: 2020; peak month: January; peak weekday: Monday; peak hours: 12:00, 10:00, 18:00.
  • Main crimes in district 1469: VEHICLE - STOLEN; SHOPLIFTING - PETTY THEFT ($950 & UNDER); BURGLARY. Peak year: 2022; peak month: May; peak weekday: Monday; peak hours: 12:00, 13:00, 18:00.
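
A report like the one above could be produced along these lines. This is a hedged sketch, not the notebook's actual function: `district_report` is a hypothetical helper, it assumes the renamed columns `area`, `district`, `crime_desc`, `date` (already parsed to datetime), and `time` (integer HHMM), and it leaves out the filtering by crime group because the grouping rules are not shown here.

```python
import pandas as pd

def district_report(df, area, top_n=3):
    """Rank districts in one area by crime count and find each one's peak periods."""
    in_area = df[df["area"] == area]
    top_districts = in_area["district"].value_counts().head(top_n).index.tolist()
    report = {}
    for d in top_districts:
        rows = in_area[in_area["district"] == d]
        report[d] = {
            "main_crimes": rows["crime_desc"].value_counts().head(top_n).index.tolist(),
            # mode() returns the most frequent value(s); take the first one.
            "peak_year": int(rows["date"].dt.year.mode().iloc[0]),
            "peak_month": rows["date"].dt.month_name().mode().iloc[0],
            "peak_weekday": rows["date"].dt.day_name().mode().iloc[0],
            # Integer-divide HHMM by 100 to get the hour of the day.
            "peak_hours": (rows["time"] // 100).value_counts().head(top_n).index.tolist(),
        }
    return report

# Made-up rows for illustration only.
demo = pd.DataFrame({
    "area": [14, 14, 14, 14, 1],
    "district": [1494, 1494, 1494, 1488, 162],
    "crime_desc": ["VEHICLE - STOLEN", "VEHICLE - STOLEN",
                   "THEFT PLAIN - PETTY", "BURGLARY", "VANDALISM"],
    "date": pd.to_datetime(["2022-01-03", "2022-01-10", "2022-02-15",
                            "2020-01-06", "2021-05-05"]),
    "time": [1200, 1230, 1000, 1800, 900],
})

report = district_report(demo, area=14)
print(report)
```

Ties are broken arbitrarily in this sketch, so on real data two equally busy hours or months may come back in either order.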

Report of area 12 77TH STREET in group VIOLENCE.

  • Area 12 77TH STREET has 39 unique districts.
  • There are 7709 crimes in group VIOLENCE, which ranks 1st among all areas in LA for this crime group.
  • The top 3 problem districts are 1269, 1241, and 1268.
  • Main crimes in district 1269: ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT; INTIMATE PARTNER - SIMPLE ASSAULT; BATTERY - SIMPLE ASSAULT. Peak year: 2022; peak month: January; peak weekday: Sunday; peak hours: 18:00, 20:00, 21:00.
  • Main crimes in district 1241: INTIMATE PARTNER - SIMPLE ASSAULT; ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT; BATTERY - SIMPLE ASSAULT. Peak year: 2021; peak month: April; peak weekday: Saturday; peak hours: 19:00, 22:00, 11:00.
  • Main crimes in district 1268: ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT; INTIMATE PARTNER - SIMPLE ASSAULT; BATTERY - SIMPLE ASSAULT. Peak year: 2020; peak month: June; peak weekday: Thursday; peak hours: 21:00, 23:00, 20:00.

Report of area 1 CENTRAL in group DAMAGE.

  • Area 1 CENTRAL has 52 unique districts.
  • There are 2553 crimes in group DAMAGE, which ranks 1st among all areas in LA for this crime group.
  • The top 3 problem districts are 162, 182, and 111.
  • District 162: Peak year: 2022; peak month: February; peak weekday: Sunday; peak hours: 19:00, 1:00, 17:00.
  • District 182: Peak year: 2022; peak month: April; peak weekday: Sunday; peak hours: 21:00, 22:00, 14:00.
  • District 111: Peak year: 2022; peak month: March; peak weekday: Thursday; peak hours: 18:00, 12:00, 19:00.

Further down you can find functions that analyze any area by crime type, premises, and weapon used, and visualize the results on a map.

EDA

First of all, I'll rename the columns to make them more understandable.

# Work on a copy so the original DataFrame stays untouched.
df = crimes.copy()

# Map the original column names to shorter snake_case names.
# Keys that are not present in the data are ignored by rename().
df.rename(columns={'DR_NO': 'id',
                   'Date Rptd': 'date_r',
                   'DATE OCC': 'date',
                   'TIME OCC': 'time',
                   'AREA': 'area',
                   'AREA NAME': 'area_name',
                   'Rpt Dist No': 'district',
                   'Crm Cd': 'crime_id',
                   'Crm Cd Desc': 'crime_desc',
                   'Vict Age': 'age',
                   'Vict Sex': 'sex',
                   'Vict Descent': 'descent',
                   'Premis Cd': 'premis_id',
                   'Premis Desc': 'premis_desc',
                   'Weapon Used Cd': 'weapon_id',
                   'Weapon Desc': 'weapon_desc',
                   'Status': 'status_id',
                   'Status Desc': 'status_desc',
                   'Crm Cd 1': 'crime1_id',
                   'Crm Cd 2': 'crime2_id',
                   'Crm Cd 3': 'crime3_id',
                   'Crm Cd 4': 'crime4_id',
                   'LOCATION': 'location',
                   'Cross Street': 'cross_street',
                   'LAT': 'latitude',
                   'LON': 'longtitude'},
          inplace=True)
df.info()
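
With the renamed columns in place, the row-level cleaning steps listed in the summary (age, geo data, and 'sex' labels) might look roughly like this. It is a sketch under stated assumptions: the valid age range of 1 to 120 and the treatment of (0, 0) coordinates as missing are guesses, not thresholds taken from this notebook, and `df_demo` is made-up data. The column name `longtitude` follows the spelling used in the rename above.

```python
import pandas as pd

# Hypothetical rows using the renamed columns.
df_demo = pd.DataFrame({
    "age": [34, -1, 0, 27, 130],
    "sex": ["M", "F", "H", None, "X"],
    "latitude": [34.05, 34.10, 0.0, 33.99, 34.20],
    "longtitude": [-118.25, -118.30, 0.0, -118.45, -118.40],
})

# Keep F and M; collapse every other label (and missing values) into X.
df_demo["sex"] = df_demo["sex"].where(df_demo["sex"].isin(["F", "M"]), "X")

# Assumed validity rules: plausible human age and non-zero coordinates.
valid_age = df_demo["age"].between(1, 120)
valid_geo = (df_demo["latitude"] != 0) & (df_demo["longtitude"] != 0)

clean = df_demo[valid_age & valid_geo]
print(clean)
```

Whether an age of 0 should be dropped or kept as "unknown" depends on how the source encodes missing ages, so that rule is worth checking against the data before applying it.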

Import libraries



