San Diego Crime Incidents with Demographic Descriptions

Crime incidents in San Diego, from 2016 through July 2020 inclusive, with UCR codes for the crime and the age, race and sex of the victim and suspect.

sandiegodata.org-crime_victims-1.1.1. Modified 2020-09-29T18:47:48

Resources

  • sdcrime_16_20. San Diego crime suspects and victims, 2016 to 2020
  • ucrcodes. UCR codes and detailed descriptions

Documentation

This dataset describes crime incidents from 2016 to 2020, with demographic information for both victims and suspects. The file has multiple rows per incident, one for each suspect or victim. The primary key pk links these rows together into a single crime incident. The dataset is derived from data acquired through a Public Records Act (PRA) request and is processed to standardize geographic identifiers and racial categories.
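Because each incident spans several rows, per-incident analysis usually starts with a group-by. The following is a minimal sketch, assuming that pk is shared by all rows of one incident as described above, and that personrole holds values such as VICTIM and SUSPECT. The sample rows are invented for illustration and the exact spelling of the role values in the real file may differ.

```python
import pandas as pd

# Hypothetical sample rows (not taken from the dataset).
rows = pd.DataFrame({
    "pk": [1, 1, 1, 2, 2],
    "personrole": ["VICTIM", "SUSPECT", "SUSPECT", "VICTIM", "VICTIM"],
})

# Count people per incident and role: one row per pk, one column per role.
# Roles that do not occur in an incident are filled with 0.
people_per_incident = (
    rows.groupby(["pk", "personrole"]).size().unstack(fill_value=0)
)
print(people_per_incident)
```

The same pattern should work on the full sdcrime_16_20 resource once it is loaded into a dataframe.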

Refer to the source dataset for the original data and the PRA request used to acquire it.

Processing

The data presented here are a processed version of the file received from ARJIS through a Public Records Act request. The processing includes:

  • Converting the tract identifier to a formal ACS format tract geoid
  • Converting the block identifier to a formal ACS format block geoid
  • Adding the position of the centroid of the tracts, in WKT format
  • Adding the Census internal point location, for the block, in WGS 84 latitude and longitude.
  • Recoding the race field to the Census race / ethnicity scheme.
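To illustrate the geoid conversion, here is a minimal sketch of building an ACS-style tract geoid. It is not the code used to produce this package. It assumes that the raw tract identifier is the six-digit census tract code stored as an integer with leading zeros dropped, which is an assumption about the source file, and it uses the FIPS codes for California (06) and San Diego County (073). The function name tract_geoid is hypothetical.

```python
def tract_geoid(tract_code: int, state_fips: str = "06", county_fips: str = "073") -> str:
    """Build an ACS-style tract geoid such as '14000US06073000100'.

    Assumes tract_code is the 6-digit tract code stored as an integer
    (leading zeros dropped). This is an assumption about the raw data.
    """
    # '140' is the Census summary level for tracts; '00' is the geographic component.
    return f"14000US{state_fips}{county_fips}{int(tract_code):06d}"


print(tract_geoid(100))
```

Keeping geoids as strings rather than integers preserves the prefix and the leading zeros in the state code.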

Additional processing performed on the upstream data, which came directly from ARJIS, includes:

  • Created “year” field
  • Deleted MACRStatus from years 2017-2020
  • Combined years into 1 file
  • Deleted partial August cases to have complete month
  • Deleted 2 ARJIS and 1 DA as AGENCY records
  • Deleted incident type (all were crime case), highcharge (all were 1) and role (all were incident)
  • ALLYRS_NOSUSP includes only victims, victim/witnesses and blank (property?) in the person role
  • UNIQUECASE includes unique case numbers (no matter how many victims)

Race recode

The race field of the original data includes many names of regions, countries or ethnicities. The census_race field is a recode of the race field to the race/ethnicity scheme used by the Census. The codes used are:

  • nhwhite: Non Hispanic White
  • hispanic: Hispanic, of any race
  • black: Black or African-American
  • asian: Asian
  • nhopi: Native Hawaiian or Pacific Islander.

This file does not include any records that would be classified under the remaining Census race codes, such as American Indian or Alaska Native. These are the translations from the values in the race field to those of the census_race field:

  • OTHER: other
  • none: unknown
  • WHITE: nhwhite
  • HISPANIC: hisp
  • BLACK: black
  • MIDDLE EASTERN: white
  • PACIFIC ISLANDER: nhopi
  • CHINESE: asian
  • JAPANESE: asian
  • OTHER ASIAN: asian
  • FILIPINO: asian
  • ASIAN INDIAN: asian
  • GUAMANIAN: nhopi
  • VIETNAMESE: asian
  • HAWAIIAN: nhopi
  • INDIAN: asian
  • CAMBODIAN: asian
  • KOREAN: asian
  • SAMOAN: nhopi
  • LAOTIAN: asian
  • EAST AFRICAN: black

For the 2020 census, Filipinos may be classified as Pacific Islanders, rather than Asian, as they had been in previous years. Because this data was collected before this transition, Filipinos are classified as Asians.
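The translation table above can be expressed as a dictionary and applied with pandas. This is a sketch of one way to reproduce the recode, not necessarily how the package was built. The dictionary values are copied from the list above; the names RACE_TO_CENSUS and recode_race are hypothetical.

```python
import pandas as pd

# Mapping copied from the translation list above.
RACE_TO_CENSUS = {
    "OTHER": "other",
    "WHITE": "nhwhite",
    "HISPANIC": "hisp",
    "BLACK": "black",
    "MIDDLE EASTERN": "white",
    "PACIFIC ISLANDER": "nhopi",
    "CHINESE": "asian",
    "JAPANESE": "asian",
    "OTHER ASIAN": "asian",
    "FILIPINO": "asian",
    "ASIAN INDIAN": "asian",
    "GUAMANIAN": "nhopi",
    "VIETNAMESE": "asian",
    "HAWAIIAN": "nhopi",
    "INDIAN": "asian",
    "CAMBODIAN": "asian",
    "KOREAN": "asian",
    "SAMOAN": "nhopi",
    "LAOTIAN": "asian",
    "EAST AFRICAN": "black",
}


def recode_race(race: pd.Series) -> pd.Series:
    # Missing values (listed as "none" above) become "unknown".
    # Note: any value absent from the mapping also becomes "unknown".
    return race.map(RACE_TO_CENSUS).fillna("unknown")


print(recode_race(pd.Series(["FILIPINO", None, "SAMOAN"])).tolist())
```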

Contacts

Data Dictionary

ucrcodes

Column Name | Data Type | Description
index | integer |
ucr_code | string |
description | string |

sdcrime_16_20

Column Name | Data Type | Description
pk | integer | Pk – this is an auto number generated by our SQL database when a new unique record comes into ARJIS. Note: Use this as your unique identifier.
activitynumber | string | activityNumber – the case number for each activity/incident/crime case (these all mean the same thing), this can be used as a unique identifier when combined with the agency name
activitydate | datetime | ActivityDate – The date/time the crime case occurred
year | integer | Year of source file
agency | string | Agency – the agency reporting the crime case
violationsection | string | ViolationSection – the highest charge number recorded on the arrest (i.e. most serious charge connected to the arrest, arrests may have up to ten charges)
violationtype | string | ViolationType – the municipal, penal or other code section which is attached to the highest recorded Violation section on the arrest
chargedescription | string | ChargeDescription – A brief text definition of the type of crime type noted in the violation section
chargelevel | string | ChargeLevel – whether the highest charge is a felony, misdemeanor or infraction (there should only be felonies and misdemeanors in this file)
codeucr | string | CodeUCR – a more specific code to define the exact crime type per Uniform Crime Reporting Standards, see defining table on last tab of excel. If you really want to get specific with crime categories this is your tool!
crimecategory | string | CrimeCategory – This is the category of the offense type based on the UCR program definitions and corresponds to the categories used on the Crime Statistics portal, with the exception that additional crime categories are included such as arson, simple assault, Part II crimes, and recovered vehicles.
personrole | string | PersonRole – the role of the person on this particular record (victim, suspect, victim/witness or blank). Please note that the same crime case may have multiple suspects or victims and some crime cases may not have any suspect or victim information listed depending on what the agency provided to ARJIS.
race | string | Race – the ethnicity of the person listed in the person role field which can include numerous fields. Note: ARJIS agencies may apply race codes differently across the region.
age | integer | Age – the age of the person listed in the person role field. Note: some cases may have no age listed which is a data entry error or unknown by the agency
sex | string | Sex – the gender of the person listed in the person role field which can include Male, Female, Non-Binary or Unknown. Note: some cases may have no sex listed which is a data entry error or unknown by the agency
zipcode | integer | ZipCode – The zip code where the incident/crime case occurred
censusblock | integer | Census Block/Tract – the census block/tract where the incident/crime case occurred
censustract | integer | Census Block/Tract – the census block/tract where the incident/crime case occurred
city | string | City – The city where the incident/crime case occurred
census_race | string | Race and ethnicity, recoded to the Census scheme
tract_geoid | string | Tract id, in ACS Geoid format
block_geoid | string | Block ID, in ACS Geoid format
intptlat | number | Census block internal point, latitude
intptlon | number | Census block internal point, longitude
geometry | string | Centroid of census block
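The two resources can be combined to attach a detailed description to each crime record. The sketch below assumes that values in the codeucr column of sdcrime_16_20 match values in the ucr_code column of ucrcodes; this correspondence is suggested by the column names but is an assumption here. The codes and descriptions in the sample frames are made up.

```python
import pandas as pd

# Hypothetical sample data standing in for the two resources.
crimes = pd.DataFrame({"pk": [1, 2, 3], "codeucr": ["A1", "B2", "ZZ"]})
ucr = pd.DataFrame({
    "ucr_code": ["A1", "B2"],
    "description": ["Example offense A", "Example offense B"],
})

# A left join keeps every crime row, even when its code has no description.
labeled = crimes.merge(ucr, how="left", left_on="codeucr", right_on="ucr_code")
print(labeled[["pk", "codeucr", "description"]])
```

A left join is used so that records with unmatched codes are kept and show up with a missing description, which makes gaps in the lookup table easy to spot.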

References

URLs used in the creation of this data package.

  • data/census_blocks.csv. Census 2010 blocks, converted to ACS geoids, with centroid position
  • op_sd_crime_xls. Response from PRA request
  • op_sd_crime_csv. Conversion of main tab of response data to CSV

Packages

Accessing Data in Vanilla Pandas

import pandas as pd

# Load each resource directly from its CSV URL
sdcrime_16_20_df = pd.read_csv('http://library.metatab.org/sandiegodata.org-crime_victims-1.1.1/data/sdcrime_16_20.csv')
ucrcodes_df = pd.read_csv('http://library.metatab.org/sandiegodata.org-crime_victims-1.1.1/data/ucrcodes.csv')
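When reading the CSV with plain pandas, it may help to parse activitydate as a datetime and to keep the geoid columns as strings. The following sketch uses a small in-memory CSV so that it runs without network access. The sample row and its date format are invented and may not match the real file; the same read_csv arguments could be passed when reading from the package URL.

```python
import io

import pandas as pd

# Hypothetical one-row sample using a few column names from the data dictionary.
sample_csv = io.StringIO(
    "pk,activitydate,zipcode,tract_geoid\n"
    "1,2016-01-01 08:30:00,92101,14000US06073000100\n"
)

df = pd.read_csv(
    sample_csv,
    parse_dates=["activitydate"],   # parse timestamps instead of leaving strings
    dtype={"tract_geoid": str},     # keep geoids as text
)
print(df.dtypes)
```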

Accessing Package in Metapack

import metapack as mp
pkg = mp.open_package('http://library.metatab.org/sandiegodata.org-crime_victims-1.1.1.csv')

# Create Dataframes
sdcrime_16_20_gdf = pkg.resource('sdcrime_16_20').geoframe()
ucrcodes_df = pkg.resource('ucrcodes').dataframe()