Crime incidents in San Diego, from 2016 through July 2020 inclusive, with UCR codes for the crime and the age, race and sex of the victim and suspect.
sandiegodata.org-crime_victims-1.1.3. Modified 2020-11-22T23:31:33
Resources
- sdcrime_16_20. San Diego crime suspects and victims, 2016 to 2020
- ucrcodes. UCR codes and detailed descriptions
Documentation
This dataset describes crime incidents from 2016 to 2020, with demographic
information for both the victims and suspects. The file has multiple rows per
incident, one for each suspect or victim. The primary key pk links records
together into a single crime incident. The dataset is derived from data acquired for a
PRA request and is processed to standardize geographic identifiers and racial categories.
Refer to the source dataset for the original data and the PRA request used to acquire it.
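Because each row describes one person rather than one incident, incident-level counts require grouping rows first. The following is a minimal sketch using only the standard library. The sample rows and their values are invented for illustration, and grouping on the pair (agency, activitynumber) is an assumption based on the data dictionary's note that the case number is unique when combined with the agency name; verify the key against the actual file before relying on it.

```python
from collections import defaultdict

# Invented sample rows; real rows come from the sdcrime_16_20 resource.
rows = [
    {"agency": "SAN DIEGO", "activitynumber": "16000001", "personrole": "VICTIM"},
    {"agency": "SAN DIEGO", "activitynumber": "16000001", "personrole": "SUSPECT"},
    {"agency": "SAN DIEGO", "activitynumber": "16000001", "personrole": "SUSPECT"},
    {"agency": "CHULA VISTA", "activitynumber": "16000001", "personrole": "VICTIM"},
]

def count_roles_by_incident(rows):
    """Return {(agency, activitynumber): {personrole: count}}."""
    incidents = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # Assumed incident key: case number qualified by reporting agency.
        key = (row["agency"], row["activitynumber"])
        incidents[key][row["personrole"]] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(roles) for key, roles in incidents.items()}

incident_roles = count_roles_by_incident(rows)
```

Note that the same case number under two agencies is treated as two separate incidents, which is why the agency is part of the key.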
Processing
The data presented here are a processed version of the file received from ARJIS through a Public Records Act request. The processing includes:
- Converting the tract identifier to a formal ACS format tract geoid
- Converting the block identifier to a formal ACS format block geoid
- Adding the position of the centroid of the tracts, in WKT format
- Adding the Census internal point location, for the block, in WGS 84 latitude and longitude.
- Recoding the race field to the Census race / ethnicity scheme.
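The package's actual processing code is not shown here, so the following is only a sketch of what the tract geoid and WKT steps might look like. It assumes the raw tract value is the 6-digit census tract code stored as an integer, and that all records fall in San Diego County (state FIPS 06, county FIPS 073). The function names are hypothetical.

```python
STATE_FIPS = "06"    # California
COUNTY_FIPS = "073"  # San Diego County

def tract_geoid(tract_code: int) -> str:
    """Build an ACS-style tract geoid from a numeric tract code.

    Assumes tract_code is the 6-digit tract code stored as an integer,
    so tract 1.00 is 100. The '14000US' prefix is summary level 140
    (census tract) followed by geographic component 00.
    """
    return f"14000US{STATE_FIPS}{COUNTY_FIPS}{tract_code:06d}"

def wkt_point(lon: float, lat: float) -> str:
    """Format a longitude/latitude pair as a WKT point (x = lon, y = lat)."""
    return f"POINT ({lon} {lat})"

example_geoid = tract_geoid(100)
example_point = wkt_point(-117.16, 32.71)
```

WKT places the x coordinate (longitude) first, which is the reverse of the usual spoken "latitude, longitude" order.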
Additional processing performed on the upstream data, which came directly from ARJIS, includes:
- Created “year” field
- Deleted MACRStatus from years 2017-2020
- Combined years into 1 file
- Deleted partial August cases to have complete month
- Deleted 2 ARJIS and 1 DA as AGENCY records
- Deleted incident type (all were crime case), highcharge (all were 1) and role (all were incident)
- ALLYRS_NOSUSP includes only victims, victim/witnesses and blank (property?) in the person role
- UNIQUECASE includes unique case numbers (no matter how many victims)
Race recode
The race field of the original data includes many names of regions,
countries or ethnicities. The census_race field is a recode of the
race field to use the race/ethnicity scheme used by the Census. The codes
used are:
- nhwhite: Non Hispanic White
- hispanic: Hispanic, of any race
- black: Black or African-American
- asian: Asian
- nhopi: Native Hawaiian or Pacific Islander.
This file does not include any records that would be classified as the
remaining census race codes, such as American Indian or Alaskan Native. These
are the translations from the values in the race field to those of the
census_race field:
- OTHER: other
- none: unknown
- WHITE: nhwhite
- HISPANIC: hisp
- BLACK: black
- MIDDLE EASTERN: white
- PACIFIC ISLANDER: nhopi
- CHINESE: asian
- JAPANESE: asian
- OTHER ASIAN: asian
- FILIPINO: asian
- ASIAN INDIAN: asian
- GUAMANIAN: nhopi
- VIETNAMESE: asian
- HAWAIIAN: nhopi
- INDIAN: asian
- CAMBODIAN: asian
- KOREAN: asian
- SAMOAN: nhopi
- LAOTIAN: asian
- EAST AFRICAN: black
For the 2020 census, Filipinos may be classified as Pacific Islanders, rather than Asian, as they had been in previous years. Because this data was collected before this transition, Filipinos are classified as Asians.
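The translation table can be expressed as a lookup, which is useful for checking the recode or applying it to other ARJIS extracts. This is a sketch rather than the package's own code: the target values are copied verbatim from the table (including hisp and white, which differ from the spellings in the code list), the function name is hypothetical, and sending unlisted values to unknown is an assumption of this example.

```python
# Translation table copied from the documentation; values kept as documented.
RACE_TO_CENSUS = {
    "OTHER": "other",
    "WHITE": "nhwhite",
    "HISPANIC": "hisp",
    "BLACK": "black",
    "MIDDLE EASTERN": "white",
    "PACIFIC ISLANDER": "nhopi",
    "CHINESE": "asian",
    "JAPANESE": "asian",
    "OTHER ASIAN": "asian",
    "FILIPINO": "asian",
    "ASIAN INDIAN": "asian",
    "GUAMANIAN": "nhopi",
    "VIETNAMESE": "asian",
    "HAWAIIAN": "nhopi",
    "INDIAN": "asian",
    "CAMBODIAN": "asian",
    "KOREAN": "asian",
    "SAMOAN": "nhopi",
    "LAOTIAN": "asian",
    "EAST AFRICAN": "black",
}

def recode_race(race):
    """Map a raw race value to its census_race code.

    Missing values (None or blank) map to 'unknown', following the
    'none: unknown' entry. Unlisted values also fall back to 'unknown',
    which is an assumption of this sketch.
    """
    if race is None or not race.strip():
        return "unknown"
    return RACE_TO_CENSUS.get(race.strip().upper(), "unknown")
```

For example, `recode_race("FILIPINO")` returns `asian`, consistent with the note above about the pre-2020 classification.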
Documentation Links
Contacts
- Wrangler
Data Dictionary
ucrcodes
Column Name | Data Type | Description |
---|---|---|
index | integer | |
ucr_code | string | |
description | string | |
sdcrime_16_20
Column Name | Data Type | Description |
---|---|---|
pk | integer | Pk – this is an auto number generated by our SQL database when a new unique record comes into ARJIS. Note: Use this as your unique identifier. |
activitynumber | string | activityNumber – the case number for each activity/incident/crime case (these all mean the same thing), this can be used as a unique identifier when combined with the agency name |
activitydate | datetime | ActivityDate – The date/time the crime case occurred |
year | integer | Year of source file |
agency | string | Agency – the agency reporting the crime case |
violationsection | string | ViolationSection – the highest charge number recorded on the arrest (i.e. the most serious charge connected to the arrest; arrests may have up to ten charges) |
violationtype | string | ViolationType – the municipal, penal or other code section which is attached to the highest recorded Violation section on the arrest |
chargedescription | string | ChargeDescription – A brief text definition of the type of crime type noted in the violation section |
chargelevel | string | ChargeLevel – whether the highest charge is a felony, misdemeanor or infraction (there should only be felonies and misdemeanors in this file) |
codeucr | string | CodeUCR – a more specific code to define the exact crime type per Uniform Crime Reporting Standards, see defining table on last tab of excel. If you really want to get specific with crime categories this is your tool! |
crimecategory | string | CrimeCategory – This is the category of the offense type based on the UCR program definitions and corresponds to the categories used on the Crime Statistics portal, with the exception that additional crime categories are included such as arson, simple assault, Part II crimes, and recovered vehicles. |
personrole | string | PersonRole – the role of the person on this particular record (victim, suspect, victim/witness or blank). Note that the same crime case may have multiple suspects or victims, and some crime cases may not have any suspect or victim information listed, depending on what the agency provided to ARJIS. |
race | string | Race – the ethnicity of the person listed in the person role field which can include numerous fields. Note: ARJIS agencies may apply race codes differently across the region. |
age | integer | Age – the age of the person listed in the person role field. Note: some cases may have no age listed, either because of a data entry error or because the age is unknown to the agency |
sex | string | Sex – the gender of the person listed in the person role field, which can include Male, Female, Non-Binary or Unknown. Note: some cases may have no sex listed, either because of a data entry error or because the sex is unknown to the agency |
zipcode | integer | ZipCode – The zip code where the incident/crime case occurred |
censusblock | integer | Census Block/Tract – the census block/tract where the incident/crime case occurred |
censustract | integer | Census Block/Tract – the census block/tract where the incident/crime case occurred |
city | string | City – The city where the incident/crime case occurred |
census_race | string | Race and ethnicity, recoded to the Census scheme |
tract_geoid | string | Tract id, in ACS Geoid format |
block_geoid | string | Block ID, in ACS Geoid format |
intptlat | number | Census block internal point, latitude |
intptlon | number | Census block internal point, longitude |
geometry | string | Centroid of census block |
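When the file is read as plain CSV, the geometry column arrives as a string. The sketch below parses such a value into a (longitude, latitude) pair using only the standard library. It assumes the column holds a simple WKT point of the form `POINT (lon lat)` with plain decimal numbers; the function name is hypothetical, and a geometry library would be a better choice for anything beyond points.

```python
import re

# Matches 'POINT (x y)' with plain decimal coordinates (no exponent notation).
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)

def parse_wkt_point(text):
    """Return (lon, lat) as floats, or None if text is not a simple WKT point."""
    match = _POINT_RE.match(text or "")
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

parsed = parse_wkt_point("POINT (-117.16 32.71)")
```

Returning None for blank or unexpected values lets callers skip records without a usable location instead of raising an error.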
References
URLs used in the creation of this data package.
- data/census_blocks.csv. Census 2010 blocks, converted to ACS geoids, with centroid position
- op_sd_crime_xls. Response from PRA request
- op_sd_crime_csv. Conversion of main tab of response data to CSV
Packages
- s3 s3://library.metatab.org/sandiegodata.org-crime_victims-1.1.3.csv
- csv http://library.metatab.org/sandiegodata.org-crime_victims-1.1.3.csv
- source https://github.com/metatab-packages/sandiegodata.org-crime_victims.git
Accessing Data in Vanilla Pandas
```python
import pandas as pd

sdcrime_16_20_df = pd.read_csv('http://library.metatab.org/sandiegodata.org-crime_victims-1.1.3/data/sdcrime_16_20.csv')
ucrcodes_df = pd.read_csv('http://library.metatab.org/sandiegodata.org-crime_victims-1.1.3/data/ucrcodes.csv')
```
Accessing Package in Metapack
```python
import metapack as mp

pkg = mp.open_package('http://library.metatab.org/sandiegodata.org-crime_victims-1.1.3.csv')

# Create Dataframes
sdcrime_16_20_gdf = pkg.resource('sdcrime_16_20').geoframe()
ucrcodes_df = pkg.resource('ucrcodes').dataframe()
```
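Once both resources are loaded, the UCR descriptions can be attached to the crime records. The sketch below uses tiny stand-in frames so it runs without a download; the codes and descriptions in it are invented placeholders, and it assumes that codeucr in sdcrime_16_20 holds the same values as ucr_code in ucrcodes, which should be checked against the real data.

```python
import pandas as pd

# Stand-ins for the two resources; codes and descriptions are invented placeholders.
sdcrime_16_20_df = pd.DataFrame({
    "pk": [1, 2, 3],
    "codeucr": ["X1", "X2", "X9"],
})
ucrcodes_df = pd.DataFrame({
    "ucr_code": ["X1", "X2"],
    "description": ["placeholder description one", "placeholder description two"],
})

# Left join keeps every crime record, even when its code has no match in ucrcodes.
merged = sdcrime_16_20_df.merge(
    ucrcodes_df[["ucr_code", "description"]],
    left_on="codeucr",
    right_on="ucr_code",
    how="left",
)

# Count records whose code did not match, as a quick sanity check on the join key.
unmatched = int(merged["description"].isna().sum())
```

With the real files, passing `dtype={"codeucr": str}` and `dtype={"ucr_code": str}` to the respective `pd.read_csv` calls helps avoid mismatches if any codes look numeric.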