San Diego Downtown Homeless Computer Vision Package

Files and code for analyzing San Diego downtown homelessness data with computer vision

sandiegodata.org-downtown_cv-5

Resources | Packages | Documentation | Contacts | Data Dictionary

Resources

gcp. Ground control points
intersection_regions. Polygon transformations for the intersections of each map
intersections. List of intersections.
file_annotations. File annotations on count files
counts. Annotation position, types and counts of handwritten marks

Documentation

This dataset collects records from the conversion of five years of paper maps that record the positions of homeless sleepers in downtown San Diego. The San Diego Regional Data Library is converting these paper maps to digital form with a manual process that uses an image annotation tool. These annotations can also be used to train computer vision algorithms to georeference maps and recognize handwritten marks.

These datasets link to map image URLs and include three kinds of annotations:

Ground Control Points, which identify the map image locations for known intersections, linking image coordinates (in pixels) to geographic coordinates.
Image locations of handwritten marks and the number written in the mark.
File annotations, for other handwritten notes such as the temperature and presence of rain.

More Information:

Blog Post. For more discussion about the GCP and handwritten marks, and the tasks involved in developing computer vision algorithms for these data, see our recent blog post on the subject.
Clustering Notebook. For examples of using OpenCV to extract and match templates in order to georeference maps, see the Templates and Clustering Jupyter Notebook.
Extract Marks Notebook. For examples of extracting (but not recognizing) handwritten marks, see this notebook.

Developer notes

After annotation JSON files are copied into S3, the list of S3 URLs must be updated. To refresh the list of URLs, run:

$  bin/update_s3.sh <s3-profile>

Contacts

Wrangler
Eric Busboom, Civic Knowledge

Packages

zip http://library.metatab.org/sandiegodata.org-downtown_cv-5.zip
s3 s3://library.metatab.org/sandiegodata.org-downtown_cv-5.csv
csv http://library.metatab.org/sandiegodata.org-downtown_cv-5.csv
source https://github.com/sandiegodata-projects/homelessness.git

Accessing Packages in Metapack

import metapack as mp
pkg = mp.open_package('http://library.metatab.org/sandiegodata.org-downtown_cv-5.zip')

# Create Dataframes

gcp_df = pkg.resource('gcp').dataframe()
intersection_regions_df = pkg.resource('intersection_regions').dataframe()
intersections_gdf = pkg.resource('intersections').geoframe()

file_annotations_df = pkg.resource('file_annotations').dataframe()
counts_df = pkg.resource('counts').dataframe()

Data Dictionary

gcp | intersections | intersection_regions | file_annotations | counts

gcp

Column Name	Data Type	Description
image_url	string	Map image url
x	integer	X position of upper left of region rectangle, in pixels
y	integer	Y position of upper left of region rectangle, in pixels
width	integer	Width of selection region rectangle in pixels
height	integer	Height of selection region rectangle in pixels
intersection	string	Name of intersection
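
Each gcp row describes a rectangle rather than a single point. When one pixel location per intersection is needed, a simple option is to take the center of the rectangle. The sketch below assumes that the center of the region marks the intersection, which this documentation does not state. `gcp_center` is a hypothetical helper name, and the example values are made up.

```python
def gcp_center(x, y, width, height):
    """Return the (x, y) pixel center of a gcp region rectangle.

    x and y are the upper-left corner; width and height are in pixels.
    Assumption: the center of the rectangle marks the intersection.
    """
    return (x + width / 2, y + height / 2)


# Example with made-up values, not taken from the dataset
center = gcp_center(100, 200, 40, 20)
print(center)
```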

intersections

Column Name	Data Type	Description
geometry	string	WKT format geometry of intersection point
neighborhood	string	Neighborhood intersection is in
intersection	string	Name of intersection

intersection_regions

Column Name	Data Type	Description
image_url	string	Url to a map image
neighborhood	string	Name of the neighborhood for the maps
year	integer	Year portion of the data collection date.
month	integer	Month portion of the data collection date.
intersections_id	string	A string composed of the names of the four intersections.
intersection_group	string	A name, based on the neighborhood, that identifies distinct intersections_id strings.
map_name	string	A name based on the neighborhood and map changes in 2016 and 2017
source_inv	string	The intersection polygon, formed from the intersection points, in WKT format, in the pixel coordinate space. This version is inverted, with the Y coordinate subtracted from 2000, so the orientation of the Y axis matches the EPSG:2230 geographic coordinate space.
source	string	Like source_inv, but the Y axis is not inverted, so the coordinates are the same as the image.
source_area	number	Area of source shape, in square pixels
source_shape	string	(X,Y) shape of source polygon bounding box
source_shape_x	integer	X value of source_shape
source_shape_y	integer	Y value of source_shape
dest	string	The intersection polygon, but in EPSG:2230 (State plane 6, California, Feet) coordinates.
matrix	string	An affine transformation matrix that transforms from the coordinates of source_inv to dest. When pixel locations are properly inverted, this matrix transforms from pixel locations to geographic locations.
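
As an illustration of how the matrix column might be applied, the sketch below inverts the pixel Y coordinate by subtracting it from 2000, as described for source_inv, and then applies a 2D affine transformation. The layout of the matrix string is not documented here, so the code assumes it has already been parsed into two rows of the form ((a, b, tx), (c, d, ty)). `pixel_to_geo` is a hypothetical helper name, and the example matrix is made up.

```python
def pixel_to_geo(x, y, matrix, y_max=2000):
    """Map an image pixel location to EPSG:2230 coordinates.

    Assumption: matrix is already parsed into two rows,
    ((a, b, tx), (c, d, ty)). The string layout of the matrix
    column is not documented, so parsing is left out.
    """
    # Invert Y the same way source_inv does: subtract it from 2000
    y_inv = y_max - y
    (a, b, tx), (c, d, ty) = matrix
    geo_x = a * x + b * y_inv + tx
    geo_y = c * x + d * y_inv + ty
    return (geo_x, geo_y)


# Made-up matrix: unit scale plus a translation, for illustration only
example_matrix = ((1.0, 0.0, 10.0), (0.0, 1.0, 20.0))
print(pixel_to_geo(5, 1990, example_matrix))
```

The same function could be applied to the cx and cy columns of the counts resource to place handwritten marks in geographic space, provided the matrix comes from the intersection_regions row for the same image_url.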

file_annotations

Column Name	Data Type	Description
image_url	string	Url to a map image
url_year	integer	Year, from url
url_month	integer	Month, from url
date	datetime	Date, from file annotation, or from url if the annotation is empty
neighborhood	string	Neighborhood
url_neighborhood	string	Neighborhood from url
total_count	number	Total count of handwritten marks. This may instead be the processed value, in which the structure and vehicle counts are multiplied by conversion factors.
temp	integer	Temperature, if it was given on the map
rain	string	Rain, if it was recorded on the map.

counts

Column Name	Data Type	Description
image_url	string	Map image URL
cx	integer	X value of the center of the circle region, in pixels
cy	integer	Y value of the center of the circle region, in pixels
r	integer	Radius of the circle region, in pixels
type	string	Type of sleeper: Individual, Vehicle or Structure
count	string	Count of sleepers
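
Because count is stored as a string, it has to be converted before it can be summed. The sketch below totals counts per sleeper type using only the standard library. It assumes that values which do not parse as integers should be skipped, which is a guess about why the column is a string. `total_by_type` is a hypothetical helper name, and the rows are made up.

```python
from collections import defaultdict


def total_by_type(rows):
    """Sum the count column per sleeper type.

    rows is an iterable of dicts with 'type' and 'count' keys.
    Assumption: counts that do not parse as integers are skipped.
    """
    totals = defaultdict(int)
    for row in rows:
        try:
            n = int(row["count"])
        except (TypeError, ValueError):
            continue  # blank or non-numeric count
        totals[row["type"]] += n
    return dict(totals)


# Made-up rows for illustration, not taken from the dataset
rows = [
    {"type": "Individual", "count": "3"},
    {"type": "Vehicle", "count": "1"},
    {"type": "Individual", "count": "2"},
    {"type": "Structure", "count": ""},
]
print(total_by_type(rows))
```

If the counts resource is loaded as a dataframe that behaves like a pandas DataFrame, rows in this shape can be obtained with `to_dict('records')`.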

Last Modified 2019-09-13T04:52:40
