Visualize traffic incident reports in San Francisco.
Data sources:
from cartoframes.auth import set_default_context, Context
from cartoframes.viz import Map, Layer, Legend, Source
from cartoframes.data import Dataset
import pandas
If you have a CARTO account, you can set your credentials in the following cell. This allows you to upload the dataset and share the final visualization through your account.
# username = '' # <-- insert your username here
# api_key = ''  # <-- insert your API key here
# context = Context('https://{}.carto.com/'.format(username), api_key)
# set_default_context(context)
Using pandas, we can read an external data source, which is converted to a dataframe. Let's see which columns we have:
incident_reports_df = pandas.read_csv('https://data.sfgov.org/resource/wg3w-h783.csv')
incident_reports_df.head()
incident_reports_df.columns
Some of the `latitude` and `longitude` values are `NaN`; in the next step we remove those rows. After that, we create a Dataset from the dataframe and use it in a Layer to visualize the data:
# Drop the rows where either coordinate is missing (NaN)
incident_reports_df = incident_reports_df.dropna(subset=['longitude', 'latitude'])
incident_reports_data = Dataset(incident_reports_df)
Map(Layer(incident_reports_data))
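Dropping rows with missing coordinates relies on how NaN behaves: it is the only float value that does not compare equal to itself, which is why a self-comparison such as `column == column` can serve as a not-NaN mask. The sketch below shows that rule in plain Python. The `rows` list and the `has_coordinates` helper are made up for illustration and are not part of the dataset or of CARTOframes.

```python
import math

# Hypothetical sample rows; the real values come from the incident reports CSV.
rows = [
    {"id": 1, "longitude": -122.41, "latitude": 37.77},
    {"id": 2, "longitude": float("nan"), "latitude": 37.78},
    {"id": 3, "longitude": -122.43, "latitude": float("nan")},
]


def has_coordinates(row):
    """Return True when both coordinates are real numbers (NaN != NaN)."""
    return (row["longitude"] == row["longitude"]
            and row["latitude"] == row["latitude"])


# Keep only the rows that pass the self-comparison check.
kept = [row for row in rows if has_coordinates(row)]
kept_ids = [row["id"] for row in kept]

# The same selection written with an explicit NaN test.
kept_ids_isnan = [
    row["id"] for row in rows
    if not (math.isnan(row["longitude"]) or math.isnan(row["latitude"]))
]

print(kept_ids)
```

Both selections keep the same rows; the explicit test is usually easier to read than the self-comparison.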
Now, we are going to use a helper method to color by category, and the category is 'Day of Week' (`incident_day_of_week`):
from cartoframes.viz.helpers import color_category_layer
Map(
color_category_layer(incident_reports_data, 'incident_day_of_week', 'Day of Week', top=7)
)
As we can see in the legend, the days are sorted by frequency, which means that there are fewer incidents on Thursdays and more on Tuesdays. Since our purpose is not to visualize frequency and we want the legend to list the days in order from Monday to Sunday, we can pass the helper the categories we want to visualize, in the desired order:
from cartoframes.viz.helpers import color_category_layer
Map(
color_category_layer(incident_reports_data, 'incident_day_of_week', 'Day of Week', cat=[
'Monday',
'Tuesday',
'Wednesday',
'Thursday',
'Friday',
'Saturday',
'Sunday'
])
)
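To make the difference between the two legend orderings concrete, here is a small sketch that orders the same values first by frequency and then by calendar position. The `days` list is a hypothetical sample, not real data, and the sketch only illustrates the two orderings; it does not reproduce how the helper builds its legend. On the real dataframe, `incident_reports_df.incident_day_of_week.value_counts()` gives the actual counts.

```python
from collections import Counter

# Hypothetical sample of the 'incident_day_of_week' column.
days = ["Tuesday", "Monday", "Tuesday", "Friday", "Monday", "Tuesday", "Sunday"]

# Frequency order: most common value first.
by_frequency = [day for day, _count in Counter(days).most_common()]

# Calendar order: a fixed Monday-to-Sunday sequence, keeping only days present.
WEEK = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
present = set(days)
by_calendar = [day for day in WEEK if day in present]

print(by_frequency)
print(by_calendar)
```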
Now we look for the traffic-related incident categories, and then use those categories to visualize the incidents:
incident_reports_df.incident_category.unique()
from cartoframes.viz.helpers import size_category_layer
Map(
size_category_layer(
incident_reports_data,
'incident_category',
'Traffic Incidents',
cat=['Traffic Collision', 'Traffic Violation Arrest'])
)
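Picking the traffic categories out of the `unique()` output by eye works for a short list. As a sketch of doing it programmatically, the snippet below filters category names by a case-insensitive substring match. The `categories` list is a hypothetical sample and `traffic_only` is an illustrative helper name; on the real data the input would be `incident_reports_df.incident_category.unique()`.

```python
# Hypothetical sample of category names; NaN stands in for a missing value.
categories = [
    "Larceny Theft",
    "Traffic Collision",
    "Assault",
    float("nan"),
    "Traffic Violation Arrest",
    "Burglary",
]


def traffic_only(names, keyword="traffic"):
    """Return the sorted string names that contain the keyword, ignoring case."""
    return sorted(
        name for name in names
        if isinstance(name, str) and keyword in name.lower()
    )


traffic_categories = traffic_only(categories)
print(traffic_categories)
```

The `isinstance` check skips missing values, which are floats rather than strings.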
In CARTO there is a dataset we can use for the next step, named 'sfcta_congestion_roads'. We are going to set a `Context` for this dataset. If you have a CARTO account, you can import the dataset into it to keep everything together and have more control over it; in that case there is no need to create a separate Source for this Dataset.
Once the data source is created, we combine two helper methods. The first one uses the Source with the roads data from CARTO, and the second one uses the traffic incident reports.
from cartoframes.viz.helpers import color_continuous_layer
sfcta_congestion_roads_source = Source(
'sfcta_congestion_roads',
Context(
base_url='https://cartovl.carto.com',
api_key='default_public'
)
)
Map([
color_continuous_layer(sfcta_congestion_roads_source, 'auto_speed', 'Recorded vehicle speeds'),
size_category_layer(
incident_reports_data,
'incident_category',
'Traffic Incidents',
cat=['Traffic Collision', 'Traffic Violation Arrest'])
])
Next, we add information about traffic signals, using data from a different source:
traffic_signals_df = pandas.read_csv('https://data.sfgov.org/resource/c8ue-f4py.csv')
traffic_signals_df.head()
traffic_signals_df.columns
traffic_signals_df = traffic_signals_df.rename(columns={'type': 'signal_type'})
traffic_signals_df.signal_type.unique()
Since there are no `latitude` and `longitude` columns, we can use the `point` column to create a GeoDataFrame.
import geopandas
from shapely import wkt
traffic_signals_df['point'] = traffic_signals_df['point'].apply(wkt.loads)
# A plain pandas DataFrame has no set_geometry(); the geometry column is set when the GeoDataFrame is built below
traffic_signals_df = traffic_signals_df.rename(columns={'point': 'geometry'})
trafic_signals_gdf = geopandas.GeoDataFrame(traffic_signals_df, geometry='geometry')
traffic_signals_data = Dataset(trafic_signals_gdf)
Map(Layer(traffic_signals_data))
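The `point` column holds geometries as WKT (well-known text), which is why `wkt.loads` is applied to it. To show what that text looks like, the sketch below parses a single 2D point of the form `POINT (x y)` using only the standard library. The helper `parse_wkt_point` and the sample coordinates are hypothetical; the helper handles only this one simple case and is not a substitute for shapely, which supports the full format.

```python
import re

# Matches 'POINT (x y)' with plain decimal numbers; nothing else is handled.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_point(text):
    """Return (x, y) as floats for a simple 2D WKT point string."""
    match = _POINT_RE.match(text)
    if match is None:
        raise ValueError("not a simple WKT point: {!r}".format(text))
    x, y = match.groups()
    return float(x), float(y)


# In WKT, x is the longitude and y is the latitude for geographic data.
lon, lat = parse_wkt_point("POINT (-122.4194 37.7749)")
print(lon, lat)
```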
Now we keep only the signal types we want to visualize, and this time we build a Layer that uses cross symbols:
signal_gdf = trafic_signals_gdf[trafic_signals_gdf['signal_type'].isin(['RADAR SPEED SIGN', 'FLASHER', 'LIGHTED CROSSWALK'])]
signal_data = Dataset(signal_gdf)
Map(
Layer(
signal_data,
'''
color: ramp($signal_type, bold)
width: 15
symbol: cross
''',
legend={
'type': 'color-category',
'title': 'Radar'
})
)
All together:
Map([
color_continuous_layer(
sfcta_congestion_roads_source, 'auto_speed', 'Recorded vehicle speeds'),
size_category_layer(
incident_reports_data,
'incident_category',
'Traffic Incidents',
cat=['Traffic Collision', 'Traffic Violation Arrest']),
Layer(
signal_data,
'''
color: ramp($signal_type, bold)
width: 15
symbol: cross
''',
legend={
'type': 'color-category',
'title': 'Radar'
})
])