You might have seen the blog post we published a little while ago looking at the L train closure and the impact it might have on L commuters to Manhattan. One of the most visually striking elements of that post was the set of maps showing walking routes for people in Brooklyn to their nearest L stop. We made a few versions: one, like the map above, showing all people in Brooklyn, and another showing cumulative walking corridors for people who live along the L. You might have wondered how we made those lines, so in this post we will show you how.
One of the aims of our blog post about the L line closure was to show the large number of people who would be impacted by the closure of the tunnel. We could have shown this as a choropleth, simply coloring regions of the map brighter where there were a large number of affected commuters, but to me that loses the sense that these are people with lives and jobs. Instead, we decided to convey this information with flow lines of the routes commuters would take to their nearest L train stop. Being able to imagine yourself walking to your own subway stop helps connect you with those who are going to be affected by the closure.
To produce maps of the flows of cars/pedestrians/walkers etc we needed to know three things:
To understand how many people would be affected by the closure, we turned to the US Census. The Census is a treasure trove of information on the kinds of people in the US and how they live and work. It has a column recording how people typically commute to work. The column has a few caveats (e.g. people could only reply to the question with a single transportation method), but we decided that it could give a reasonable idea of the relative numbers of people who would be walking from each census block.
With our subway commuters in hand, we needed to filter them to only include people who commute to Manhattan. To get this number we used the LODES dataset, which describes where people in the US commute to work. For each region, it tells us the number of people who work in each other region. Using some simple math, we can combine our two values and uncover all the people who likely take the subway from Brooklyn to Manhattan for work. Below we map our results out:
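The post does not spell out the "simple math", so the following is only one plausible sketch of it. It assumes the two values are combined proportionally: each block's subway commuters are scaled by the share of that block's workers who work in Manhattan. The function name, field names and numbers are all hypothetical and purely illustrative.

```python
def estimate_manhattan_subway_commuters(subway_commuters, total_workers, manhattan_workers):
    """Scale a block's subway commuters by the share of its workers employed in Manhattan.

    Assumption (not from the original post): commute mode and work location
    are treated as independent, so the two values combine as a simple product.
    """
    if total_workers <= 0:
        return 0.0
    manhattan_share = manhattan_workers / total_workers
    return subway_commuters * manhattan_share


# Made-up example blocks; real values would come from the Census and LODES tables.
blocks = [
    {"block_id": "A", "subway_commuters": 120, "total_workers": 400, "manhattan_workers": 100},
    {"block_id": "B", "subway_commuters": 50, "total_workers": 200, "manhattan_workers": 150},
]

for block in blocks:
    estimate = estimate_manhattan_subway_commuters(
        block["subway_commuters"], block["total_workers"], block["manhattan_workers"]
    )
    print(block["block_id"], estimate)
```

Treating the two values as independent is a simplification; it is only meant to give relative numbers per block, which is all the maps need.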
Finally, we filtered our data to only the blocks where people likely walked to the L (based on distance) for their commute. This would have been enough to tell the story we wanted to tell, but it lacked the visual impact we wanted. So we busted out some routing.
To show the daily impact of the L train closure on street traffic, we decided to map the walking routes of those commuters. Routing isn’t yet available within the CartoDB interface, so we turned to Valhalla, the excellent routing solution produced by our friends at Mapzen.
Valhalla has some nifty features for finding the best route between two points, including a choice of travel mode. In our case we wanted to map out pedestrian routes. We wrote a little Python package that takes a list of origin and destination latitudes and longitudes and returns the Valhalla-generated linestring, travel distance and travel time for each pair.
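To make the request and response concrete, here is a minimal sketch of what such a wrapper has to do. It is not the code of the package itself: the function names `build_route_request`, `decode_polyline` and `parse_route` are hypothetical. The request body (`locations`, `costing`) and the response fields (`trip.legs[].shape`, `trip.summary.length`, `trip.summary.time`) follow Valhalla's route API, where the shape is an encoded polyline with six decimal digits of precision. No network call is made here; the server URL and API key are left to you.

```python
import json


def build_route_request(origin, destination, costing="pedestrian"):
    """Build the JSON body for a Valhalla route request.

    origin and destination are (lat, lon) tuples.
    """
    return {
        "locations": [
            {"lat": origin[0], "lon": origin[1]},
            {"lat": destination[0], "lon": destination[1]},
        ],
        "costing": costing,
    }


def decode_polyline(encoded, precision=6):
    """Decode an encoded polyline string into a list of (lat, lon) tuples."""
    coords = []
    index = lat = lon = 0
    factor = 10 ** precision
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # one delta for latitude, one for longitude
            shift = result = 0
            while True:
                byte = ord(encoded[index]) - 63
                index += 1
                result |= (byte & 0x1F) << shift
                shift += 5
                if byte < 0x20:  # no continuation bit: value is complete
                    break
            # The lowest bit carries the sign of the delta.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lon += deltas[1]
        coords.append((lat / factor, lon / factor))
    return coords


def parse_route(response):
    """Pull the linestring, distance and time out of a route response dict."""
    trip = response["trip"]
    shape = [pt for leg in trip["legs"] for pt in decode_polyline(leg["shape"])]
    return {
        "shape": shape,
        "length": trip["summary"]["length"],  # kilometers by default
        "time": trip["summary"]["time"],      # seconds
    }


# Example request body for one made-up origin/destination pair.
print(json.dumps(build_route_request((40.7081, -73.9571), (40.7146, -73.9440))))
```

The body returned by `build_route_request` would be sent to the route endpoint of whichever Valhalla server you use, and the parsed JSON response passed to `parse_route`.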
If you want to try it, you can install it via:
pip install valhallapy
You can point this script at Valhalla’s servers and include your API key and let it trundle through all the data you want to route.
Alternatively, if you have a one-off job that requires a large number of routes, as we did, you can use the docker container here to quickly and easily run a version of Valhalla locally on your computer.
Once all the routes have been generated, the script spits them out as a GeoJSON file that is ready to be uploaded to CartoDB. If we simply plot those lines we get the following map. This is awesome, but with so many of the routes from different census blocks overlapping, we have no way of really gauging the number of people along each route.
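For reference, the GeoJSON output described above could be assembled roughly as follows. This is a sketch rather than the script's actual code: the function name `routes_to_geojson` and the property names `commuters` and `travel_time` are assumptions chosen for illustration.

```python
import json


def routes_to_geojson(routes):
    """Turn a list of route dicts into a GeoJSON FeatureCollection.

    Each route is assumed to hold a "shape" (list of (lat, lon) tuples)
    plus "commuters" and "travel_time" values; these keys are hypothetical.
    """
    features = []
    for route in routes:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                # GeoJSON orders coordinates as [lon, lat], not [lat, lon].
                "coordinates": [[lon, lat] for lat, lon in route["shape"]],
            },
            "properties": {
                "commuters": route["commuters"],
                "travel_time": route["travel_time"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


# One made-up route from a census block to a station.
example_routes = [
    {"shape": [(40.7081, -73.9571), (40.7090, -73.9560)], "commuters": 12, "travel_time": 300},
]
print(json.dumps(routes_to_geojson(example_routes)))
```

Carrying the commuter count and travel time as properties on each feature is what makes the later per-segment aggregation possible.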
To get around this problem we need to segment the linestrings, breaking them into straight segments with no turns. To do this we wrote a PL/pgSQL function which you can grab here:
Just copy and paste that segment_line function into your CartoDB SQL pane and it will be available in your account. Running the function on your geometry column will split the lines into separate parts, each with the number of commuters and travel time along that route.
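If you would rather see the idea outside the database, the segmentation step can be sketched in Python as below. This is not the segment_line function itself, whose source is not reproduced here; `split_into_segments` is a hypothetical stand-in that assumes each pair of consecutive vertices forms one straight segment and that every segment inherits the commuter count and travel time of its parent route.

```python
def split_into_segments(coords, commuters, travel_time):
    """Break a linestring into two-point segments.

    coords is a list of (lon, lat) tuples. Each consecutive pair of
    vertices becomes one segment carrying the route's attributes.
    """
    return [
        {
            "start": coords[i],
            "end": coords[i + 1],
            "commuters": commuters,
            "travel_time": travel_time,
        }
        for i in range(len(coords) - 1)
    ]


# A made-up route with one turn yields two straight segments.
route = [(-73.9571, 40.7081), (-73.9560, 40.7081), (-73.9560, 40.7095)]
for segment in split_into_segments(route, commuters=12, travel_time=300):
    print(segment)
```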
Now we just have to aggregate the number of commuters along each line segment and the max travel time along that segment by running the query in that gist. Having those two variables is perfect as it gives us two dimensions to style the routes.
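The aggregation itself is a group-by on the segment geometry. As a rough Python equivalent (the gist's query is not reproduced here, and the names below are hypothetical), segments with identical endpoints are grouped, commuters are summed and the maximum travel time is kept. The sketch assumes overlapping routes share exactly the same vertices and direction, which is plausible when they all come from the same street network and head toward a station.

```python
from collections import defaultdict


def aggregate_segments(segments):
    """Sum commuters and take the max travel time for each distinct segment."""
    totals = defaultdict(lambda: {"commuters": 0, "max_travel_time": 0})
    for seg in segments:
        key = (seg["start"], seg["end"])  # endpoints must be hashable tuples
        totals[key]["commuters"] += seg["commuters"]
        totals[key]["max_travel_time"] = max(
            totals[key]["max_travel_time"], seg["travel_time"]
        )
    return dict(totals)


# Two made-up routes that share their final segment on the way to a station.
segments = [
    {"start": (0, 0), "end": (1, 0), "commuters": 10, "travel_time": 300},
    {"start": (1, 0), "end": (2, 0), "commuters": 10, "travel_time": 300},
    {"start": (1, 1), "end": (1, 0), "commuters": 4, "travel_time": 480},
    {"start": (1, 0), "end": (2, 0), "commuters": 4, "travel_time": 480},
]
for key, values in aggregate_segments(segments).items():
    print(key, values)
```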
Let’s bring it all together by writing some CartoCSS. We are going to style the thickness of each line by the number of commuters along that line, and the color of the line by the maximum commute time of any commuter along that segment. Here is an example of CartoCSS you could use:
Bringing that all together, we get the following map, which is really powerful and impactful! We also produced this map showing the walking commute of every person who rides the subway to their nearest subway stop.
We’re excited to find out what is possible with all the work that Mapzen has been doing. Look out for more experiments soon! In the meantime, let us know if you try this kind of analysis yourself.
Have fun with routing and stylizing routes!