Create Centroids of Geometries
A centroid is a geometric center that is calculated from geometries in a map layer. You can choose whether centroids are calculated from all geometries, from groups of geometries, or from each individual geometry.
This guide describes how to apply the Create Centroids of Geometries analysis to find unweighted or weighted centers for groups of geometries.
- Specifically, we will use bike share usage data that is aggregated by station.
- Our goal is to find the geographic, or weighted, centroid of each service cluster to optimize where vans are positioned. These vans re-balance bike docks that are running low or are completely full.
This multi-step process first locates the service areas (by applying the Calculate clusters of points analysis), then calculates the usage-weighted center of each identified service area (by applying the Create Centroids of Geometries analysis).
For this analysis, a well-formed, coherent research question helps to identify the variables to be used for weights, categories, and aggregation. If working with individual polygons, categorizing by the_geom allows you to collapse the polygons to their individual centroids.
Let’s explore the Citi Bike data for New York City taken from June 2016. This dataset contains a column called count, which holds the number of trips that ended at each station during June. The end_station_name column contains the name of the bike station where those trips ended.
Suppose the New York City Department of Transportation has a budget to position seven vans around the bike share network to reallocate excess bikes to empty stations. This example finds the optimal seven points, given our basic assumptions, according to the dataset.
Import the template .carto file packaged from “Download resources” of this guide and create the map. Builder opens with Citi Bike data as the first and only map layer.
Click on “Download resources” from this guide to download the zip file to your local machine. Extract the zip file to view the .carto file(s) used for this guide.
Since our research question calls for positioning seven vans, we want to form seven clusters of our station points. This analysis, which uses k-means clustering behind the scenes, groups the points so that the distances between points within each cluster are as small as possible, which in turn keeps the clusters well separated.
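To make the k-means idea above more concrete, here is a minimal Python sketch of the standard Lloyd's iteration on plain (x, y) tuples. It is only an illustration under stated assumptions, not CARTO's implementation: the function name kmeans, its parameters, and the use of straight-line distance on raw coordinates are all hypothetical choices made for this example.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Illustrative Lloyd's algorithm for a list of (x, y) tuples.

    Returns (labels, centers). Distances are planar (an assumption);
    real station coordinates would normally need a projection first.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct starting points
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centers
```

Calling kmeans(station_points, 7) on a list of station coordinates would return one label per station, playing the same role as the cluster number that the Builder analysis assigns. Because the result depends on the random starting centers, different seeds can produce different groupings.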
From the LAYERS list, click the Citi Bike map layer.
Click the ANALYSIS tab and apply the Calculate clusters of points option. The base layer must be the layer for which you need the clusters.
Enter 7 as the number of clusters.
Switch to the Data View of the Citi Bike map layer. You will see that this step assigns a cluster_no to each station, from one to seven.
The Data View and Map View appear as buttons on your map visualization when a map layer is selected. Click them to switch between viewing your connected dataset as a table and viewing your data on the map.
Centroid of Clusters
Now that the cluster assignment of each station has been calculated (the result being the cluster_no column of the map layer), locate a weighted center for each cluster to optimally position the Citi Bike re-balance vans.
By definition, a centroid is the geometric center of any geometry. When a centroid is weighted, the “center” is pulled towards areas with higher weights. For example, if one side of a city has a higher population, its weighted center would be closer to the side with more people. A weighted centroid uses a chosen data field as the weight for each point and calculates the weighted arithmetic mean position of the points.
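The weighted mean described above can be sketched in a few lines of Python. This is a minimal illustration, not the code Builder runs: the function name weighted_centroid and the two sample stations are made up for the example, and coordinates are treated as plain planar values.

```python
def weighted_centroid(points, weights):
    """Weighted arithmetic mean position of (x, y) points."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    x = sum(px * w for (px, _), w in zip(points, weights)) / total
    y = sum(py * w for (_, py), w in zip(points, weights)) / total
    return (x, y)


# Two hypothetical stations on a line, 4 units apart.
stations = [(0.0, 0.0), (4.0, 0.0)]

# Equal weights give the ordinary midpoint.
print(weighted_centroid(stations, [1, 1]))  # (2.0, 0.0)

# Tripling the second station's weight pulls the center towards it.
print(weighted_centroid(stations, [1, 3]))  # (3.0, 0.0)
```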
From the Citi Bike layer ANALYSIS tab, click on ADD NEW ANALYSIS to add a second analysis to the chain.
Select the Create Centroids of Geometries analysis.
- Enable the CATEGORIZE BY option and choose
- Enable the WEIGHTED BY option and choose count.
- Enable the OPERATION option from the Measure by section, and choose SUM by the count column. This is the count of user station visits for each bike station.
- APPLY the analysis.
The analysis confirmation dialog displays which columns from your dataset were updated. The result is seven points, the weighted centers of the clusters, which mark where the vans can be positioned to service the most frequented stations.
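As a rough picture of what the categorize, weight, and sum options combine to produce, the following Python sketch groups rows by a category column and returns one weighted center and one summed count per group. It is an assumption-laden illustration rather than Builder's implementation: the function name centroids_by_category, the lon and lat keys, and the use of cluster_no as the category are hypothetical choices for this example.

```python
from collections import defaultdict


def centroids_by_category(rows, category="cluster_no", weight="count"):
    """Return {category: {"lon", "lat", "sum_count"}} for a list of row dicts."""
    # Running totals per category: [sum(w * lon), sum(w * lat), sum(w)]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for row in rows:
        w = row[weight]
        acc = sums[row[category]]
        acc[0] += w * row["lon"]
        acc[1] += w * row["lat"]
        acc[2] += w
    return {
        key: {"lon": sx / sw, "lat": sy / sw, "sum_count": sw}
        for key, (sx, sy, sw) in sums.items()
        if sw > 0  # skip groups whose weights sum to zero
    }


# Made-up rows for illustration only, not real Citi Bike records.
rows = [
    {"cluster_no": 1, "lon": 0.0, "lat": 0.0, "count": 1},
    {"cluster_no": 1, "lon": 4.0, "lat": 0.0, "count": 3},
    {"cluster_no": 2, "lon": 10.0, "lat": 10.0, "count": 5},
]
result = centroids_by_category(rows)
```

With seven distinct category values in the input, the returned dictionary would hold seven entries, one weighted center per cluster.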
Are you having trouble visualizing the results? Change the basemap style to DARK MATTER (LITE) and notice how the seven points are much more visible.
To apply even more style enhancements, you can style these centroids By Value and choose one of the updated columns from your dataset. This helps display where the weighted centers are located, in relation to the most frequently used bike stations.
To visualize and style analysis results as a separate layer, there is a shortcut to create a new layer directly from the LAYERS list in Builder.
Click and drag A1 Calculate Clusters of Points from the Citi Bike map layer and drop it below.
A new map layer is created, B Citi Bike, displaying all of the centroids in the cluster of stations. Style this layer By Value using the cluster_no column to visualize how the regions are defined geographically, according to the most frequented stations.
Download the final .carto file from the “Download resources” of this guide, and explore the advanced cartography applied. The Create Lines from Points analysis was also applied and used to create a new layer that clearly links the stations to be serviced (clusters) with the re-balance van that would service them (centroids).
This analysis has an execution time limit. If it takes more than 5 minutes to run, CARTO returns a timeout error.
If you are interested in using the underlying functions in the SQL view of Builder, you can find centroids of polygon geometries (or clusters) with the PostGIS ST_Centroid function. View the PostGIS ST_Centroid documentation for details.