# Center of Points in Geographic Coordinates

Let’s imagine the following scenario…

There are five geostationary satellites around the Earth, and you need your map to automatically pan to the center of their locations to show the rough region of their coverage. On a number line this isn’t a very hard problem, since it’s like finding a center of mass or mean for some arbitrary quantity $x_{i}$ over $N$ points:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_{i}$$

Latitude works this way, but longitude is different.

### Why Longitude is Different

What makes longitude difficult is that points that are actually very close together can be very far apart in the coordinate system we commonly use. Take the following example: Tonga is located at 175.2° W and Tuvalu is at 179.2° E. How far apart are they? Well, it should be 5.6° of longitude since they are 4.8° and 0.8° from 180° W/E, respectively. But since we normally view longitude as a number line [-180°, 180°] where East is positive and West is negative, these two points are

$$179.2^{\circ} - (-175.2^{\circ}) = 354.4^{\circ}$$

apart! Which answer is correct? Well, they both are, since they measure the two ways around the globe, but this can cause some annoying problems if we were to devise a solution to finding the center of points on a map. In our example with Tonga and Tuvalu, the first solution would be well zoomed into the islands, whereas the second would be zoomed out enough to encompass 354.4° of the globe.
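The two answers can be compared with a short Python sketch. The function names here are illustrative and not part of CARTO or the original post; the only assumption is the sign convention used above (West negative, East positive).

```python
def naive_separation(lon_a: float, lon_b: float) -> float:
    """Distance between two longitudes treated as points on a number line."""
    return abs(lon_a - lon_b)


def wrapped_separation(lon_a: float, lon_b: float) -> float:
    """Shortest angular distance between two longitudes, in degrees (0 to 180)."""
    diff = abs(lon_a - lon_b) % 360.0
    # Going the other way around the globe may be shorter.
    return min(diff, 360.0 - diff)


tonga, tuvalu = -175.2, 179.2  # West is negative, East is positive

naive = naive_separation(tonga, tuvalu)      # the "long way around"
wrapped = wrapped_separation(tonga, tuvalu)  # the "short way around"
print(f"naive: {naive:.1f} degrees, wrapped: {wrapped:.1f} degrees")
```

The two values always add up to 360°, which is another way of saying that both answers are correct.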

Still working with Tonga and Tuvalu, what would be the mean location of these two countries? Mean is sum / number of points, so we get

$$\frac{179.2^{\circ} + (-175.2^{\circ})}{2} = 2^{\circ}$$

or 2° to the east of the prime meridian. That seems odd because that’s on the other side of the world.

Since we are getting results that seem a little nonsensical, there must be a clearer solution to our problem.

### Finding Another Way

The problem here lies in viewing longitude as a number line where the end points are far apart, when in fact the end points represent the same physical location. One way to get around this difficulty involves mapping the line segment to a two-dimensional ring of radius one (a unit circle). This is convenient because now -180° and +180° lie at the same point in this new two-dimensional coordinate system: $(-1,0)$.
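As a quick check of that claim, here is a minimal Python sketch of the mapping. The helper name `to_unit_circle` is made up for illustration; it returns the point as (horizontal, vertical), that is (cosine, sine).

```python
import math


def to_unit_circle(lon_deg: float) -> tuple[float, float]:
    """Map a longitude in degrees to a point (x, y) on the unit circle."""
    theta = math.radians(lon_deg)
    return (math.cos(theta), math.sin(theta))


west_edge = to_unit_circle(-180.0)
east_edge = to_unit_circle(180.0)

# Both ends of the number line land on the same point, up to
# floating-point rounding in the sine component.
print(west_edge, east_edge)
```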

Below is an image of the satellite problem mentioned above. Five satellites are arranged around the earth. The coordinate system on the left represents the longitude as we normally measure it; the coordinate system on the right is the transformed longitude with degrees at appropriate locations to show the mapping. As you can easily see on the map, taking the normal mean of longitude (the red ×) gives a value that’s way off compared to the correct value (the gold star). If the coordinate system were just a number line without periodic boundary values, though, you could balance the line on your finger at the red ×.

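The transformation that helps us to find the gold star is the following:

$$\theta_{i} \rightarrow (\xi_{i}, \zeta_{i}) = (\cos\theta_{i}, \sin\theta_{i})$$

where $\theta_{i}$ is the longitude in radians and the index $i$ means that it is the $i^{\text{th}}$ entry in the data table (of $N$ entries). Which symbol carries the cosine and which the sine is only a labeling choice; here $\xi_{i}$ is the horizontal (cosine) component and $\zeta_{i}$ the vertical (sine) component, so that ±180° lands at $(-1,0)$.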

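To find our center of points, we now need to average all the $\zeta_{i}$ and $\xi_{i}$ (which we’ll call $\bar{\zeta}$ and $\bar{\xi}$), and then find the angle that the averaged point makes with the positive horizontal axis. That angle can be calculated with the inverse tangent. To be careful about which quadrant is chosen, we will use atan2, the two-argument inverse tangent that uses the signs of both inputs to pick the quadrant. Taking $\bar{\zeta}$ as the averaged sine (vertical) component and $\bar{\xi}$ as the averaged cosine (horizontal) component, the mean longitude is

$$\bar{\theta} = \operatorname{atan2}(\bar{\zeta}, \bar{\xi})$$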
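The recipe above can be sketched in a few lines of Python. This is an illustration rather than the code from the original post: the function names are hypothetical, and the sine average is assumed to be the first argument to atan2 and the cosine average the second.

```python
import math
from typing import Iterable


def mean_longitude(lons_deg: Iterable[float]) -> float:
    """Mean longitude in degrees, computed on the unit circle."""
    lons = list(lons_deg)
    if not lons:
        raise ValueError("need at least one longitude")
    # Average the vertical (sine) and horizontal (cosine) components.
    sin_avg = sum(math.sin(math.radians(lon)) for lon in lons) / len(lons)
    cos_avg = sum(math.cos(math.radians(lon)) for lon in lons) / len(lons)
    # atan2 picks the correct quadrant from the signs of both inputs.
    return math.degrees(math.atan2(sin_avg, cos_avg))


def naive_mean_longitude(lons_deg: Iterable[float]) -> float:
    """Straight number-line average, for comparison."""
    lons = list(lons_deg)
    return sum(lons) / len(lons)


tonga_tuvalu = [-175.2, 179.2]
print(round(mean_longitude(tonga_tuvalu), 6))        # close to -178, i.e. 178° W
print(round(naive_mean_longitude(tonga_tuvalu), 6))  # close to 2, i.e. 2° E
```

One caveat: if the points are spread evenly around the globe, both averages are close to zero and the resulting angle is not meaningful.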

This calculation for longitude, and the one mentioned above for latitude, can be run in CARTO by placing the following SQL command in the SQL editor. It returns three values: avg_lon, computed as described just above; avg_lon_naive, a straight average of longitude; and avg_lat, the average latitude.

Data Table

SQL

Aside: If you are performing this calculation over a large number of points, the SQL function sum() will probably be faster than avg(). Since tangent is defined as opposite / adjacent, any common positive factor cancels. The factor $1/N$ used in the averages is the same for both $\bar{\xi}$ and $\bar{\zeta}$, so avg() can be replaced by sum() for those calculations.
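The cancellation is easy to confirm numerically. The Python sketch below uses a small made-up list of longitudes (not the dataset from this post) and compares the angle obtained from averages with the one obtained from plain sums.

```python
import math

# Hypothetical longitudes in degrees, chosen to straddle the ±180° line.
lons_deg = [-175.2, 179.2, 170.0, -160.5]
n = len(lons_deg)

sines = [math.sin(math.radians(lon)) for lon in lons_deg]
cosines = [math.cos(math.radians(lon)) for lon in lons_deg]

# Using averages: both arguments are divided by the same positive n.
angle_from_avg = math.degrees(math.atan2(sum(sines) / n, sum(cosines) / n))

# Using sums: the common factor 1/n is simply left out.
angle_from_sum = math.degrees(math.atan2(sum(sines), sum(cosines)))

print(angle_from_avg, angle_from_sum)
```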

Output

The result for avg_lon (178° W) makes sense because it lies between Tuvalu and Tonga, while avg_lon_naive is just wrong. Notice that we have to convert degrees to radians and then back in the process. Also notice that the two calculated longitudes are 180° apart in the two-point case. For a larger dataset, the avg_lon_naive result becomes harder to interpret, since the error depends on how the points are split across the ±180° line.

Using the dataset linked at the bottom, you can see the data visualized on the map. It’s abundantly clear that the naive approach to finding the mean longitude does not work.

Output

#### Weighted Center of Points

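These results can be generalized to a weighted center of points. If the column you want to weight by is $w$, then, with $\theta_{i}$ the longitude in radians, we get

$$\bar{\theta}_{w} = \operatorname{atan2}\left(\frac{\sum_{i} w_{i} \sin\theta_{i}}{\sum_{i} w_{i}}, \frac{\sum_{i} w_{i} \cos\theta_{i}}{\sum_{i} w_{i}}\right)$$

Since tangent takes the ratio of the arguments, the denominators cancel (as long as the weights sum to a positive number), and our equation simplifies in a way that makes it easier for the computer!

$$\bar{\theta}_{w} = \operatorname{atan2}\left(\sum_{i} w_{i} \sin\theta_{i}, \sum_{i} w_{i} \cos\theta_{i}\right)$$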
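Outside of SQL, the simplified weighted form can be sketched in Python as follows. The function name and the example weights are hypothetical, and the weights are assumed to be non-negative with a positive total.

```python
import math
from typing import Sequence


def weighted_mean_longitude(lons_deg: Sequence[float],
                            weights: Sequence[float]) -> float:
    """Weighted mean longitude in degrees, using sums instead of averages."""
    if len(lons_deg) != len(weights):
        raise ValueError("lons_deg and weights must have the same length")
    if sum(weights) <= 0:
        raise ValueError("weights must sum to a positive number")
    # The common denominator sum(weights) cancels inside atan2, so it is omitted.
    sin_sum = sum(w * math.sin(math.radians(lon))
                  for lon, w in zip(lons_deg, weights))
    cos_sum = sum(w * math.cos(math.radians(lon))
                  for lon, w in zip(lons_deg, weights))
    return math.degrees(math.atan2(sin_sum, cos_sum))


lons = [-175.2, 179.2]  # Tonga, Tuvalu
equal = weighted_mean_longitude(lons, [1.0, 1.0])         # same as unweighted
toward_tonga = weighted_mean_longitude(lons, [3.0, 1.0])  # pulled toward Tonga
print(round(equal, 3), round(toward_tonga, 3))
```

With equal weights the result matches the unweighted center, and giving Tonga three times the weight moves the center toward 175.2° W.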

In SQL, this equation would look like:

Pro Tip: If you don’t have an explicit latitude or longitude column, you can use the information directly from the_geom by replacing lat with ST_Y(the_geom) and lon with ST_X(the_geom).

We’re all about open data here at CARTO. If you want to clone the map and data, see them both here. Happy mapping!

Andy Eschbacher is a data scientist at CARTO, where he integrates data science solutions into CARTO's infrastructure, solves spatial data science problems for clients, and builds out tools to better enable people working at the intersection of data science and GIS.
