Center of Points in Geometric Coordinates

Let’s imagine the following scenario…

There are five geostationary satellites around the earth, and you need your map to pan automatically to the center of their locations to show the rough region of their coverage. On a number line this isn’t a very hard problem, since it’s like finding a center of mass, or the mean of some arbitrary quantity \(x\) over \(N\) points:

\[\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i\]

Latitude works this way, but longitude is different.

Why Longitude is Different

What makes longitude difficult is that points that are actually very close together can be very far apart in the coordinate system we commonly use. Take the following example: Tonga is located at 175.2° W and Tuvalu is at 179.2° E. How far apart are they? Well, it should be 5.6° of longitude, since they are 4.8° and 0.8° from the 180° meridian, respectively. But since we normally view longitude as a number line [-180°, 180°] where East is positive and West is negative, these two points are

\[179.2^\circ - (-175.2^\circ) = 354.4^\circ\]

apart! Which answer is correct? Well, they both are, since you can travel around the globe in either direction, but this can cause some annoying problems if we were to devise a solution to finding the center of points on a map. In our example with Tonga and Tuvalu, the first solution would be well zoomed into the islands, whereas the second would be zoomed out enough to encompass 354.4° of the globe.
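The two answers can be compared with a small Python sketch. The function name `lon_separation` is made up for this illustration; it simply wraps the number-line difference around the 360° circle and keeps the shorter way around.

```python
def lon_separation(lon_a, lon_b):
    """Smallest angular separation, in degrees, between two longitudes."""
    diff = abs(lon_a - lon_b) % 360.0
    # Going the other way around the globe may be shorter.
    return min(diff, 360.0 - diff)


tonga_lon = -175.2   # 175.2° W
tuvalu_lon = 179.2   # 179.2° E

# Distance on the [-180, 180] number line: about 354.4
print(abs(tuvalu_lon - tonga_lon))

# Distance once the wrap at 180° is taken into account: about 5.6
print(lon_separation(tonga_lon, tuvalu_lon))
```

Both numbers describe the same pair of points; the wrapped one is the one a map zoom level should usually be based on.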

Still working with Tonga and Tuvalu, what would be the mean location of these two countries? The mean is the sum divided by the number of points, so we get

\[\frac{179.2^\circ + (-175.2^\circ)}{2} = 2^\circ\]

or 2° to the east of the prime meridian. That seems odd because that’s on the other side of the world.

Since we are getting results that seem a little nonsensical, there must be a clearer solution to our problem.

Finding Another Way

The problem here lies in viewing longitude as a number line where the end points are far apart, when in fact the end points represent the same physical location. One way to get around this difficulty involves mapping the line segment to a two-dimensional ring of radius one (a unit circle). This is convenient because now -180° and +180° lie at the same point in this new two-dimensional coordinate system: \((-1,0)\).
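As a quick check of that claim, here is a minimal Python sketch of the mapping. The helper name `lon_to_unit_circle` is an assumption made for this example, not something from CARTO.

```python
import math


def lon_to_unit_circle(lon_deg):
    """Map a longitude in degrees to an (x, y) point on the unit circle."""
    lon_rad = math.radians(lon_deg)
    return math.cos(lon_rad), math.sin(lon_rad)


west_edge = lon_to_unit_circle(-180.0)
east_edge = lon_to_unit_circle(180.0)

# Both ends of the number line land on (-1, 0), up to floating-point rounding.
print(west_edge)
print(east_edge)
```

The y values print as tiny nonzero numbers rather than exactly 0 because pi cannot be represented exactly in floating point.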

Below is an image of the satellite problem mentioned above. Five satellites are arranged around the earth. The coordinate system on the left represents the longitude as we normally measure it; the coordinate system on the right is the transformed longitude with degrees at appropriate locations to show the mapping. As you can easily see on the map, taking the normal mean of longitude (the red ×) gives a value that’s way off compared to the correct value (the gold star). If the coordinate system were just a number line without periodic boundary values, though, you could balance the line on your finger at the red ×.

Center of Points demo

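The transformation that helps us find the gold star is the following:

\[\xi_i = \cos(\lambda_i), \qquad \zeta_i = \sin(\lambda_i)\]

where \(\lambda_i\) is the longitude and the index \(i\) means that it is the \(i\)-th entry in the data table (of \(N\) entries).

To find our center of points, we now need to average over all the \(\xi_i\) and \(\zeta_i\) (which we’ll call \(\bar{\xi}\) and \(\bar{\zeta}\)), and then find the angle of the averaged point \((\bar{\xi}, \bar{\zeta})\). The angle can be calculated by finding the inverse tangent of \(\bar{\zeta} / \bar{\xi}\). To be careful about which quadrant is chosen, we will use atan2, the two-argument inverse tangent that preserves the signs of the inputs:

\[\bar{\lambda} = \operatorname{atan2}(\bar{\zeta}, \bar{\xi})\]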
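The same steps can be sketched in a few lines of Python before moving to SQL. The function name `mean_longitude` is invented for this illustration, and the two longitudes are the Tonga and Tuvalu values quoted earlier.

```python
import math


def mean_longitude(lons_deg):
    """Mean longitude in degrees, computed on the unit circle."""
    n = len(lons_deg)
    # Average the sine and cosine of each longitude (converted to radians).
    zeta = sum(math.sin(math.radians(lon)) for lon in lons_deg) / n
    xi = sum(math.cos(math.radians(lon)) for lon in lons_deg) / n
    # atan2 takes (y, x) and picks the correct quadrant.
    return math.degrees(math.atan2(zeta, xi))


lons = [-175.2, 179.2]  # Tonga, Tuvalu

print(sum(lons) / len(lons))   # naive mean, about 2.0
print(mean_longitude(lons))    # circular mean, about -178.0
```

One caveat worth keeping in mind: if the points are spread evenly around the globe, both averages are close to zero and the resulting angle is not meaningful.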

Both this calculation for longitude and the one mentioned above for latitude can be run in CARTO by placing the following SQL command in the SQL editor. The values calculated are: avg_lon, based on the discussion just above; avg_lon_naive, based on a straight average of longitude; and avg_lat, the average latitude.

Data Table

country    lat         lon
-------    --------    ---------
Tonga      -21.1333    -175.2
Tuvalu     -8.53333     179.2167

SQL

SELECT 
  180 * atan2(s.zeta,s.xi) / pi() AS avg_lon,
  s.avg_lon_naive AS avg_lon_naive,
  s.avg_lat AS avg_lat
FROM 
  (
  SELECT 
    avg(sin(pi() * lon / 180)) AS zeta, 
    avg(cos(pi() * lon / 180)) AS xi,
    avg(lon) AS avg_lon_naive,
    avg(lat) AS avg_lat
  FROM pacific_islands
  ) AS s

Aside: If you are performing this calculation over a large number of points, the SQL function sum() will probably be faster than avg(). Since tangent is defined as opposite / adjacent, any common positive factor cancels. This means that the denominator used in the averages cancels for both zeta and xi, so avg() can be replaced by sum() for those two calculations.
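A small Python sketch can confirm that dividing by the number of points does not change the angle. The first two longitudes are the ones from the data table; the last two are hypothetical values added only to make the list longer.

```python
import math

# Tonga and Tuvalu, plus two hypothetical longitudes for illustration.
lons = [-175.2, 179.2167, 170.0, -160.5]
n = len(lons)

sin_total = sum(math.sin(math.radians(lon)) for lon in lons)
cos_total = sum(math.cos(math.radians(lon)) for lon in lons)

# Using averages, as avg() does in the query.
angle_from_avg = math.degrees(math.atan2(sin_total / n, cos_total / n))

# Using plain sums, as sum() would.
angle_from_sum = math.degrees(math.atan2(sin_total, cos_total))

print(angle_from_avg, angle_from_sum)
```

The two printed angles agree up to floating-point rounding, because atan2 only depends on the direction of the point, not on its distance from the origin.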

Output

avg_lon    avg_lon_naive    avg_lat
-------    -------------    -------
-177.992   2.00833          -14.8333

The result for avg_lon makes sense because it lies between Tuvalu and Tonga, while avg_lon_naive lands on the opposite side of the globe. Notice that we have to convert degrees to radians and then back in the process. Also notice that the two calculated longitudes are exactly 180° apart in this two-point case. For a larger dataset, the avg_lon_naive result becomes even harder to interpret.
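A short derivation, offered as a sketch, shows where the 180° comes from. It assumes two longitudes \(\lambda_1\) and \(\lambda_2\) (symbols chosen for this illustration), with \(\bar{\zeta}\) and \(\bar{\xi}\) standing for the averaged sine and cosine (zeta and xi in the query), and uses the sum-to-product identities:

\[\bar{\zeta} = \frac{\sin\lambda_1 + \sin\lambda_2}{2} = \sin\left(\frac{\lambda_1 + \lambda_2}{2}\right) \cos\left(\frac{\lambda_1 - \lambda_2}{2}\right)\]

\[\bar{\xi} = \frac{\cos\lambda_1 + \cos\lambda_2}{2} = \cos\left(\frac{\lambda_1 + \lambda_2}{2}\right) \cos\left(\frac{\lambda_1 - \lambda_2}{2}\right)\]

The first factor in each line is the sine or cosine of the naive mean. The shared second factor is positive when the two points are less than 180° apart on the number line, and in that case atan2 returns the naive mean. For Tonga and Tuvalu the difference is about 354.4°, so the shared factor is roughly \(\cos(177.2^\circ) < 0\). Both components flip sign, and atan2 returns the naive mean shifted by 180°: \(2.00833^\circ - 180^\circ \approx -177.992^\circ\).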

Using the dataset linked at the bottom, you can see the data visualized on the map. It’s abundantly clear that the naive approach to finding the mean longitude does not work.

Output

avg_lon    avg_lon_naive    avg_lat
-------    -------------    -------
176.614    36.6444          -11.4833

Weighted Center of Points

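These results can be generalized to a weighted center of points. If the column you want to weight by is \(w\), and \(\lambda_i\) is the longitude of the \(i\)-th entry, then we get

\[\bar{\lambda}_w = \operatorname{atan2}\left(\frac{\sum_i w_i \sin(\lambda_i)}{\sum_i w_i}, \frac{\sum_i w_i \cos(\lambda_i)}{\sum_i w_i}\right)\]

Since tangent takes the ratio of the arguments, the denominators cancel (as long as the total weight is positive), and our equation simplifies in a way that makes it easier for the computer!

\[\bar{\lambda}_w = \operatorname{atan2}\left(\sum_i w_i \sin(\lambda_i), \sum_i w_i \cos(\lambda_i)\right)\]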

In SQL, this equation would look like:

SELECT 
  180 * atan2(s.zeta_w,s.xi_w) / pi() AS avg_w_lon,
  s.avg_w_lon_naive AS avg_w_lon_naive,
  s.avg_w_lat AS avg_w_lat
FROM 
  (
  SELECT 
    sum(w * sin(pi() * lon / 180)) AS zeta_w, 
    sum(w * cos(pi() * lon / 180)) AS xi_w,
    sum(w * lon) / sum(w) AS avg_w_lon_naive,
    sum(w * lat) / sum(w) AS avg_w_lat
  FROM pacific_islands
  ) AS s
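For readers who want to try the weighted version outside of SQL, here is a minimal Python sketch of the same longitude calculation. The function name `weighted_mean_longitude` and the weights are hypothetical, chosen only to show the effect; the longitudes come from the data table.

```python
import math


def weighted_mean_longitude(lons_deg, weights):
    """Weighted mean longitude in degrees, computed on the unit circle."""
    # Weighted sums of sine and cosine; no division by sum(weights) is needed
    # because atan2 only depends on the direction of (xi_w, zeta_w).
    zeta_w = sum(w * math.sin(math.radians(lon)) for lon, w in zip(lons_deg, weights))
    xi_w = sum(w * math.cos(math.radians(lon)) for lon, w in zip(lons_deg, weights))
    return math.degrees(math.atan2(zeta_w, xi_w))


lons = [-175.2, 179.2167]   # Tonga, Tuvalu
weights = [3.0, 1.0]        # hypothetical weights favoring Tonga

equal = weighted_mean_longitude(lons, [1.0, 1.0])
skewed = weighted_mean_longitude(lons, weights)

print(equal)    # matches the unweighted avg_lon, about -177.992
print(skewed)   # pulled toward Tonga's longitude
```

With equal weights the result reduces to the unweighted center, and scaling every weight by the same positive factor leaves the answer unchanged.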

Pro Tip: If you don’t have an explicit latitude or longitude column, you can use the information directly from the_geom by replacing lat with ST_Y(the_geom) and lon with ST_X(the_geom).

We’re all about open data here at CARTO. If you want to clone the map and data, see them both here. Happy mapping!

About the author

Andy Eschbacher is a data scientist at CARTO, where he integrates data science solutions into CARTO's infrastructure, solves spatial data science problems for clients, and builds out tools to better enable people working at the intersection of data science and GIS.
