Let’s imagine the following scenario…
There are five geostationary satellites around the earth and you need your map to automatically pan to the center of their locations to know the rough region of their coverage. On a number line this isn’t a very hard problem since it’s like finding a center of mass or mean for some arbitrary quantity \(q\):
\[ \bar{q} = \frac{1}{N} \sum_{i=1}^{N} q_i \]
Latitude works this way, but longitude is different.
What makes it difficult for longitude is that points that are actually very close can be very far apart in the coordinate system we commonly use. Take the following example: Tonga is located at 175.2° W and Tuvalu is at 179.2° E. How far apart are they? Well, it should be 5.6° of longitude since they are 4.8° and 0.8° from 180° W/E, respectively. But since we normally view longitude as a number line [-180°, 180°] where East is positive and West is negative, these two points are \(179.2^\circ - (-175.2^\circ) = 354.4^\circ\) apart!
Which answer is correct? Well, they both are, since one measures the short way around the globe and the other the long way, but this can cause some annoying problems if we were to devise a solution to finding the center of points on a map. In our example with Tonga and Tuvalu, the first solution would be well zoomed into the islands, whereas the second would be zoomed out enough to encompass 354.4° of the globe.
Still working with Tonga and Tuvalu, what would be the mean location of these two countries? Mean is sum / number of points, so we get
\[ \frac{179.2^\circ + (-175.2^\circ)}{2} = 2.0^\circ \]
or 2° to the east of the prime meridian. That seems odd because that’s on the other side of the world.
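To make the arithmetic above concrete, here is a minimal Python sketch (not the SQL used later in this post). The function names `naive_separation` and `wrapped_separation` are made up for illustration; the longitudes are the Tonga and Tuvalu values from the text.

```python
# Longitudes in degrees on the usual [-180, 180] number line (east positive).
tonga_lon = -175.2   # 175.2° W
tuvalu_lon = 179.2   # 179.2° E


def naive_separation(lon_a, lon_b):
    """Distance along the number line, ignoring that -180° and +180° coincide."""
    return abs(lon_a - lon_b)


def wrapped_separation(lon_a, lon_b):
    """Shortest angular separation, allowing the path to cross the ±180° line."""
    diff = abs(lon_a - lon_b) % 360.0
    return min(diff, 360.0 - diff)


# Straight average of the two longitudes, as in the text.
naive_mean = (tonga_lon + tuvalu_lon) / 2.0

print(round(naive_separation(tonga_lon, tuvalu_lon), 1))    # 354.4
print(round(wrapped_separation(tonga_lon, tuvalu_lon), 1))  # 5.6
print(round(naive_mean, 1))                                 # 2.0
```

The values are rounded only to hide floating-point noise. The two separations always add up to 360°, which is why both answers are "correct" in their own way.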
Since we are getting results that seem a little nonsensical, there must be a clearer solution to our problem.
The problem here lies in viewing longitude as a number line where the end points are far apart, when in fact the end points represent the same physical location. One way to get around this difficulty involves mapping the line segment to a two-dimensional ring of radius one (a unit circle). This is convenient because now -180° and +180° lie at the same point in this new two-dimensional coordinate system: \((-1,0)\).
Below is an image of the satellite problem mentioned above. Five satellites are arranged around the earth. The coordinate system on the left represents the longitude as we normally measure it; the coordinate system on the right is the transformed longitude with degrees at appropriate locations to show the mapping. As you can easily see on the map, taking the normal mean of longitude (the red ×) gives a value that’s way off compared to the correct value (the gold star). If the coordinate system were just a number line without periodic boundary values, though, you could balance the line on your finger at the red ×.
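The transformation that helps us to find the gold star is the following:
\[ \lambda_i \rightarrow (x_i, y_i) = (\cos\lambda_i, \sin\lambda_i) \]
where \(\lambda_i\) is a longitude and the index \(i\) means that it is the \(i\)-th entry in the data table (of \(N\) entries).
To find our center of points, we now need to average over all the \(x_i\) and \(y_i\) (which we’ll call \(\bar{x}\) and \(\bar{y}\)), and then find the angle that the point \((\bar{x}, \bar{y})\) makes with the positive \(x\)-axis. The angle can be calculated by finding the inverse tangent. To be careful about which quadrant is chosen, we will use atan2, the two-argument inverse tangent that preserves the signs of the inputs:
\[ \bar{\lambda} = \operatorname{atan2}(\bar{y}, \bar{x}) \]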
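As a rough illustration of the steps just described, here is a short Python sketch (again, not the CARTO SQL itself). It maps each longitude to the point (cos, sin) on the unit circle, averages the two coordinates, and recovers the angle with `math.atan2`. The function name `circular_mean_lon` and the variable names are assumptions made for this example.

```python
import math


def circular_mean_lon(lons_deg):
    """Mean longitude in degrees, computed on the unit circle.

    Each longitude is mapped to (cos, sin), the coordinates are averaged,
    and the angle of the averaged point is returned in (-180, 180].
    """
    n = len(lons_deg)
    x_bar = sum(math.cos(math.radians(lon)) for lon in lons_deg) / n
    y_bar = sum(math.sin(math.radians(lon)) for lon in lons_deg) / n
    # atan2 takes (y, x) and keeps track of the quadrant for us.
    return math.degrees(math.atan2(y_bar, x_bar))


# Tonga (175.2° W) and Tuvalu (179.2° E)
print(round(circular_mean_lon([-175.2, 179.2]), 1))  # -178.0, i.e. 178° W
```

One caveat: if the points are spread evenly around the globe, the averaged point lands at (or very near) the origin and the angle is no longer meaningful, so a real implementation may want to check for that case.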
This calculation for longitude, and the one mentioned above for latitude, can be easily calculated in CARTO by placing the following SQL command in the SQL editor. The values calculated are:
avg_lon based on the discussion just above,
avg_lon_naive based on a straight average of longitude, and
avg_lat the average latitude.
Aside: If you are performing this calculation over a large number of points, the SQL function sum() will probably be faster than avg(). Since tangent is defined as opposite / adjacent, any common positive multiple cancels. This means that the denominator used in the averages cancels for both the averaged cosine and sine terms (\(\bar{x}\) and \(\bar{y}\)), so avg() can be replaced by sum() for those calculations.
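A quick way to convince yourself of this is to compute the angle both ways. The Python sketch below does so for a handful of made-up longitudes (they are not from the dataset used in this post), and the variable names are illustrative only.

```python
import math

# Hypothetical sample longitudes in degrees, clustered around the ±180° line.
lons_deg = [-175.2, 179.2, 170.0, -160.5, 178.1]

xs = [math.cos(math.radians(lon)) for lon in lons_deg]
ys = [math.sin(math.radians(lon)) for lon in lons_deg]
n = len(lons_deg)

# Angle from the averaged coordinates (the avg() version).
angle_from_avg = math.degrees(math.atan2(sum(ys) / n, sum(xs) / n))

# Angle from the plain sums (the sum() version); dividing both
# arguments by the same positive n does not change the direction.
angle_from_sum = math.degrees(math.atan2(sum(ys), sum(xs)))

print(math.isclose(angle_from_avg, angle_from_sum))  # True
```

Whether sum() is actually faster than avg() in a given database is something to measure rather than assume; the sketch only shows that the two give the same angle.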
The result for avg_lon makes sense because it is in between Tuvalu and Tonga. avg_lon_naive is just wrong. Notice that we have to convert degrees to radians and then back in the process. Also notice that the two calculated longitudes are 180° apart for the two point case. For a larger dataset, the avg_lon_naive result becomes more nonsensical.
Using the dataset linked at the bottom, you can see the data visualized on the map. It’s abundantly clear that the naive approach to finding the mean longitude does not work.
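These results can be generalized to a weighted center of points. If the column you want to weight by is \(w\), with \(w_i\) the weight of the \(i\)-th point and \(\lambda_i\) its longitude, then we get
\[ \bar{\lambda} = \operatorname{atan2}\left( \frac{\sum_i w_i \sin\lambda_i}{\sum_i w_i},\ \frac{\sum_i w_i \cos\lambda_i}{\sum_i w_i} \right) \]
Since tangent takes the ratio of the arguments, the denominators cancel (as long as the weights sum to a positive number), and our equation simplifies in a way that makes it easier for the computer:
\[ \bar{\lambda} = \operatorname{atan2}\left( \sum_i w_i \sin\lambda_i,\ \sum_i w_i \cos\lambda_i \right) \]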
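For readers who want to experiment outside the SQL editor, here is a small Python sketch of the weighted version. It uses the simplified form with plain weighted sums and assumes the weights are non-negative and sum to a positive number. The function name `weighted_circular_mean_lon` is hypothetical.

```python
import math


def weighted_circular_mean_lon(lons_deg, weights):
    """Weighted mean longitude in degrees on the unit circle.

    Assumes non-negative weights with a positive total, so the common
    denominator can be dropped from both atan2 arguments.
    """
    y_sum = sum(w * math.sin(math.radians(lon))
                for lon, w in zip(lons_deg, weights))
    x_sum = sum(w * math.cos(math.radians(lon))
                for lon, w in zip(lons_deg, weights))
    return math.degrees(math.atan2(y_sum, x_sum))


lons = [-175.2, 179.2]  # Tonga, Tuvalu

# Equal weights reproduce the unweighted center.
print(round(weighted_circular_mean_lon(lons, [1, 1]), 1))  # -178.0

# Weighting Tonga more heavily pulls the center toward 175.2° W.
print(weighted_circular_mean_lon(lons, [3, 1]))
```

With all the weight on a single point, the result is simply that point's longitude, which is a handy sanity check.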
In SQL, this equation would look like:
Pro Tip: If you don’t have an explicit latitude or longitude column, you can use the information directly from the_geom by replacing
We’re all about open data here at CARTO. If you want to clone the map and data, see them both here. Happy mapping!