Geospatial Foundation Models

Geospatial foundation models are AI models pre-trained on vast amounts of data that can be adapted to a broad range of tasks. In the geospatial world, these models are trained on diverse datasets like satellite imagery, maps, and online behavior to understand the physical and human world.

A remote sensing foundation model, for example, is trained on a huge corpus of satellite imagery and can be used for tasks like object detection or semantic segmentation without the need for extensive, task-specific training.

The Population Dynamics Foundation Model (PDFM)

The Population Dynamics Foundation Model (PDFM) is a geospatial foundation model developed by Google Research. It generates a compact, location-specific "fingerprint", or embedding, that captures a region's population characteristics and how they interact with the environment. To do this, the PDFM ingests large amounts of regional data, including:

Aggregated search trends: Normalized frequency of the top 1,000 topics and entities.
Maps POIs and mobility: Information on points of interest (POIs) like cafes, parks, and schools, and mobility data showing how people move around them.
Weather and air quality: Various metrics related to weather and air quality for each location.

These features are then processed by a graph neural network, which outputs a 330-feature embedding for each location, such as a zip code or county.

Why CARTO & Google

CARTO acts as a bridge between Google’s foundation models research and real-world analytics, with the goal of making these models accessible to everyone, not just ML experts.

Through CARTO Workflows, a no-code tool, complex AI models are translated into user-friendly components.

How CARTO integrates PDFM embeddings

CARTO Workflows makes foundation models usable for practitioners through:

Seamless integration: PDFM and AlphaEarth embeddings are available as components within the CARTO Workflows canvas.
No ML expertise required: You can use a simple drag-and-drop interface to enrich your spatial data pipelines.
Combine with your data: Easily link embeddings to your own business, environmental, or demographic datasets to create more comprehensive analyses.
Scalable and reproducible: The platform runs on top of your data warehouse (like BigQuery), allowing for scalable analyses on large datasets.
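
To illustrate the "combine with your data" point, here is a minimal Python sketch of the key-based join that links embeddings to a business table. It is not CARTO's implementation: the column names (zip_code, embedding, store_count), the function enrich_with_embeddings, and all values are hypothetical, and the vectors are shortened to three features instead of 330 to keep the example readable.

```python
# Hypothetical embedding table: one row per zip code.
# Real PDFM embeddings have 330 features; three are used here for brevity.
embeddings = [
    {"zip_code": "50309", "embedding": [0.12, -0.40, 0.88]},
    {"zip_code": "52240", "embedding": [0.05, 0.31, -0.22]},
    {"zip_code": "50010", "embedding": [-0.47, 0.09, 0.15]},
]

# Hypothetical business table with made-up figures.
stores = [
    {"zip_code": "50309", "store_count": 14},
    {"zip_code": "52240", "store_count": 9},
    {"zip_code": "99999", "store_count": 2},  # no embedding for this key
]


def enrich_with_embeddings(rows, embedding_rows, key="zip_code"):
    """Left-join embedding vectors onto rows by key; unmatched rows get None."""
    lookup = {e[key]: e["embedding"] for e in embedding_rows}
    return [{**row, "embedding": lookup.get(row[key])} for row in rows]


enriched = enrich_with_embeddings(stores, embeddings)
for row in enriched:
    print(row["zip_code"], row["store_count"], row["embedding"])
```

In Workflows this linking happens on the canvas and runs in your data warehouse; the sketch only shows the logic of the join, including the need to handle locations that have no matching embedding.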

Benefits of using PDFM embeddings

Integrating PDFM embeddings into your spatial analysis offers significant real-world value for practitioners.

Do more with less effort

Building models from scratch with all the diverse data sources that feed the PDFM requires immense effort and computational power. By using the pre-computed embeddings, you get access to a rich representation of those data sources, and of the dynamics between them, without the heavy lifting.

Expand your toolkit

PDFM embeddings unlock analyses that were previously difficult to perform, such as similarity search, environmental matching, and fine-grained predictions that go beyond what traditional demographic data can offer.
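
As a rough illustration of similarity search, the Python sketch below ranks locations by the cosine similarity of their embedding vectors. Cosine similarity is an assumption chosen for the example, not a metric stated by Google or CARTO, and the location names, vector values, and helper functions are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, embeddings, k=2):
    """Return the k locations most similar to target, best match first."""
    target_vec = embeddings[target]
    scores = [
        (loc, cosine_similarity(target_vec, vec))
        for loc, vec in embeddings.items()
        if loc != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy three-feature vectors standing in for 330-feature PDFM embeddings.
toy_embeddings = {
    "zip_A": [1.0, 0.0, 0.5],
    "zip_B": [0.9, 0.1, 0.6],
    "zip_C": [-1.0, 0.2, 0.0],
    "zip_D": [0.0, 1.0, 0.0],
}

ranked = most_similar("zip_A", toy_embeddings, k=2)
for location, score in ranked:
    print(location, round(score, 3))
```

Here zip_B ranks first because its vector points in nearly the same direction as zip_A's. In practice the same idea can be used to find locations that resemble a known high-performing one.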

Improve decision support

The embeddings add richer signals that complement your existing spatial data. This gives you more nuanced insights into your business and supports better decisions by helping explain why a specific location behaves the way it does.

Stay ahead

The field of geospatial foundation models is evolving rapidly. By using a platform like CARTO Workflows, you can stay ahead of the curve, as new and updated models become available through the same, familiar interface.

Live demo using PDFM embeddings

Watch this live demo showcasing the use of PDFM embeddings to predict total liquor sales in Iowa at the zip code level.

Preview of CARTO Workflows using Google’s PDFM embeddings

The demo compared three different models:

A model using only traditional sociodemographic data
A model using only the PDFM embeddings
A model combining both data sources
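
A three-way comparison like this boils down to keeping the target and the evaluation fixed while swapping the feature set. The Python sketch below shows that setup under loud assumptions: the rows are made-up numbers, not the Iowa data, and the regressor is a deliberately simple 1-nearest-neighbour model scored with leave-one-out mean absolute error, chosen only to keep the example dependency-free. It is not the model or metric used in the demo, and the names features and loo_mae are hypothetical.

```python
import math

# Made-up rows: sociodemographic features, embedding features, and a sales target.
rows = [
    {"demo": [0.20, 0.50], "emb": [0.10, 0.90, 0.30], "sales": 120.0},
    {"demo": [0.25, 0.45], "emb": [0.15, 0.80, 0.35], "sales": 130.0},
    {"demo": [0.80, 0.10], "emb": [0.70, 0.20, 0.60], "sales": 310.0},
    {"demo": [0.75, 0.20], "emb": [0.65, 0.25, 0.55], "sales": 290.0},
    {"demo": [0.50, 0.50], "emb": [0.40, 0.50, 0.10], "sales": 200.0},
]


def features(row, feature_set):
    """Select the input vector for one of the three model variants."""
    if feature_set == "demographics":
        return row["demo"]
    if feature_set == "embeddings":
        return row["emb"]
    if feature_set == "combined":
        return row["demo"] + row["emb"]  # list concatenation
    raise ValueError(f"unknown feature set: {feature_set}")


def loo_mae(rows, feature_set):
    """Leave-one-out mean absolute error of a 1-nearest-neighbour regressor."""
    errors = []
    for i, held_out in enumerate(rows):
        x = features(held_out, feature_set)
        others = [r for j, r in enumerate(rows) if j != i]
        nearest = min(others, key=lambda r: math.dist(x, features(r, feature_set)))
        errors.append(abs(nearest["sales"] - held_out["sales"]))
    return sum(errors) / len(errors)


results = {
    name: loo_mae(rows, name)
    for name in ("demographics", "embeddings", "combined")
}
for name, mae in results.items():
    print(f"{name}: leave-one-out MAE = {mae:.1f}")
```

Because the data is invented, the scores printed here say nothing about which variant performs best in practice; the point is only that all three variants are evaluated on the same target with the same procedure.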

Key findings from the demo

Visualization of product-specific sales prediction using PDFM embeddings

For predicting total liquor sales, the best-performing model was the one that combined PDFM embeddings with sociodemographic data. The embeddings added significant explanatory power, complementing the traditional data.

For predicting a specific product's sales (e.g., a specific brand of vodka), the model using only the PDFM embeddings performed best. This highlights the ability of the embeddings to capture a location's behavioral dynamics that are not apparent in static demographic data.

These results underscore that PDFM embeddings are a multi-purpose tool that can be used in different ways depending on your specific goal. Watch the full demo here.