Aug 19, 2025

Lucía García-Duarte

and

CARTO Contributor

From Imagery to Insight: Google AlphaEarth Foundations in CARTO

Spatial Data Science




Sifting through raw satellite imagery to find insights can feel like hunting for a needle in a haystack, except the haystack is the entire planet. For analysts and data teams, the challenge isn’t a lack of data, but the complexity of turning it into something usable for real-world decisions.

That’s why we’re excited to share another important milestone in providing access to cutting-edge geospatial AI capabilities for your analytical workflows and spatial models built in CARTO. A couple of months ago we launched the Geospatial Foundation Models Extension Package for CARTO Workflows, our low-code tool for automating spatial analysis pipelines. This enables our users to leverage the power of industry-leading geospatial foundation models by loading their spatial embeddings into analytical and data processing workflows.

Today, we’re taking it a step further by adding Google DeepMind’s recently launched AlphaEarth Foundations [1] to the list of available geospatial foundation models. Built from petabytes of multimodal Earth observation data, AlphaEarth delivers annual, 64-dimensional “satellite embeddings” that capture the planet’s land and coastal waters at 10-meter resolution, distilling complex optical, radar, and climate inputs into a compact, analysis-ready format.

This new addition means that you can now integrate powerful, analysis-ready satellite embeddings of Earth’s surface directly into your spatial pipelines with the ease of CARTO’s low-code tools. Keep reading to learn how!

What Are Satellite Embeddings?

The Satellite Embeddings dataset by Google Earth Engine offers condensed, high-dimensional numerical representations of the Earth’s surface derived from multi-source remote sensing data. Instead of manually handling dozens of raw bands, users can work with streamlined representations that inherently capture key environmental patterns, unlocking powerful new ways to explore and compare the Earth’s surface.

Figure: Visualizing AlphaEarth’s Satellite Embeddings. Brighter areas in the similarity layer indicate greater year-to-year change, enabling rapid detection of environmental shifts.

Specifically, Google’s Satellite Embeddings dataset provides:

Global coverage - embeddings available for every 10-meter land pixel on Earth.
Annual temporal summaries - capturing year-round surface condition patterns, from 2017 to 2024.
64-dimensional vectors - abstract representations that capture relationships between multiple spectral, temporal, and environmental features.

These embeddings support a wide range of applications, from identifying high-risk areas and monitoring environmental changes to tracking urban expansion and enhancing disaster response by locating similarly vulnerable regions.
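Because the embeddings are annual, one simple way to monitor change is to compare the vector for the same location across two years. The snippet below is a minimal sketch of that idea in plain Python, not the method used to produce the similarity layer described above. The 2-dimensional toy vectors stand in for the real 64-dimensional embeddings, and the function names, location IDs, and the 0.5 threshold are all hypothetical.

```python
import math


def unit(vector):
    """Scale a vector to unit length so the dot product acts as a similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def change_scores(year_a, year_b):
    """Return {location: 1 - dot(unit(a), unit(b))} for locations in both years.

    A score near 0 means the two yearly embeddings point the same way;
    larger scores mean the surface representation changed more.
    """
    scores = {}
    for location in year_a.keys() & year_b.keys():
        a = unit(year_a[location])
        b = unit(year_b[location])
        scores[location] = 1.0 - sum(x * y for x, y in zip(a, b))
    return scores


# Toy data: hypothetical locations with made-up 2-dimensional embeddings.
emb_2023 = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}
emb_2024 = {"p1": [1.0, 0.0], "p2": [1.0, 0.0]}

scores = change_scores(emb_2023, emb_2024)
# Hypothetical threshold: flag locations whose score exceeds 0.5.
changed = sorted(loc for loc, score in scores.items() if score > 0.5)
print(changed)
```

In this toy case only `p2` is flagged, since its vector rotates by 90 degrees between the two years while `p1` stays the same.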

Integrating Satellite Embeddings into CARTO Workflows

With this new component available in CARTO Workflows, you can load the embeddings directly into your data pipelines - no need to manually manage complex API requests or storage! For any chosen set of geometries and time range, the system retrieves the summarized values for each of the 64 embedding dimensions from Google Earth Engine.

These features can be combined with your own geospatial datasets for advanced analytics, or be integrated into your AI and machine learning workflows for tasks like classification, clustering, and predictive modeling.

AlphaEarth in action: how you can use this data

The addition of Google DeepMind’s AlphaEarth Foundations to CARTO Workflows opens up new possibilities for organizations that rely on timely, high-resolution environmental intelligence. Here are just a few examples of how this data can be used:

Insurance – Refine risk models by identifying environmental conditions linked to flood, wildfire, or storm damage, and monitor changes in high-risk areas to guide underwriting and claims assessment.
Telecommunications – Optimize network planning by mapping terrain and vegetation changes that impact signal propagation to identify optimal sites for towers or infrastructure expansion.
Financial Services – Assess climate-related risks in investment portfolios or monitor supply-chain-critical areas for environmental change.
Logistics & Transportation – Predict and mitigate disruption from weather or terrain changes, and support route planning and infrastructure maintenance with up-to-date surface condition data.
Public Sector – Support disaster preparedness, land-use planning, and conservation efforts with consistent, scalable monitoring of ecosystems and urban growth.

Let’s do a deep dive into one of these examples.

Use Case: Detecting Fire-Prone Areas through vector search

To see these embeddings in action, we designed a practical workflow that identifies ZCTAs (ZIP Code Tabulation Areas) in Arizona with environmental and surface patterns similar to those of a wildfire-prone ZCTA in California, potentially indicating higher fire risk. When working with embeddings, this problem is known as vector (similarity) search.

We used the following datasets:

California and Arizona ZIP Code Tabulation Areas (ZCTA codes), publicly available in BigQuery under bigquery-public-data.utility_us.zipcode_area
California Wildfire perimeters from the California Department of Forestry and Fire Protection

California is well known for its susceptibility to wildfires, so we selected a representative ZCTA in California with a well-documented history of wildfires. Next, we extracted the 2024 satellite embeddings for all required ZCTAs and summarized each ZCTA’s embeddings by calculating their mean vector. We then stored each summarized 64-dimensional feature vector as an array per ZCTA, ready for a similarity search using BigQuery’s VECTOR_SEARCH function.
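The summarization step can be pictured as mean pooling: every pixel embedding that falls inside a ZCTA is averaged, dimension by dimension, into a single vector. The sketch below illustrates this in plain Python under simplifying assumptions. It is not the CARTO Workflows component or the Earth Engine reduction itself; the function name and zone IDs are hypothetical, and 3-dimensional toy vectors stand in for the 64-dimensional embeddings.

```python
def mean_embedding_per_zone(pixel_rows):
    """Average per-pixel embedding vectors into one vector per zone.

    pixel_rows: iterable of (zone_id, embedding) pairs, where each embedding
    is a sequence of floats of the same length.
    """
    sums = {}
    counts = {}
    for zone_id, embedding in pixel_rows:
        if zone_id not in sums:
            sums[zone_id] = [0.0] * len(embedding)
            counts[zone_id] = 0
        elif len(embedding) != len(sums[zone_id]):
            raise ValueError("all embeddings must have the same dimension")
        # Accumulate each dimension separately.
        for i, value in enumerate(embedding):
            sums[zone_id][i] += value
        counts[zone_id] += 1
    # Divide the per-dimension sums by the number of pixels in each zone.
    return {
        zone_id: [total / counts[zone_id] for total in zone_sums]
        for zone_id, zone_sums in sums.items()
    }


# Toy data: hypothetical zone IDs with made-up 3-dimensional pixel embeddings.
pixels = [
    ("zcta_a", [0.25, 0.5, 0.75]),
    ("zcta_a", [0.75, 0.0, 0.25]),
    ("zcta_b", [1.0, 1.0, 1.0]),
]
zone_vectors = mean_embedding_per_zone(pixels)
print(zone_vectors["zcta_a"])  # [0.5, 0.25, 0.5]
```

Each resulting list plays the role of the per-ZCTA array of embedding features that is later passed to the similarity search.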


The obtained distance metric quantifies the similarity between the base vector (California’s selected ZCTA) and query vectors (all Arizona’s ZCTAs) based on their satellite embeddings. Lower distance values indicate areas in Arizona that share closely matching environmental and surface patterns with the reference region, potentially highlighting zones with similar wildfire risk profiles. This enables targeted analysis and decision-making by pinpointing areas that may require closer monitoring or preventive measures.
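To make the ranking logic concrete, here is a small plain-Python stand-in for the similarity search. It is only a sketch: the analysis itself used BigQuery’s VECTOR_SEARCH, and since the distance type used there is not stated here, cosine distance is an assumption. The function names, zone IDs, and 3-dimensional toy vectors are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_similarity(reference, candidates):
    """Return (zone_id, distance) pairs sorted from most to least similar."""
    distances = [
        (zone_id, cosine_distance(reference, vector))
        for zone_id, vector in candidates.items()
    ]
    return sorted(distances, key=lambda pair: pair[1])


# Toy data: one reference vector and three hypothetical candidate zones.
reference = [1.0, 0.0, 0.0]
candidates = {
    "az_1": [0.0, 1.0, 0.0],
    "az_2": [2.0, 0.0, 0.0],
    "az_3": [1.0, 1.0, 0.0],
}

ranking = rank_by_similarity(reference, candidates)
for zone_id, distance in ranking:
    print(zone_id, round(distance, 4))
```

Here `az_2` ranks first because it points in the same direction as the reference, even though its magnitude differs; cosine distance ignores vector length, which is one reason it is a common choice for comparing embeddings.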

To validate the results, we compared the obtained similarities with the burn probability estimates provided by the Wildfire Risk to Communities model [2]. We first summarized the raster values by ZCTA and then computed Spearman's rank correlation coefficient between the two, obtaining a value of 0.748.
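For readers less familiar with Spearman's rank correlation, it is the Pearson correlation computed on the ranks of the two variables, so it measures how well the ordering of one matches the ordering of the other. The following is a minimal, self-contained sketch of that computation. The four value pairs are invented for illustration and are unrelated to the 0.748 result; how distances were converted into similarity scores is not specified here, so the similarity values are simply hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Invented per-zone values: similarity scores and burn probabilities.
similarity = [0.9, 0.7, 0.4, 0.1]
burn_probability = [0.08, 0.05, 0.06, 0.01]

rho = spearman(similarity, burn_probability)
print(round(rho, 3))  # 0.8
```

Because it works on ranks, the coefficient is insensitive to the scale of either variable, which suits a comparison between a unitless similarity score and a modeled probability.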

This correlation is visible when comparing the two maps below. The left map displays aggregated burn probability values at the ZCTA level, highlighting current high-risk areas based on historical and modeled fire occurrence. The right map shows similarity scores derived from our embedding-based vector search, using a high-risk region in California as the reference.

As seen, ZCTAs with elevated burn probability often align with ZCTAs identified as environmentally similar to the Californian reference region, suggesting that this approach has strong potential as a tool for identifying areas associated with elevated wildfire risk, complementing traditional risk assessments. Notably, the similarity map does not flag certain southern areas with high estimated burn probability, suggesting that wildfires there may arise under different environmental conditions. In fact, the reference Californian ZCTA is characterized by dense tree cover, whereas southern Arizona is dominated by shrubs, highlighting how differences in vegetation shape regional wildfire behavior.

A New Era of Geospatial Analysis

By bringing Google’s Satellite Embeddings into CARTO, we’re bridging the gap between raw satellite imagery and actionable environmental insights. This unlocks new opportunities for environmental monitoring, climate risk analysis, urban planning, and biodiversity research.

Whether you’re mapping wildfire risks, tracking ecosystem changes, or identifying new areas for conservation, the recently launched satellite embeddings open the door to richer, more informed decision-making. Want to try it yourself? Start your free 14-day trial here!


References

[1] Brown, Christopher F.; Kazmierski, Michal R.; Pasquarella, Valerie J.; Rucklidge, William J.; Samsikova, Masha; Zhang, Chenhui; Shelhamer, Evan; Lahera, Estefania; Wiles, Olivia; Ilyushchenko, Simon; Gorelick, Noel; Zhang, Lihui Lydia; Alj, Sophia; Schechter, Emily; Askay, Sean; Guinan, Oliver; Moore, Rebecca; Boukouvalas, Alexis; Kohli, Pushmeet. 2025. AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data. arXiv. https://doi.org/10.48550/arXiv.2507.22291

[2] Scott, Joe H.; Dillon, Gregory K.; Jaffe, Melissa R.; Vogler, Kevin C.; Olszewski, Julia H.; Callahan, Michael N.; Karau, Eva C.; Lazarz, Mitchell T.; Short, Karen C.; Riley, Karin L.; Finney, Mark A.; Grenfell, Isaac C. 2024. Wildfire Risk to Communities: Spatial datasets of landscape-wide wildfire risk components for the United States. 2nd Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2020-0016-2


About the author

Lucía is a Data Scientist at CARTO, where she develops spatial statistics and machine learning solutions that unveil the hidden potential of location-based data, enabling organizations to maximize the value of geospatial information.


