From Imagery to Insight: Google AlphaEarth Foundations in CARTO
Sifting through raw satellite imagery to find insights can feel like hunting for a needle in a haystack - except the haystack is the entire planet. For analysts and data teams, the challenge isn’t a lack of data, but the complexity of turning it into something usable for real-world decisions.
That’s why we’re excited to share another important milestone in providing access to cutting-edge geospatial AI capabilities for your analytical workflows and spatial models built in CARTO. A couple of months ago we launched the Geospatial Foundation Models Extension Package for CARTO Workflows, our low-code tool for automating spatial analysis pipelines. This enables our users to leverage the power of industry-leading geospatial foundation models by loading their spatial embeddings into analytical and data processing workflows.
Today, we’re taking it a step further by adding Google DeepMind’s recently launched AlphaEarth Foundations [1] to the list of available geospatial foundation models. Built from petabytes of multimodal Earth observation data, AlphaEarth delivers annual, 64-dimensional “satellite embeddings” that capture the planet’s land and coastal waters at 10-meter resolution, distilling complex optical, radar, and climate inputs into a compact, analysis-ready format.
This new addition means that you can now integrate powerful, analysis-ready satellite embeddings of Earth’s surface directly into your spatial pipelines with the ease of CARTO’s low-code tools. Keep reading to learn how!
The Satellite Embeddings dataset by Google Earth Engine offers condensed, high-dimensional numerical representations of the Earth’s surface derived from multi-source remote sensing data. Instead of manually handling dozens of raw bands, users can work with streamlined representations that inherently capture key environmental patterns, unlocking powerful new ways to explore and compare the Earth’s surface.

Specifically, Google’s Satellite Embeddings dataset provides:
- Global coverage - embeddings available for every 10-meter land pixel on Earth.
- Annual temporal summaries - capturing year-round surface condition patterns, from 2017 to 2024.
- 64-dimensional vectors - abstract representations that capture relationships between multiple spectral, temporal, and environmental features.
These embeddings support a wide range of applications, from identifying high-risk areas and monitoring environmental changes to tracking urban expansion and enhancing disaster response by locating similarly vulnerable regions.
With this new component available in CARTO Workflows, you can load the embeddings directly into your data pipelines - no need to manually manage complex API requests or storage! For any chosen set of geometries and time range, the system retrieves the summarized values for each of the 64 embedding dimensions from Google Earth Engine.

These features can be combined with your own geospatial datasets for advanced analytics, or be integrated into your AI and machine learning workflows for tasks like classification, clustering, and predictive modeling.
The addition of Google DeepMind’s AlphaEarth Foundations to CARTO Workflows opens up new possibilities for organizations that rely on timely, high-resolution environmental intelligence. Here are just a few examples of how this data can be used:
- Insurance – Refine risk models by identifying environmental conditions linked to flood, wildfire, or storm damage, and monitor changes in high-risk areas to guide underwriting and claims assessment.
- Telecommunications – Optimize network planning by mapping terrain and vegetation changes that impact signal propagation to identify optimal sites for towers or infrastructure expansion.
- Financial Services – Assess climate-related risks in investment portfolios or monitor supply-chain-critical areas for environmental change.
- Logistics & Transportation – Predict and mitigate disruption from weather or terrain changes, and support route planning and infrastructure maintenance with up-to-date surface condition data.
- Public Sector – Support disaster preparedness, land-use planning, and conservation efforts with consistent, scalable monitoring of ecosystems and urban growth.
Let’s do a deep dive into one of these examples.
To see these embeddings in action, we designed a practical workflow that identifies ZCTAs (ZIP Code Tabulation Areas) in Arizona with environmental and surface patterns similar to those of a wildfire-prone ZCTA in California, which may indicate higher fire risk. When working with embeddings, this problem is known as vector (similarity) search.
We used the following datasets:
- California and Arizona ZIP Code Tabulation Areas (ZCTA codes), publicly available in BigQuery under bigquery-public-data.utility_us.zipcode_area
- California Wildfire perimeters from the California Department of Forestry and Fire Protection
California is well known for its susceptibility to wildfires, so we selected a representative California ZCTA with a well-documented history of wildfires. Next, we extracted the 2024 satellite embeddings for all required ZCTAs, summarizing each set of embeddings by calculating the mean vector. With these summarized 64-dimensional feature vectors, we created arrays of embedding features for each ZCTA to later perform a similarity search using BigQuery’s VECTOR_SEARCH function.
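To make the summarization step concrete, here is a minimal Python sketch of averaging per-pixel embeddings into one mean vector per ZCTA. It is an illustration only, not the implementation behind the Workflows component: the function name `mean_embedding_by_area`, the ZCTA identifiers, and the 3-dimensional toy vectors (standing in for the real 64 dimensions) are all assumptions.

```python
def mean_embedding_by_area(pixel_rows):
    """Average per-pixel embedding vectors into one mean vector per area.

    pixel_rows: iterable of (area_id, vector) pairs, where vector is a list
    of floats. Hypothetical input shape, chosen only for this illustration.
    """
    sums = {}
    counts = {}
    for area_id, vector in pixel_rows:
        if area_id not in sums:
            sums[area_id] = [0.0] * len(vector)
            counts[area_id] = 0
        if len(vector) != len(sums[area_id]):
            raise ValueError("all embeddings for an area must share one dimension")
        for i, value in enumerate(vector):
            sums[area_id][i] += value
        counts[area_id] += 1
    # Divide each accumulated component by the number of pixels in the area.
    return {
        area_id: [component / counts[area_id] for component in total]
        for area_id, total in sums.items()
    }


# Toy data: 3 dimensions instead of 64, made-up area identifiers.
rows = [
    ("zcta_a", [1.0, 0.0, 0.0]),
    ("zcta_a", [0.0, 1.0, 0.0]),
    ("zcta_b", [0.0, 0.0, 1.0]),
]
means = mean_embedding_by_area(rows)
print(means["zcta_a"])  # [0.5, 0.5, 0.0]
```

Each resulting list plays the role of the per-ZCTA array of embedding features that is later passed to the similarity search.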

The resulting distance metric quantifies the similarity between the base vector (the selected California ZCTA) and the query vectors (all Arizona ZCTAs) based on their satellite embeddings. Lower distance values indicate areas in Arizona that share closely matching environmental and surface patterns with the reference region, potentially highlighting zones with similar wildfire risk profiles. This enables targeted analysis and decision-making by pinpointing areas that may require closer monitoring or preventive measures.
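Conceptually, this search ranks every candidate vector by its distance to the reference vector. The sketch below does that by brute force in plain Python. It assumes cosine distance, since this post does not state which distance type was used in VECTOR_SEARCH, and the function names, identifiers, and 3-dimensional toy vectors are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_similarity(reference, candidates, top_k=None):
    """Sort candidate areas by ascending distance to the reference vector.

    candidates: dict mapping an area id to its mean embedding vector.
    Returns a list of (area_id, distance) pairs, closest first.
    """
    scored = sorted(
        (cosine_distance(reference, vector), area_id)
        for area_id, vector in candidates.items()
    )
    ranked = [(area_id, distance) for distance, area_id in scored]
    return ranked if top_k is None else ranked[:top_k]


# Toy data: a made-up reference vector and three made-up candidate areas.
reference_vector = [1.0, 0.0, 0.0]
arizona_vectors = {
    "az_1": [0.9, 0.1, 0.0],
    "az_2": [0.0, 1.0, 0.0],
    "az_3": [0.5, 0.5, 0.0],
}
ranked = rank_by_similarity(reference_vector, arizona_vectors)
for area_id, distance in ranked:
    print(area_id, round(distance, 4))
```

A brute-force scan like this is fine for a few hundred ZCTAs; for much larger tables, an indexed or approximate search is the usual choice.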
To validate the results, we compared the obtained similarities with the burn probability estimates provided by the Wildfire Risk to Communities model [2]. We first aggregated the raster values to the ZCTA level and then computed Spearman's rank correlation coefficient between the two, obtaining a value of 0.748.
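For readers less familiar with Spearman's coefficient, it is the Pearson correlation computed on the ranks of the two variables, so it measures how well one ordering matches the other. The sketch below is a plain-Python illustration with made-up numbers, not the data or code used for the result above. Note that it correlates similarity scores, where higher means more alike; correlating raw distances instead would flip the sign.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-ZCTA values (hypothetical): burn probability and similarity score.
burn_probability = [0.01, 0.04, 0.02, 0.08]
similarity_score = [0.20, 0.70, 0.30, 0.90]
rho = spearman(burn_probability, similarity_score)
print(round(rho, 3))  # 1.0, because both lists rank the areas identically
```

Because only ranks matter, the coefficient is insensitive to the different scales of the two variables, which makes it a reasonable fit for comparing a probability with a similarity score.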

This correlation is visible when comparing the two maps below. The left map displays aggregated burn probability values at the ZCTA level, highlighting current high-risk areas based on historical and modeled fire occurrence. The right map shows similarity scores derived from our embedding-based vector search, using a high-risk region in California as the reference.
As seen, ZCTAs with elevated burn probability often align with ZCTAs identified as environmentally similar to the Californian reference region, suggesting that this approach has strong potential as a tool for identifying areas associated with elevated wildfire risk, complementing traditional risk assessments. Notably, the similarity map does not flag certain southern areas with high estimated burn probability, suggesting that wildfires there may arise under different environmental conditions. In fact, the reference Californian ZCTA is characterized by dense tree cover, whereas southern Arizona is dominated by shrubs, highlighting how differences in vegetation shape regional wildfire behavior.
By bringing Google’s Satellite Embeddings into CARTO, we’re bridging the gap between raw satellite imagery and actionable environmental insights. This unlocks new opportunities for environmental monitoring, climate risk analysis, urban planning, and biodiversity research.
Whether you’re mapping wildfire risks, tracking ecosystem changes, or identifying new areas for conservation, the recently launched satellite embeddings open the door to richer, more informed decision-making. Want to try it yourself? Start your free 14-day trial here!
References
[1] Brown, Christopher F.; Kazmierski, Michal R.; Pasquarella, Valerie J.; Rucklidge, William J.; Samsikova, Masha; Zhang, Chenhui; Shelhamer, Evan; Lahera, Estefania; Wiles, Olivia; Ilyushchenko, Simon; Gorelick, Noel; Zhang, Lihui Lydia; Alj, Sophia; Schechter, Emily; Askay, Sean; Guinan, Oliver; Moore, Rebecca; Boukouvalas, Alexis; Kohli, Pushmeet. 2025. AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data. arXiv. https://doi.org/10.48550/arXiv.2507.22291
[2] Scott, Joe H.; Dillon, Gregory K.; Jaffe, Melissa R.; Vogler, Kevin C.; Olszewski, Julia H.; Callahan, Michael N.; Karau, Eva C.; Lazarz, Mitchell T.; Short, Karen C.; Riley, Karin L.; Finney, Mark A.; Grenfell, Isaac C. 2024. Wildfire Risk to Communities: Spatial datasets of landscape-wide wildfire risk components for the United States. 2nd Edition. Fort Collins, CO: Forest Service Research Data Archive. https://doi.org/10.2737/RDS-2020-0016-2