# Explore Open Buildings with CARTO Spatial Extension for BigQuery

Google recently released the Open Buildings dataset, including 516 million building footprints covering 64% of the African continent. Each building is described as a polygon with a confidence score indicating the degree of confidence that the building identified is indeed a building. There are many potential use cases for the data, including our recent visualization and blog post assessing the damage from a volcano on La Palma in the Spanish Canary Islands.

To analyze the entire continent, we can leverage BigQuery to store the information in one table and then use standard SQL to ask the data questions. In addition, to explore the entire continent at once, we can use CARTO’s BigQuery Tiler (part of the CARTO Spatial Extension for BigQuery) to visualize all 516 million buildings on one map efficiently.

Open Buildings has hosted all of the building polygons in a series of compressed CSV files in Google Cloud Storage, totaling 46 GB. Instead of downloading the files locally and pushing them back up into BigQuery, we can use the bq command-line tool to create a table in BigQuery directly from the files in cloud storage.

Each of the compressed CSV files are stored in the path: gs://open-buildings-data/v1/polygons_s2_level_4_gzip/

We can use the following command to take each of the compressed CSV files (the *.csv.gz denotes all files ending in that extension) and create a single table in BigQuery. It will automatically determine a schema to use and create a series of jobs to build a table.

bq load --autodetect YOUR_DATASET_NAME.open_buildings_raw "gs://open-buildings-data/v1/polygons_s2_level_4_gzip/*.csv.gz"


When previewing the table in BigQuery, the schema should look like this:

## Transforming the data

We now have the data of the buildings in a single BigQuery table. However, there are some refinements we can make to make geospatial analysis faster and easier.

1. Select only the columns we need (in this case, the confidence score and the geometry).
2. Create a column containing a GEOGRAPHY type, which is required for using geospatial functions in BigQuery and CARTO’s Spatial Extension (the raw table geometry column is type string from the import of the CSV files).
3. Cluster the table by the GEOGRAPHY column, which will improve certain geospatial query performance.
create table YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_clean cluster by geom as
select confidence, ST_GeogFromText(geometry) as geom
from YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_raw


## Creating the tileset

We can create a tileset using CARTO’s Spatial Extension for BigQuery with a single SQL command. The first argument is the source table (the clean table we just created). The second argument is the desired output tileset table name. Last is an optional argument, which could be null, causing the default parameters to be used. However, setting zoom_min to 10 and zoom_max to 20 will reduce the size and improve performance. Since we are viewing individual buildings, we do not need to visualize the data at a zoom level lower than 10.

CALL bqcarto.tiler.CREATE_TILESET(
'YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_clean',
'YOUR_PROJECT_NAME.YOUR_DATASET_NAME.open_buildings_clean_tileset',
STRUCT
(
null AS name,
null AS description,
null AS legend,
10 AS zoom_min,
20 AS zoom_max,
null AS geom_column_name,
null AS zoom_min_column,
null AS zoom_max_column,
null AS max_tile_size_kb,
null AS tile_feature_order,
null AS drop_duplicates,
)
);


## Creating the visualization

Using CARTO Builder, we can add the tileset, apply some styling based on confidence score and switch to a Google base map to see how the polygons line up versus Google Maps road and satellite imagery layers.

Click “Add source from…” and select the tileset created earlier.

Click the base map menu option and switch the base map to Google Maps. Optionally we can switch to Satellite if we wish to compare the building polygons to actual imagery.

Go into the “Layer settings”, click the three vertical dots next to Fill Color, select the confidence column in the “Color Based On” field, and switch the steps to 3 by clicking on the color ramp.

At the end of the steps outlined above, we can see a map with all of the building polygons shown like the screenshot above. We can move around to most areas of Africa and see each building allowing for rapid exploration.

A live map can be found here.

Solutions Engineer at CARTO

• ## CARTO Spatial Features, now with global urbanity, climatology, & elevation data

In late 2020, we released our first derivative data product, CARTO Spatial Features, providing global demographic data and Point of Interest (POI) aggregations by category....

• ## Uncovering Site Selection Strategies using Point of Interest Data

In the U.S. it can cost more than \$1 million to open a McDonald’s, Taco Bell, Burger King, or Wendy’s restaurant. With such high levels associated with funding startup cost...

• ## CKDelta’s international human mobility data now available in CARTO

Over the past year the importance of human mobility data has increased significantly, not only as a means to analyze how to mitigate the spread of Covid; but more recently ...