Analysis Guides

Enrich from Data Observatory

CARTO’s Data Observatory enables you to discover hidden patterns and gain comprehensive insights about unforeseen opportunities through data enrichment and analysis.

The Data Observatory provides demographic, financial, real estate, transportation, and various population segment data, directly through CARTO Builder (or SQL). See the Data Observatory catalog for a list of all available measures and regions of the world where there is coverage.

CHEATSHEET: Data Observatory Parameters

Several parameters from the Data Observatory catalog can be applied when enriching your data from CARTO Builder. The available options vary depending on the REGION selected. If you are unsure which parameters to select, view the catalog for details and descriptions.

  • REGION: Select the country, or part of the world, to begin exploring interesting Data Observatory measurements.
  • MEASUREMENT: Access measurements at individual point locations, or within polygons. These include variables for demographic, economic, transportation and other types of information. When a measurement is selected in Builder, a link to the licensing terms for the data appears, so that you can check for any copyright information that may require additional map attributions.
  • NORMALIZE: Normalization divides a measurement by a related total (its whole) or by area, so that the value is expressed relative to the greater whole or to a specific geography. Raw numbers can misrepresent data, so normalized values often give a more accurate picture for analysis and mapping. To see which denominators a measurement can be normalized by, view the denominator examples in the catalog (by region).
  • TIMESPAN: The range of years that the retrieved data covers. This gives you control over the temporal dimension of the data, for current or historical analysis.
  • BOUNDARIES: Geospatial boundaries that render, or aggregate, your data based on geometric polygons. Data ranges from small areas (e.g. US Census Block Groups) to large areas (e.g. Countries). You can access boundaries by point location lookup, bounding box lookup, direct ID access and several other methods.

Up to ten Data Observatory measurements can be added under a single analysis node.
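
As a rough mental model, the parameters above can be thought of as one small record per measurement, grouped under an analysis node. The sketch below is illustrative only: the class and field names are hypothetical and are not part of any CARTO API. Only the parameter names and the ten-measurement limit come from this guide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_MEASUREMENTS_PER_NODE = 10  # limit stated in this guide


@dataclass
class Measurement:
    """One measurement and its options (hypothetical structure)."""
    name: str
    normalize: Optional[str] = None   # denominator name; None keeps the raw total
    timespan: Optional[str] = None    # e.g. "2011-2015"; None means most recent
    boundaries: Optional[str] = None  # e.g. a census tract boundary set


@dataclass
class EnrichNode:
    """One analysis node: a region plus up to ten measurements."""
    region: str
    measurements: List[Measurement] = field(default_factory=list)

    def add(self, measurement: Measurement) -> None:
        # Enforce the per-node limit before accepting another measurement.
        if len(self.measurements) >= MAX_MEASUREMENTS_PER_NODE:
            raise ValueError("an analysis node holds at most ten measurements")
        self.measurements.append(measurement)


node = EnrichNode(region="United States")
node.add(Measurement(
    name="Car-free households",
    normalize="Households",
    timespan="2011-2015",
    boundaries="Shoreline clipped US Census Tracts",
))
```

Leaving `normalize` or `timespan` unset in this sketch mirrors the defaults described in this guide: raw totals and the most recent timespan.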

The Enrich from Data Observatory analysis is only available for Enterprise accounts. Request a demo if you are interested in enabling this service for your account.

Enrich your Data

For this guide, let’s find out how well Denver, Colorado’s light rail system targets commuters who are able to walk to the city’s public transit system.

  1. Import the template .carto file, available from “Download resources” in this guide, and create the map. Builder opens with Light Rail Service as the first map layer, and Car-free Households near Denver Light Rail as the second map layer.

    Click on “Download resources” from this guide to download the zip file to your local machine. Extract the zip file to view the .carto file(s) used for this guide.

  2. Select the Car-free Households near Denver’s Light Rail layer.

  3. Click the ANALYSIS tab and select the Enrich from Data Observatory analysis.

    The BASE LAYER is hard-coded and represents the selected map layer.

  4. For the REGION, select United States.

    CARTO Builder displays the regions that apply to the selected data. The number in parentheses, next to each region, indicates the number of measurements available. This number is subject to change, as CARTO maintains the Data Observatory catalog.

    Select a region

    The selected REGION drives the measurement options that appear for the Data Observatory analysis.

  5. Select a MEASUREMENT:

    Since there is a large amount of data available, the MEASUREMENT drop-down list displays suggested results, ordered by popularity. If you are unsure what to search for, view the Data Observatory catalog for descriptions.

    • If you know the measurement name, manually type in the search term and select it from the list.

      Type measurement name

    • If you are unsure of the measurement name, click Filters to narrow down the results. This displays categories such as Age and Gender, Employment, Income, and many more location dimensions.

      Filter measurements

    For this example, filter the measurements by Transportation in order to select Car-free households.

    After selecting a measurement, the most recent TIMESPAN is applied. Additionally, the identified type of license for the selected measurement appears underneath the measurement name.

    Unrestricted appears, which defines the licensing terms for the data provided.

  6. Click the checkbox next to NORMALIZE and select Households. This value gives you the proportion (%) of all Households that are Car-free Households in the area.

    While this step is optional, normalization provides a more accurate measurement for analysis by putting the data in the context of the whole. In this particular case, we want the percentage of households that are car-free. In other cases, you can normalize by area, which can give you a much clearer picture of how a measurement is distributed across a geographic area. If you do not select a normalization criterion, the analysis returns the raw total of your selected measurement.

  7. For TIMESPAN, select from a range of defined years to narrow down your results.

    By default, the most recent data found in the Data Observatory is applied as the TIMESPAN.

    • For this example, select 2011-2015 as the TIMESPAN.

  8. For BOUNDARIES, use the slider button to select Shoreline clipped US Census Tracts.

    Boundaries slider

    Defining preferred boundaries gives you control over the granularity of your data, depending on the scope of your project. For more localized or detailed maps, smaller administrative regions will yield more reliable results. For national or global analysis, larger boundaries will be more accurate and execute more quickly.

  9. Click APPLY to run the analysis.
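
The normalization chosen in step 6 amounts to a simple division of the measurement by its denominator. The sketch below uses made-up tract values (not Data Observatory data) to show why two areas with the same raw count can have very different rates. The function name is hypothetical, and whether Builder stores the result as a fraction or as a percentage is not specified in this guide.

```python
from typing import Optional


def normalized_rate(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


# Made-up values for two tracts with the same raw count of car-free households.
tracts = [
    {"tract": "A", "car_free_households": 150, "households": 1000},
    {"tract": "B", "car_free_households": 150, "households": 3000},
]

rates = {
    t["tract"]: normalized_rate(t["car_free_households"], t["households"])
    for t in tracts
}

for tract, rate in rates.items():
    print(f"tract {tract}: {rate:.0%} of households are car-free")
```

Both tracts have 150 car-free households, but the share of all households is three times higher in tract A than in tract B, which the raw totals alone would hide.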

New Column Added

After the analysis runs, the selected measurement is automatically added to your data as a new column. Switch to the Data View of your map layer to view the column added, no_cars_rate_2011_2015.

The Data View and Map View appear as buttons on your map visualization when a map layer is selected. Click them to switch between viewing your connected dataset as a table and viewing it on the map.

Data View of map layer after analysis is applied

The Data Observatory distinguishes between point and polygon data when calculating measures. For example, when getting counts like population for a point, the returned value is the population density (in count per square kilometer). When getting population for a polygon, the returned value is the estimated count of the population living within the polygon’s area. Non-count numbers, like median income, are not available for polygons.
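
The difference between the two cases can be sketched with a little arithmetic. The example below assumes a simple area-weighted estimate for polygons; that method, the function names, and the numbers are assumptions for illustration, not a description of how the Data Observatory actually computes its estimates.

```python
from typing import Iterable, Tuple


def point_value(area_count: float, area_km2: float) -> float:
    """Density (count per square kilometer) of the area containing a point."""
    return area_count / area_km2


def polygon_value(overlaps: Iterable[Tuple[float, float, float]]) -> float:
    """Estimated count inside a polygon, assuming area-weighted apportioning.

    Each item is (area_count, area_km2, overlap_km2): the count and size of a
    source area, and how much of that area the polygon covers.
    """
    return sum(count / area * overlap for count, area, overlap in overlaps)


# Made-up source areas: 4000 people in 2 km2, and 3000 people in 3 km2.
density_at_point = point_value(4000, 2.0)

# A polygon covering 0.5 km2 of the first area and 1.0 km2 of the second.
count_in_polygon = polygon_value([(4000, 2.0, 0.5), (3000, 3.0, 1.0)])

print(density_at_point)   # people per square kilometer
print(count_in_polygon)   # estimated people inside the polygon
```

The point lookup returns a density, while the polygon lookup returns a count, so the two results are not directly comparable.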

Legacy Data Observatory Analyses

If you applied the legacy Enrich from Data Observatory analysis, those map layers still display the original COUNTRY, MEASUREMENT, and SEGMENTS applied, so as not to interfere with any custom columns created in your dataset.

It is recommended to recreate the Enrich from Data Observatory analysis, as enhanced options enable you to include REGIONS, NORMALIZATION, and TIMESPAN dimensions for measurements of data.

Add Multiple Measurements

You can add up to ten Data Observatory measurements under a single analysis node. This empowers you to apply several dimensions of data augmentation in a single analysis request.

For this guide, let’s add another Transportation measurement to the car-free households layer, to identify commuters who primarily travel to work by subway.

  1. From the Enrich from Data Observatory analysis, click ADD NEW MEASUREMENT.

    Add new measurement

  2. Add measurement for people who commute by subway or elevated rail:

    • Filter the measurement by Transportation.
    • Select Commuters by Subway or Elevated.
    • NORMALIZE by Commuters by Public Transportation, which gives the percentage of public transportation commuters who travel by subway or elevated rail.
    • For TIMESPAN, select 2011-2015.
    • For BOUNDARIES, select Shoreline clipped US Census Tracts.
  3. Click APPLY to run the analysis.

    A confirmation dialog appears, indicating the additional columns that were added to your dataset. This is useful, as you can style map layers by column values!

Style By Value

To visualize the results of your analysis, style the layer by any of the newly added column values.

  1. From the STYLE tab of the Car-free Households near Denver’s Light Rail layer, change the STROKE SIZE from the default Fixed selection to the By Value option to open the stroke properties for the map layer.

    Marker size by value options

  2. Select no_cars_rate_2011_2015 to symbolize the points by the proportion of car-free households served at that location.

    Only number columns from your map layer appear when selecting size by value.

    • For MIN, enter 6.
    • For MAX, enter 18.

    Fill size by column value

  3. To display this data in a legend for map viewers, click the LEGEND tab and select the BUBBLE legend from the SIZE subtab.

    Bubble legend
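
The MIN and MAX values entered in step 2 bound the symbol sizes on the map. As a minimal sketch, assuming a plain linear interpolation between the smallest and largest column values, the mapping could look like the code below. Builder may instead group values into buckets (for example, by quantiles), so treat this as an illustration of the idea rather than Builder's actual ramp. The function name is hypothetical.

```python
def size_by_value(value: float, vmin: float, vmax: float,
                  size_min: float = 6.0, size_max: float = 18.0) -> float:
    """Linearly map a column value in [vmin, vmax] to a size in [size_min, size_max]."""
    if vmax == vmin:
        # All values are identical, so use the midpoint size.
        return (size_min + size_max) / 2
    t = (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)  # clamp values that fall outside the range
    return size_min + t * (size_max - size_min)


# Made-up rates between 0.0 and 0.5, sized between MIN 6 and MAX 18.
for rate in (0.0, 0.25, 0.5):
    print(rate, size_by_value(rate, vmin=0.0, vmax=0.5))
```

Under this assumption, the lowest rate is drawn at size 6, the highest at size 18, and values in between scale proportionally.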

Results of Analysis

The results of the Enrich from Data Observatory analysis show that concentrations of car-free households in West Colfax and Sheridan benefit the most from public transit lines. These calculations were normalized by total households, which puts each count in the context of the total number of households in the area and lets us style by the calculated percentages.

External Resources