A Python package for integrating CARTO maps, analysis, and data services into data science workflows.

This component is still supported but will not be developed further. We don't recommend starting new projects with it, as it will eventually be deprecated; consider CARTO's newer Python packages instead.

Introduction

The CARTOframes API is organized into three parts: auth, data, and viz.

Authentication

It is possible to use CARTOframes without having a CARTO account. However, being a CARTO user offers many advantages, such as access to data enrichment and the ability to discover useful datasets. This module is responsible for connecting the user with their CARTO account through the given user credentials.

Manage Data

From discovering and enriching data to applying data analysis and geocoding methods, the CARTOframes API is built to manage data without leaving the context of your notebook.

Visualize Data

The viz API is designed to create useful, beautiful, and straightforward visualizations. It is both predefined and flexible: advanced users can build custom visualizations, while multiple built-in methods let you work faster with a few lines of code.

Auth

The auth namespace contains the class that manages authentication: cartoframes.auth.Credentials. It also includes the utility functions cartoframes.auth.set_default_credentials() and cartoframes.auth.get_default_credentials().

class cartoframes.auth.Credentials(username=None, api_key='default_public', base_url=None, session=None, allow_non_secure=False)

Bases: object

The Credentials class is used for managing and storing user CARTO credentials. The arguments are listed in order of precedence: a Credentials instance is taken first, api_key and base_url/username next, and config_file (if given) last. The config file is creds.json by default. If no arguments are passed, an attempt is made to retrieve credentials from a previously saved session. One of the above scenarios must be met to successfully instantiate a Credentials object.

Parameters
  • username (str, optional) – Username of CARTO account.

  • api_key (str, optional) – API key of user’s CARTO account. If the dataset is public, it can be set to ‘default_public’.

  • base_url (str, optional) – Base URL used for API calls. This is usually of the form https://johnsmith.carto.com/ for user johnsmith. On-premises installations (and others) have a different URL pattern.

  • session (requests.Session, optional) – requests session. See requests documentation for more information.

  • allow_non_secure (bool, optional) – Allow non-secure HTTP connections. Not allowed by default.

Raises

ValueError – if no username or base_url can be found.

Example

>>> creds = Credentials(username='johnsmith', api_key='abcdefg')
property api_key

Credentials api_key

property username

Credentials username

property base_url

Credentials base_url

property allow_non_secure

Whether non-secure connections over HTTP are allowed.

property session

Credentials session

property user_id

Credentials user ID
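
Example

A short sketch of reading the stored attributes back from an instance (the values shown are illustrative):

>>> creds = Credentials(username='johnsmith', api_key='abcdefg')
>>> creds.username
'johnsmith'
>>> creds.api_key
'abcdefg'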

classmethod from_file(config_file=None, session=None)

Retrieves credentials from a file. Defaults to the user config directory.

Parameters
  • config_file (str, optional) – Location where credentials are loaded from. If no argument is provided, it will be loaded from the default location.

  • session (requests.Session, optional) – requests session. See requests documentation for more information.

Returns

A (Credentials) instance.

Example

>>> creds = Credentials.from_file('creds.json')
classmethod from_credentials(credentials)

Retrieves credentials from another Credentials object.

Parameters

credentials (Credentials) –

Returns

A (Credentials) instance.

Raises

ValueError – if the credentials argument is not an instance of Credentials.

Example

>>> creds = Credentials.from_credentials(orig_creds)
save(config_file=None)

Saves current user credentials to user directory.

Parameters

config_file (str, optional) – Location where credentials are to be stored. If no argument is provided, they will be saved to the default location (creds.json).

Example

>>> credentials = Credentials(username='johnsmith', api_key='abcdefg')
>>> credentials.save('creds.json')
User credentials for `johnsmith` were successfully saved to `creds.json`
classmethod delete(config_file=None)

Deletes the credentials file specified in config_file. If no file is specified, it deletes the default user credentials file (creds.json).

Parameters

config_file (str) – Path to configuration file. Defaults to delete the user default location if None.

Tip

To see if there is a default user credential file stored, do the following:

>>> print(Credentials.from_file())
Credentials(username='johnsmith', api_key='abcdefg',
base_url='https://johnsmith.carto.com/')
is_instant_licensing_active()

Returns whether the user has instant licensing activated for the Data Observatory v2.

get_gcloud_credentials()

Returns the Data Observatory v2 Google Cloud Platform project and token.

Example

>>> from cartoframes.auth import Credentials
>>> from google.oauth2.credentials import Credentials as GoogleCredentials
>>> creds = Credentials(username='johnsmith', api_key='abcdefg')
>>> gcloud_project, gcloud_token = creds.get_gcloud_credentials()
>>> gcloud_credentials = GoogleCredentials(gcloud_token)
cartoframes.auth.set_default_credentials(first=None, second=None, credentials=None, filepath=None, username=None, base_url=None, api_key=None, session=None, allow_non_secure=False)

Set default credentials for all operations that require authentication against a CARTO account.

Parameters
  • credentials (Credentials, optional) – A Credentials instance can be used in place of a username | base_url/api_key combination.

  • filepath (str, optional) – Location where credentials are stored as a JSON file.

  • username (str, optional) – CARTO user name of the account.

  • base_url (str, optional) – Base URL of CARTO user account. Cloud-based accounts should use the form https://{username}.carto.com (e.g., https://johnsmith.carto.com for user johnsmith) whether on a personal or multi-user account. On-premises installation users should ask their admin.

  • api_key (str, optional) – CARTO API key. Depending on the application, this can be a project API key or the account master API key.

  • session (requests.Session, optional) – requests session. See requests documentation for more information.

  • allow_non_secure (bool, optional) – Allow non secure http connections. By default is not allowed.

Note

The recommended way to authenticate in CARTOframes is to read user credentials from a JSON file that is structured like this:

{
    "username": "your user name",
    "api_key": "your api key",
    "base_url": "https://your_username.carto.com"
}

Note that the base_url will be different for on-premises installations.

By using the cartoframes.auth.Credentials.save() method, this file will automatically be created for you in a default location depending on your operating system. A custom location can also be specified as an argument to the method.

This file can then be read in the following ways:

>>> set_default_credentials('./carto-project-credentials.json')

Example

Create Credentials from a username, api_key pair.

>>> set_default_credentials('johnsmith', 'your api key')

Create credentials from only a username (only works with public datasets and those marked public with link). If the API key is not provided, the public API key default_public is used. With this setting, only read-only operations can occur (e.g., no publishing of maps, reading data from the Data Observatory, or creating new hosted datasets).

>>> set_default_credentials('johnsmith')

From a pair base_url, api_key.

>>> set_default_credentials('https://johnsmith.carto.com', 'your api key')

From a base_url (for public datasets). The API key default_public is used by default.

>>> set_default_credentials('https://johnsmith.carto.com')

From a Credentials class.

>>> credentials = Credentials(
...     base_url='https://johnsmith.carto.com',
...     api_key='your api key')
>>> set_default_credentials(credentials)
cartoframes.auth.get_default_credentials()

Retrieve the default credentials if previously set with cartoframes.auth.set_default_credentials() in Python session.

Example

>>> set_default_credentials('creds.json')
>>> current_creds = get_default_credentials()
Returns

Default credentials previously set in current Python session. None will be returned if default credentials were not previously set.

Return type

cartoframes.auth.Credentials

cartoframes.auth.unset_default_credentials()

Unset the default credentials if previously set with cartoframes.auth.set_default_credentials() in Python session.

Example

>>> set_default_credentials('creds.json')
>>> unset_default_credentials()

I/O functions

cartoframes.read_carto(source, credentials=None, limit=None, retry_times=3, schema=None, index_col=None, decode_geom=True, null_geom_value=None)

Read a table or a SQL query from the CARTO account.

Parameters
  • source (str) – table name or SQL query.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • limit (int, optional) – The number of rows to download. Default is to download all rows.

  • retry_times (int, optional) – Number of times to retry the download in case it fails. Default is 3.

  • schema (str, optional) – prefix of the table. By default, it gets the current_schema() using the credentials.

  • index_col (str, optional) – name of the column to be loaded as index. It can be used also to set the index name.

  • decode_geom (bool, optional) – convert the “the_geom” column into a valid geometry column.

  • null_geom_value (Object, optional) – value for the the_geom column when it’s null. Defaults to None

Returns

geopandas.GeoDataFrame

Raises

ValueError – if the source is not a valid table_name or SQL query.
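
Example

A minimal usage sketch, assuming default credentials have been set and that a table named my_table exists in the account (the table name is illustrative):

>>> from cartoframes import read_carto
>>> gdf = read_carto('my_table')
>>> gdf.head()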

cartoframes.to_carto(dataframe, table_name, credentials=None, if_exists='fail', geom_col=None, index=False, index_label=None, cartodbfy=True, log_enabled=True, retry_times=3, max_upload_size=2000000000, skip_quota_warning=False)

Upload a DataFrame to CARTO. The geometry’s CRS must be WGS 84 (EPSG:4326) so you can use it on CARTO.

Parameters
  • dataframe (pandas.DataFrame, geopandas.GeoDataFrame) – data to be uploaded.

  • table_name (str) – name of the table to upload the data.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • if_exists (str, optional) – ‘fail’, ‘replace’, ‘append’. Default is ‘fail’.

  • geom_col (str, optional) – name of the geometry column of the dataframe.

  • index (bool, optional) – write the index in the table. Default is False.

  • index_label (str, optional) – name of the index column in the table. By default it uses the name of the index from the dataframe.

  • cartodbfy (bool, optional) – convert the table to CARTO format. Default True. More info: https://carto.com/developers/sql-api/guides/creating-tables/#create-tables.

  • log_enabled (bool, optional) – enable the logging mechanism. Default is True.

  • retry_times (int, optional) – Number of times to retry the upload in case it fails. Default is 3.

  • max_upload_size (int, optional) – defines the maximum size of the dataframe to be uploaded. Default is 2GB.

  • skip_quota_warning (bool, optional) – skip the quota exceeded check and force the upload. (The upload will still fail if the size of the dataset exceeds the remaining DB quota). Default is False.

Returns

The normalized table name.

Return type

string

Raises

ValueError – if the dataframe or table name provided are wrong or the if_exists param is not valid.
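
Example

A minimal usage sketch, assuming default credentials have been set and gdf is a geopandas.GeoDataFrame in EPSG:4326 (the table name is illustrative):

>>> from cartoframes import to_carto
>>> to_carto(gdf, 'my_table', if_exists='replace')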

cartoframes.list_tables(credentials=None)

List all of the tables in the CARTO account.

Parameters

credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

Returns

A DataFrame with all the table names for the given credentials.

Return type

DataFrame
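
Example

A minimal usage sketch, assuming default credentials have been set:

>>> from cartoframes import list_tables
>>> tables_df = list_tables()
>>> tables_df.head()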

cartoframes.has_table(table_name, credentials=None, schema=None)

Check if the table exists in the CARTO account.

Parameters
  • table_name (str) – name of the table.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • schema (str, optional) – prefix of the table. By default, it gets the current_schema() using the credentials.

Returns

True if the table exists, False otherwise.

Return type

bool

Raises

ValueError – if the table name is not a valid table name.
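
Example

A minimal usage sketch, assuming default credentials have been set and the table exists (the table name is illustrative):

>>> from cartoframes import has_table
>>> has_table('my_table')
True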

cartoframes.delete_table(table_name, credentials=None, log_enabled=True)

Delete the table from the CARTO account.

Parameters
  • table_name (str) – name of the table.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • log_enabled (bool, optional) – enable the logging mechanism. Default is True.

Raises

ValueError – if the table name is not a valid table name.
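
Example

A minimal usage sketch, assuming default credentials have been set (the table name is illustrative):

>>> from cartoframes import delete_table
>>> delete_table('my_table')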

cartoframes.rename_table(table_name, new_table_name, credentials=None, if_exists='fail', log_enabled=True)

Rename a table in the CARTO account.

Parameters
  • table_name (str) – name of the table.

  • new_table_name (str) – new name for the table.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • if_exists (str, optional) – ‘fail’, ‘replace’. Default is ‘fail’.

  • log_enabled (bool, optional) – enable the logging mechanism. Default is True.

Raises

ValueError – if the table names provided are wrong or the if_exists param is not valid.
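
Example

A minimal usage sketch, assuming default credentials have been set (the table names are illustrative):

>>> from cartoframes import rename_table
>>> rename_table('my_table', 'my_table_renamed')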

cartoframes.copy_table(table_name, new_table_name, credentials=None, if_exists='fail', log_enabled=True, cartodbfy=True)

Copy a table into a new table in the CARTO account.

Parameters
  • table_name (str) – name of the original table.

  • new_table_name (str, optional) – name for the new table.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • if_exists (str, optional) – ‘fail’, ‘replace’, ‘append’. Default is ‘fail’.

  • log_enabled (bool, optional) – enable the logging mechanism. Default is True.

  • cartodbfy (bool, optional) – convert the table to CARTO format. Default True. More info: https://carto.com/developers/sql-api/guides/creating-tables/#create-tables.

Raises

ValueError – if the table names provided are wrong or the if_exists param is not valid.
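
Example

A minimal usage sketch, assuming default credentials have been set (the table names are illustrative):

>>> from cartoframes import copy_table
>>> copy_table('my_table', 'my_table_copy')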

cartoframes.create_table_from_query(query, new_table_name, credentials=None, if_exists='fail', log_enabled=True, cartodbfy=True)

Create a new table from an SQL query in the CARTO account.

Parameters
  • query (str) – SQL query

  • new_table_name (str) – name for the new table.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • if_exists (str, optional) – ‘fail’, ‘replace’, ‘append’. Default is ‘fail’.

  • log_enabled (bool, optional) – enable the logging mechanism. Default is True.

  • cartodbfy (bool, optional) – convert the table to CARTO format. Default True. More info: https://carto.com/developers/sql-api/guides/creating-tables/#create-tables.

Raises

ValueError – if the query or table name provided is wrong or the if_exists param is not valid.
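
Example

A minimal usage sketch, assuming default credentials have been set (the query and table name are illustrative):

>>> from cartoframes import create_table_from_query
>>> create_table_from_query('SELECT * FROM my_table WHERE value > 1000', 'my_filtered_table')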

cartoframes.describe_table(table_name, credentials=None, schema=None)

Describe the table in the CARTO account.

Parameters
  • table_name (str) – name of the table.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • schema (str, optional) – prefix of the table. By default, it gets the current_schema() using the credentials.

Returns

A dict with the privacy, num_rows and geom_type of the table.

Raises

ValueError – if the table name is not a valid table name.
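
Example

A minimal usage sketch, assuming default credentials have been set (the table name is illustrative):

>>> from cartoframes import describe_table
>>> info = describe_table('my_table')
>>> info['num_rows']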

cartoframes.update_privacy_table(table_name, privacy, credentials=None, log_enabled=True)

Update the table information in the CARTO account.

Parameters
  • table_name (str) – name of the table.

  • privacy (str) – privacy of the table: ‘private’, ‘public’, ‘link’.

  • credentials (Credentials, optional) – instance of Credentials (username, api_key, etc).

  • log_enabled (bool, optional) – enable the logging mechanism. Default is True.

Raises

ValueError – if the table name is wrong or the privacy name is not ‘private’, ‘public’, or ‘link’.
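
Example

A minimal usage sketch, assuming default credentials have been set (the table name is illustrative):

>>> from cartoframes import update_privacy_table
>>> update_privacy_table('my_table', 'link')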

Data Services

class cartoframes.data.services.Geocoding(credentials=None)

Bases: cartoframes.data.services.service.Service

Geocoding using CARTO data services.

This requires a CARTO account with an API key that allows the use of geocoding services (passed as an explicit argument to the constructor or via the default credentials).

To prevent having to geocode records that have been previously geocoded, and thus spend quota unnecessarily, you should always preserve the the_geom and carto_geocode_hash columns generated by the geocoding process. This will happen automatically if your input is a table from CARTO processed in place (i.e. without a table_name parameter) or if you save your results in a CARTO table using the table_name parameter, and only use the resulting table for any further geocoding.

If you are geocoding local data from a DataFrame that you plan to re-geocode later (e.g., because you are making your work reproducible by saving all the data preparation steps in a notebook), we advise saving the geocoding results immediately to the same store the data was originally read from, for example:

>>> df = pandas.read_csv('my_data')
>>> geocoded_df = Geocoding().geocode(df, 'address').data
>>> geocoded_df.to_csv('my_data')

As an alternative, you can use the cached option to store geocoding results in a CARTO table and reuse them in later geocodings. To do this, you need to use the table_name parameter with the name of the table used to cache the results.

If the same dataframe is geocoded repeatedly no credits will be spent, but note there is a time overhead related to uploading the dataframe to a temporary table for checking for changes.

>>> df = pandas.read_csv('my_data')
>>> geocoded_df = Geocoding().geocode(df, 'address', table_name='my_data', cached=True).data

If you execute the previous code multiple times it will only spend credits on the first geocoding; later ones will reuse the results stored in the my_data table. This will require extra processing time. If the CSV file should ever change, cached results will only be applied to unmodified records, and new geocoding will be performed only on new or changed records.

geocode(source, street, city=None, state=None, country=None, status={'gc_status_rel': 'relevance'}, table_name=None, if_exists='fail', dry_run=False, cached=None, null_geom_value=None)

Geocode method.

Parameters
  • source (str, pandas.DataFrame, geopandas.GeoDataFrame) – table, SQL query or DataFrame object to be geocoded.

  • street (str) – name of the column containing postal addresses

  • city (dict, optional) – dictionary with either a column key with the name of a column containing the addresses’ city names or a value key with a literal city value, e.g. ‘New York’. It also accepts a string, in which case column is implied.

  • state (dict, optional) – dictionary with either a column key with the name of a column containing the addresses’ state names or a value key with a literal state value, e.g. ‘WA’. It also accepts a string, in which case column is implied.

  • country (dict, optional) – dictionary with either a column key with the name of a column containing the addresses’ country names or a value key with a literal country value, e.g. ‘US’. It also accepts a string, in which case column is implied.

  • status (dict, optional) – dictionary that defines a mapping from geocoding state attributes (‘relevance’, ‘precision’, ‘match_types’) to column names. (See https://carto.com/developers/data-services-api/reference/) Columns will be added to the result data for the requested attributes. By default a column gc_status_rel will be created for the geocoding relevance. The special attribute ‘*’ refers to all the status attributes as a JSON object.

  • table_name (str, optional) – the geocoding results will be placed in a new CARTO table with this name.

  • if_exists (str, optional) – Behavior for creating new datasets, only applicable if table_name isn’t None; Options are ‘fail’, ‘replace’, or ‘append’. Defaults to ‘fail’.

  • cached (bool, optional) – Use cached geocoding results, saving the results in a table. This parameter should be used along with table_name.

  • dry_run (bool, optional) – no actual geocoding will be performed (useful to check the needed quota)

  • null_geom_value (Object, optional) – value for the the_geom column when it’s null. Defaults to None

Returns

A named-tuple (data, metadata) containing a data geopandas.GeoDataFrame and a metadata dictionary with global information about the geocoding process.

The data contains a geometry column with point locations for the geocoded addresses and also a carto_geocode_hash that, if preserved, can avoid re-geocoding unchanged data in future calls to geocode.

The metadata, as described in https://carto.com/developers/data-services-api/reference/, contains the following information:

Name          Type     Description
precision     text     precise or interpolated
relevance     number   0 to 1, higher being more relevant
match_types   array    list of match type strings: point_of_interest, country, state, county, locality, district, street, intersection, street_number, postal_code

By default the relevance is stored in an output column named gc_status_rel. The name of the column, and in general which attributes are added as columns, can be configured by using a status dictionary associating column names to status attributes.

Raises

ValueError – if the cached param is set without table_name.

Examples

Geocode a DataFrame:

>>> df = pandas.DataFrame([['Gran Vía 46', 'Madrid'], ['Ebro 1', 'Sevilla']], columns=['address','city'])
>>> geocoded_gdf, metadata = Geocoding().geocode(
...     df, street='address', city='city', country={'value': 'Spain'})
>>> geocoded_gdf.head()

Geocode a table from CARTO:

>>> gdf = read_carto('table_name')
>>> geocoded_gdf, metadata = Geocoding().geocode(gdf, street='address')
>>> geocoded_gdf.head()

Geocode a query against a table from CARTO:

>>> gdf = read_carto('SELECT * FROM table_name WHERE value > 1000')
>>> geocoded_gdf, metadata = Geocoding().geocode(gdf, street='address')
>>> geocoded_gdf.head()

Obtain the number of credits needed to geocode a CARTO table:

>>> gdf = read_carto('table_name')
>>> geocoded_gdf, metadata = Geocoding().geocode(gdf, street='address', dry_run=True)
>>> print(metadata['required_quota'])

Filter results by relevance:

>>> df = pandas.DataFrame([['Gran Vía 46', 'Madrid'], ['Ebro 1', 'Sevilla']], columns=['address','city'])
>>> geocoded_gdf, metadata = Geocoding().geocode(
...     df,
...     street='address',
...     city='city',
...     country={'value': 'Spain'},
...     status=['relevance'])
>>> # show rows with relevance greater than 0.7:
>>> print(geocoded_gdf[geocoded_gdf['carto_geocode_relevance'] > 0.7])
class cartoframes.data.services.Isolines(credentials=None)

Bases: cartoframes.data.services.service.Service

Time and distance isoline services using CARTO Data Services.

isochrones(source, ranges, **args)

Isochrone areas.

This method computes areas delimited by isochrone lines (lines of constant travel time) based upon public roads.

Parameters
  • source (str, pandas.DataFrame, geopandas.GeoDataFrame) – table, SQL query or DataFrame containing the source points for the isochrones: travel routes from the source points are computed to determine areas within specified travel times.

  • ranges (list) – travel time values in seconds; for each range value and source point a result polygon will be produced enclosing the area within range of the source.

  • exclusive (bool, optional) – when False (the default), inclusive range areas are generated, each one containing the areas for smaller time values (so the area is reachable from the source within the given time). When True, areas are exclusive, each one corresponding to time values between the immediately smaller range value (or zero) and the area’s range value.

  • ascending (bool, optional) – when True, the isochrones are sorted ascending by travel time; when False (the default), they are sorted descending.

  • table_name (str, optional) – the resulting areas will be saved in a new CARTO table with this name.

  • if_exists (str, optional) – Behavior for creating new datasets, only applicable if table_name isn’t None; Options are ‘fail’, ‘replace’, or ‘append’. Defaults to ‘fail’.

  • dry_run (bool, optional) – no actual computation will be performed, and metadata will be returned including the required quota.

  • mode (str, optional) – defines the travel mode: 'car' (the default) or 'walk'.

  • is_destination (bool, optional) – indicates that the source points are to be taken as destinations for the routes used to compute the area, rather than origins.

  • mode_type (str, optional) – type of routes computed: 'shortest' (default) or 'fastests'.

  • mode_traffic (str, optional) – use traffic data to compute routes: 'disabled' (default) or 'enabled'.

  • resolution (float, optional) – level of detail of the polygons in meters per pixel. Higher resolution may increase the response time of the service.

  • maxpoints (int, optional) – Allows to limit the amount of points in the returned polygons. Increasing the number of maxpoints may increase the response time of the service.

  • quality (int, optional) – Allows you to reduce the quality of the polygons in favor of the response time. Admitted values: 1, 2, 3.

  • geom_col (str, optional) – string indicating the geometry column name in the source DataFrame.

  • source_col (str, optional) – string indicating the source column name. This column will be used to reference the generated isolines with the original geometry. By default it uses the cartodb_id column if it exists, or the index of the source DataFrame.

Returns

A named-tuple (data, metadata) containing a data geopandas.GeoDataFrame and a metadata dictionary. For dry runs the data will be None. The data contains a range_data column with a numeric value and a the_geom geometry with the corresponding area. It will also contain a source_id column that identifies the source point corresponding to each area if the source has a cartodb_id column.
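
Example

A minimal usage sketch, assuming default credentials have been set and gdf is a geopandas.GeoDataFrame of source points (the range values are illustrative):

>>> from cartoframes.data.services import Isolines
>>> isochrones_gdf, metadata = Isolines().isochrones(gdf, [900, 1800], mode='car')
>>> isochrones_gdf.head()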

isodistances(source, ranges, **args)

Isodistance areas.

This method computes areas delimited by isodistance lines (lines of constant travel distance) based upon public roads.

Parameters
  • source (str, pandas.DataFrame, geopandas.GeoDataFrame) – table, SQL query or DataFrame containing the source points for the isodistances: travel routes from the source points are computed to determine areas within specified travel distances.

  • ranges (list) – travel distance values in meters; for each range value and source point a result polygon will be produced enclosing the area within range of the source.

  • exclusive (bool, optional) – when False (the default), inclusive range areas are generated, each one containing the areas for smaller distance values (so the area is reachable from the source within the given distance). When True, areas are exclusive, each one corresponding to distance values between the immediately smaller range value (or zero) and the area’s range value.

  • ascending (bool, optional) – when True, the isodistances are sorted ascending by travel distance; when False (the default), they are sorted descending.

  • table_name (str, optional) – the resulting areas will be saved in a new CARTO table with this name.

  • if_exists (str, optional) – Behavior for creating new datasets, only applicable if table_name isn’t None; Options are ‘fail’, ‘replace’, or ‘append’. Defaults to ‘fail’.

  • dry_run (bool, optional) – no actual computation will be performed, and metadata will be returned including the required quota.

  • mode (str, optional) – defines the travel mode: 'car' (the default) or 'walk'.

  • is_destination (bool, optional) – indicates that the source points are to be taken as destinations for the routes used to compute the area, rather than origins.

  • mode_type (str, optional) – type of routes computed: 'shortest' (default) or 'fastests'.

  • mode_traffic (str, optional) – use traffic data to compute routes: 'disabled' (default) or 'enabled'.

  • resolution (float, optional) – level of detail of the polygons in meters per pixel. Higher resolution may increase the response time of the service.

  • maxpoints (int, optional) – Allows to limit the amount of points in the returned polygons. Increasing the number of maxpoints may increase the response time of the service.

  • quality (int, optional) – Allows you to reduce the quality of the polygons in favor of the response time. Admitted values: 1, 2, 3.

  • geom_col (str, optional) – string indicating the geometry column name in the source DataFrame.

  • source_col (str, optional) – string indicating the source column name. This column will be used to reference the generated isolines with the original geometry. By default it uses the cartodb_id column if it exists, or the index of the source DataFrame.

Returns

A named-tuple (data, metadata) containing a data geopandas.GeoDataFrame and a metadata dictionary. For dry runs the data will be None. The data contains a range_data column with a numeric value and a the_geom geometry with the corresponding area. It will also contain a source_id column that identifies the source point corresponding to each area if the source has a cartodb_id column.

Raises
  • Exception – if the available quota is less than the required quota.

  • ValueError – if there is no valid geometry found in the dataframe.
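
Example

A minimal usage sketch, assuming default credentials have been set and gdf is a geopandas.GeoDataFrame of source points (the range values are illustrative):

>>> from cartoframes.data.services import Isolines
>>> isodistances_gdf, metadata = Isolines().isodistances(gdf, [500, 1000], mode='walk')
>>> isodistances_gdf.head()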

Data Observatory

With CARTOframes it is possible to enrich your data by using our Data Observatory Catalog through the enrichment methods.

class cartoframes.data.observatory.Catalog

Bases: object

This class represents the Data Observatory metadata Catalog.

The catalog contains metadata that helps to discover and understand the data available in the Data Observatory for Dataset.download and Enrichment purposes.

You can get more information about the Data Observatory catalog from the CARTO website and in your CARTO user account dashboard.

The Catalog has three main purposes:
  • Explore and discover the datasets available in the repository (both public and premium datasets).

  • Subscribe to some premium datasets and manage your datasets licenses.

  • Download data and use your licensed datasets and variables to enrich your own data by means of the Enrichment functions.

The Catalog is public and can be explored without a CARTO account. Once you discover a Dataset of interest and want to acquire a license to use it, you’ll need a CARTO account to subscribe to it, by means of the Dataset.subscribe or Geography.subscribe functions.

The Catalog is composed of three main entities:
  • Dataset: It is the main CatalogEntity. It contains metadata of the actual data you can use to Dataset.download or for Enrichment purposes.

  • Geography: Datasets in the Data Observatory are aggregated by different geographic boundaries. The Geography entity contains metadata to understand the boundaries of a Dataset. It’s used for enrichment and you can also Geography.download the underlying data.

  • Variable: Variables contain metadata about the columns available in each dataset for enrichment. Let’s say you explore a dataset with demographic data for the whole US at the Census tract level. The variables give you information about the actual columns you have available, such as: total_population, total_males, etc. On the other hand, you can use lists of Variable instances, Variable.id, or Variable.slug to enrich your own data.

Every Dataset is related to a Geography. You can have for example, demographics data at the Census tract, block groups or blocks levels.

When subscribing to a premium dataset, you should subscribe both to the Dataset (Dataset.subscribe) and to its Geography (Geography.subscribe) to be able to access both tables to enrich your own data.

The two main entities of the Catalog (Dataset and Geography) are related to other entities that are useful for a hierarchical categorization and discovery of available data in the Data Observatory:

  • Category: Groups datasets of the same topic, for example, demographics, financial, etc.

  • Country: Groups datasets available by country

  • Provider: Gives you information about the provider of the source data

You can just list all the grouping entities. Note that this is not the preferred way to discover the catalog metadata, since there can be thousands of entities in it:

>>> Category.get_all()
[<Category.get('demographics')>, ...]
>>> Country.get_all()
[<Country.get('usa')>, ...]
>>> Provider.get_all()
[<Provider.get('mrli')>, ...]

Or you can get them by ID:

>>> Category.get('demographics')
<Category.get('demographics')>
>>> Country.get('usa')
<Country.get('usa')>
>>> Provider.get('mrli')
<Provider.get('mrli')>

Examples

The preferred way of discovering the available datasets in the Catalog is through nested filters:

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').datasets
[<Dataset.get('acs_sociodemogr_b758e778')>, ...]

You can include the geography as part of the nested filter like this:

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').geography('ags_blockgroup_1c63771c').datasets

If a filter is already applied to a Catalog instance and you want to do a new hierarchical search, clear the previous filters with the Catalog().clear_filters() method:

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').geography('ags_blockgroup_1c63771c').datasets
>>> catalog.clear_filters()
>>> catalog.country('esp').category('demographics').datasets

Otherwise the filters accumulate and you’ll get unexpected results.

During the discovery process, it’s useful to understand the metadata related to a given Geography or Dataset. A convenient way of reading or filtering by metadata values consists of converting the entities to a pandas DataFrame:

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').geography('ags_blockgroup_1c63771c').datasets.to_dataframe()

For each dataset in the Catalog, you can explore its variables, get a summary of its stats, etc.

>>> dataset = Dataset.get('od_acs_13345497')
>>> dataset.variables
[<Variable.get('dwellings_2_uni_fb8f6cfb')> #'Two-family (two unit) dwellings', ...]

See the Catalog guides and examples in our public documentation website for more information.

property countries

Get all the countries with datasets available in the Catalog.

Returns

CatalogList

Raises

CatalogError – if there’s a problem when connecting to the catalog or no countries are found.

property categories

Get all the categories in the Catalog.

Returns

CatalogList

Raises

CatalogError – if there’s a problem when connecting to the catalog or no categories are found.

property providers

Get all the providers in the Catalog.

Returns

CatalogList

Raises

CatalogError – if there’s a problem when connecting to the catalog or no providers are found.

property datasets

Get all the datasets in the Catalog.

Returns

CatalogList

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

property geographies

Get all the geographies in the Catalog.

Returns

CatalogList

Raises

CatalogError – if there’s a problem when connecting to the catalog or no geographies are found.

country(country_id)

Add a country filter to the current Catalog instance.

Parameters

country_id (str) – ID of the country to be used for filtering the Catalog.

Returns

Catalog

category(category_id)

Add a category filter to the current Catalog instance.

Parameters

category_id (str) – ID of the category to be used for filtering the Catalog.

Returns

Catalog

geography(geography_id)

Add a geography filter to the current Catalog instance.

Parameters

geography_id (str) – ID or slug of the geography to be used for filtering the Catalog

Returns

Catalog

provider(provider_id)

Add a provider filter to the current Catalog instance

Parameters

provider_id (str) – ID of the provider to be used for filtering the Catalog.

Returns

CatalogList

public(is_public=True)

Add a public filter to the current Catalog instance

Parameters

is_public (bool, optional) – Flag to filter public (True) or private (False) datasets. Default is True.

Returns

CatalogList

clear_filters()

Remove the current filters from this Catalog instance.

subscriptions(credentials=None)

Get all the subscriptions in the Catalog. You’ll get all the Dataset or Geography instances you have previously subscribed to.

Parameters

credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

Returns

Subscriptions

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

datasets_filter(filter_dataset)

Get all the datasets in the Catalog that match the given filter.

Returns

Dataset

class cartoframes.data.observatory.Category(data)

Bases: cartoframes.data.observatory.catalog.entity.CatalogEntity

This class represents a Category in the Catalog. Catalog datasets (Dataset class) are grouped by categories, so you can filter available datasets and geographies that belong (or are related) to a given Category.

Examples

List the available categories in the Catalog

>>> catalog = Catalog()
>>> categories = catalog.categories

Get a Category from the Catalog given its ID

>>> category = Category.get('demographics')
property datasets

Get the list of Dataset related to this category.

Returns

CatalogList List of Dataset instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

Examples

Get all the Dataset entities available in the catalog for a Category instance

>>> category = Category.get('demographics')
>>> datasets = category.datasets

Same example as above but using nested filters:

>>> catalog = Catalog()
>>> datasets = catalog.category('demographics').datasets

You can perform other operations with a CatalogList:

>>> catalog = Catalog()
>>> datasets = catalog.category('demographics').datasets
>>> # convert the list of datasets into a pandas DataFrame
>>> # for further filtering and exploration
>>> dataframe = datasets.to_dataframe()
>>> # get a dataset by ID or slug
>>> dataset = Dataset.get(A_VALID_ID_OR_SLUG)
property geographies

Get the list of Geography related to this category.

Returns

CatalogList List of Geography instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

Examples

Get all the Geography entities available in the catalog for a Category instance

>>> category = Category.get('demographics')
>>> geographies = category.geographies

Same example as above but using nested filters:

>>> catalog = Catalog()
>>> geographies = catalog.category('demographics').geographies

You can perform these other operations with a CatalogList:

>>> catalog = Catalog()
>>> geographies = catalog.category('demographics').geographies
>>> # convert the list of datasets into a pandas DataFrame
>>> # for further filtering and exploration
>>> dataframe = geographies.to_dataframe()
>>> # get a geography by ID or slug
>>> dataset = Geography.get(A_VALID_ID_OR_SLUG)
property name

Name of this category instance.

class cartoframes.data.observatory.Country(data)

Bases: cartoframes.data.observatory.catalog.entity.CatalogEntity

This class represents a Country in the Catalog. Catalog datasets (Dataset class) belong to a country, so you can filter available datasets and geographies that belong (or are related) to a given Country.

Examples

List the available countries in the Catalog

>>> catalog = Catalog()
>>> countries = catalog.countries

Get a Country from the Catalog given its ID

>>> # country ID is a lowercase ISO Alpha 3 Code
>>> country = Country.get('usa')
property datasets

Get the list of Dataset covering data for this country.

Returns

CatalogList List of Dataset instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

Examples

Get all the Dataset entities available in the catalog for a Country instance

>>> country = Country.get('usa')
>>> datasets = country.datasets

Same example as above but using nested filters:

>>> catalog = Catalog()
>>> datasets = catalog.country('usa').datasets

You can perform these other operations with a CatalogList:

>>> datasets = catalog.country('usa').datasets
>>> # convert the list of datasets into a pandas DataFrame
>>> # for further filtering and exploration
>>> dataframe = datasets.to_dataframe()
>>> # get a dataset by ID or slug
>>> dataset = Dataset.get(A_VALID_ID_OR_SLUG)
property geographies

Get the list of Geography covering data for this country.

Returns

CatalogList List of Geography instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no geographies are found.

Examples

Get all the Geography entities available in the catalog for a Country instance

>>> country = Country.get('usa')
>>> geographies = country.geographies

Same example as above but using nested filters:

>>> catalog = Catalog()
>>> geographies = catalog.country('usa').geographies

You can perform these other operations with a CatalogList:

>>> geographies = catalog.country('usa').geographies
>>> # convert the list of geographies into a pandas DataFrame
>>> # for further filtering and exploration
>>> dataframe = geographies.to_dataframe()
>>> # get a geography by ID or slug
>>> geography = Geography.get(A_VALID_ID_OR_SLUG)
property categories

Get the list of Category that are assigned to Dataset that cover data for this country.

Returns

CatalogList List of Category instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

Examples

Get all the Category entities available in the catalog for a Country instance

>>> country = Country.get('usa')
>>> categories = country.categories

Same example as above but using nested filters:

>>> catalog = Catalog()
>>> categories = catalog.country('usa').categories
class cartoframes.data.observatory.Dataset(data)

Bases: cartoframes.data.observatory.catalog.entity.CatalogEntity

A Dataset represents the metadata of a particular dataset in the catalog.

If you have Data Observatory enabled in your CARTO account you can:

  • Use any public dataset to enrich your data with the variables in it by means of the Enrichment functions.

  • Subscribe (Dataset.subscribe) to any premium dataset to get a license that grants you the right to enrich your data with the variables (Variable) in it.

See the enrichment guides for more information about datasets, variables and enrichment functions.

The metadata of a dataset allows you to understand the underlying data, from variables (the actual columns in the dataset, data types, etc.), to a description of the provider, source, country, geography available, etc.

See the attributes reference in this class to understand the metadata available for each dataset in the catalog.

Examples

There are many different ways to explore the available datasets in the catalog.

You can just list all the available datasets:

>>> catalog = Catalog()
>>> datasets = catalog.datasets

Since the catalog contains thousands of datasets, you can convert the list of datasets to a pandas DataFrame for further filtering:

>>> catalog = Catalog()
>>> dataframe = catalog.datasets.to_dataframe()

The catalog supports nested filters for a hierarchical exploration. This way you could list the datasets available for different hierarchies: country, provider, category, geography, or a combination of them.

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').geography('ags_blockgroup_1c63771c').datasets
property variables

Get the list of Variable that corresponds to this dataset. Variables are used in the Enrichment functions to augment your local DataFrames with columns from a Dataset in the Data Observatory.

Returns

CatalogList List of Variable instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog.

property variables_groups

Get the list of VariableGroup related to this dataset.

Returns

CatalogList List of VariableGroup instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog.

property name

Name of this dataset.

property description

Description of this dataset.

property provider

ID of the Provider of this dataset.

property provider_name

Name of the Provider of this dataset.

property category

Get the Category ID assigned to this dataset.

property category_name

Name of the Category assigned to this dataset.

property data_source

ID of the data source of this dataset.

property country

ISO 3166-1 alpha-3 code of the Country of this dataset. More info in: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3.

property language

ISO 639-3 code of the language that corresponds to the data of this dataset. More info in: https://en.wikipedia.org/wiki/ISO_639-3.

property geography

Get the Geography ID associated to this dataset.

property geography_name

Get the name of the Geography associated to this dataset.

property geography_description

Description of the Geography associated to this dataset.

property temporal_aggregation

Time amount in which data is aggregated in this dataset.

This is a free text field in this form: seconds, daily, hourly, monthly, yearly, etc.

property time_coverage

Time range that covers the data of this dataset.

Returns

List of str

Example: [2015-01-01,2016-01-01)

property update_frequency

Frequency in which the dataset is updated.

Returns

str

Example: monthly, yearly, etc.

property version

Internal version info of this dataset.

Returns

str

property is_public_data

Allows you to check whether the content of this dataset can be accessed with public credentials or whether it is a premium dataset that needs a subscription.

Returns

  • True if the dataset is public

  • False if the dataset is premium

    (it requires a subscription via Dataset.subscribe)

Return type

A boolean value

property summary

JSON object with extra metadata that summarizes different properties of the dataset content.

head()

Returns a sample of the first 10 rows of the dataset data.

If a dataset has fewer than 10 rows (e.g., zip codes of small countries), this method will return None

Returns

pandas.DataFrame

tail()

Returns the last 10 rows of the dataset data.

If a dataset has fewer than 10 rows (e.g., zip codes of small countries), this method will return None

Returns

pandas.DataFrame

counts()

Returns a summary of different counts over the actual dataset data.

Returns

pandas.Series

Example

# rows:         number of rows in the dataset
# cells:        number of cells in the dataset (rows * columns)
# null_cells:   number of cells with null value in the dataset
# null_cells_percent:   percent of cells with null value in the dataset
fields_by_type()

Returns a summary of the number of columns per data type in the dataset.

Returns

pandas.Series

Example

# float        number of columns with type float in the dataset
# string       number of columns with type string in the dataset
# integer      number of columns with type integer in the dataset
geom_coverage()

Shows a map to visualize the geographical coverage of the dataset.

Returns

Map

describe(autoformat=True)

Shows a summary of the actual stats of the variables (columns) of the dataset. Some of the stats provided per variable are: avg, max, min, sum, range, stdev, q1, q3, median and interquartile_range

Parameters

autoformat (boolean) – set automatic format for values. Default is True.

Returns

pandas.DataFrame

Example

# avg                    average value
# max                    max value
# min                    min value
# sum                    sum of all values
# range
# stdev                  standard deviation
# q1                     first quantile
# q3                     third quantile
# median                 median value
# interquartile_range
classmethod get_all(filters=None, credentials=None)

Get all the Dataset instances that comply with the indicated filters (or all of them if no filters are passed). If credentials are given, only the datasets granted for those credentials are returned.

Parameters
  • credentials (Credentials, optional) – credentials of CARTO user account. If provided, only datasets granted for those credentials are returned.

  • filters (dict, optional) – Dict containing pairs of dataset properties and its value to be used as filters to query the available datasets. If none is provided, no filters will be applied to the query.

Returns

CatalogList List of Dataset instances.

Raises
  • CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

  • DOError – if DO is not enabled.

to_csv(file_path, credentials=None, limit=None, order_by=None, sql_query=None, add_geom=None)

Download dataset data as a local CSV file. You need Data Observatory enabled in your CARTO account; please contact us at support@carto.com for more information.

For premium datasets (those with is_public_data set to False), you need a subscription to the dataset. Check the subscription guides for more information.

Parameters
  • file_path (str) – the file path where the dataset will be saved (CSV).

  • credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

  • limit (int, optional) – The number of rows to download. Default is to download all rows.

  • order_by (str, optional) – Field(s) used to order the rows to download. Default is unordered.

  • sql_query (str, optional) – a query to select, filter or aggregate the content of the dataset. For instance, to download just one row: select * from $dataset$ limit 1. The placeholder $dataset$ is mandatory and it will be replaced by the actual dataset before running the query. You can build any arbitrary query.

  • add_geom (boolean, optional) – whether to include the geography when using the sql_query argument. Defaults to True.

Raises
  • DOError – if you do not have a valid license for the dataset being downloaded, DO is not enabled, or there is an issue downloading the data.

  • ValueError – if the credentials argument is not valid.
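
Example

A minimal sketch, assuming default credentials have been set and a subscription (if required) to the dataset shown, whose ID is illustrative:

>>> from cartoframes.data.observatory import Dataset
>>> dataset = Dataset.get('acs_sociodemogr_b758e778')
>>> dataset.to_csv('dataset.csv')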

to_dataframe(credentials=None, limit=None, order_by=None, sql_query=None, add_geom=None)

Download dataset data as a geopandas.GeoDataFrame. You need Data Observatory enabled in your CARTO account; please contact us at support@carto.com for more information.

For premium datasets (those with is_public_data set to False), you need a subscription to the dataset. Check the subscription guides for more information.

Parameters
  • credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

  • limit (int, optional) – The number of rows to download. Default is to download all rows.

  • order_by (str, optional) – Field(s) used to order the rows to download. Default is unordered.

  • sql_query (str, optional) – a query to select, filter or aggregate the content of the dataset. For instance, to download just one row: select * from $dataset$ limit 1. The placeholder $dataset$ is mandatory and it will be replaced by the actual dataset before running the query. You can build any arbitrary query.

  • add_geom (boolean, optional) – whether to include the geography when using the sql_query argument. Defaults to True.

Returns

geopandas.GeoDataFrame

Raises
  • DOError – if you do not have a valid license for the dataset being downloaded, DO is not enabled, or there is an issue downloading the data.

  • ValueError – if the credentials argument is not valid.
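
Example

A minimal sketch, assuming default credentials have been set and a subscription (if required) to the dataset shown, whose ID is illustrative:

>>> from cartoframes.data.observatory import Dataset
>>> dataset = Dataset.get('acs_sociodemogr_b758e778')
>>> gdf = dataset.to_dataframe(limit=100)
>>> gdf.head()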

subscribe(credentials=None)

Subscribe to a dataset. You need Data Observatory enabled in your CARTO account; please contact us at support@carto.com for more information.

Datasets with is_public_data set to True do not need a license (i.e., a subscription) to be used. Datasets with is_public_data set to False do need a license (i.e., a subscription) to be used. You’ll get a license to use this dataset depending on the estimated_delivery_days set for this specific dataset.

See subscription_info for more info

Once you subscribe to a dataset, you can download its data by Dataset.to_csv or Dataset.to_dataframe and use the Enrichment functions. See the enrichment guides for more info.

You can check the status of your subscriptions by calling the subscriptions method in the Catalog with your CARTO Credentials.

Parameters

credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

Raises
  • CatalogError – if there’s a problem when connecting to the catalog.

  • DOError – if DO is not enabled.
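
Example

A minimal sketch, assuming default credentials have been set (the dataset ID is illustrative):

>>> from cartoframes.data.observatory import Dataset
>>> Dataset.get('acs_sociodemogr_b758e778').subscribe()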

subscription_info(credentials=None)

Get the subscription information of a Dataset, which includes the license, Terms of Service, rights, price, and estimated time of delivery, among other metadata of interest during the Dataset.subscription process.

Parameters

credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

Returns

SubscriptionInfo SubscriptionInfo instance.

Raises
  • CatalogError – if there’s a problem when connecting to the catalog.

  • DOError – if DO is not enabled.

class cartoframes.data.observatory.Geography(data)

Bases: cartoframes.data.observatory.catalog.entity.CatalogEntity

A Geography represents the metadata of a particular geography dataset in the catalog.

If you have Data Observatory enabled in your CARTO account you can:

  • Use any public geography to enrich your data with the variables in it by means of the Enrichment functions.

  • Subscribe (Geography.subscribe) to any premium geography to get a license that grants you the right to enrich your data with the variables in it.

See the enrichment guides for more information about geographies, variables, and enrichment functions.

The metadata of a geography allows you to understand the underlying data, from variables (the actual columns in the geography, data types, etc.), to a description of the provider, source, country, geography available, etc.

See the attributes reference in this class to understand the metadata available for each geography in the catalog.

Examples

There are many different ways to explore the available geographies in the catalog.

You can just list all the available geographies:

>>> catalog = Catalog()
>>> geographies = catalog.geographies

Since the catalog contains thousands of geographies, you can convert the list of geographies to a pandas DataFrame for further filtering:

>>> catalog = Catalog()
>>> dataframe = catalog.geographies.to_dataframe()

The catalog supports nested filters for a hierarchical exploration. This way you could list the geographies available for different hierarchies: country, provider, category or a combination of them.

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').geographies

Usually you use a geography ID as an intermediate filter to get a list of datasets with aggregate data for that geographical resolution:

>>> catalog = Catalog()
>>> catalog.country('usa').category('demographics').geography('ags_blockgroup_1c63771c').datasets
property datasets

Get the list of datasets related to this geography.

Returns

CatalogList List of Dataset instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

property name

Name of this geography.

property description

Description of this geography.

property country

Code (ISO 3166-1 alpha-3) of the country of this geography.

property language

Code (ISO 639-3) of the language that corresponds to the data of this geography.

property provider

ID of the Provider of this geography.

property provider_name

Name of the Provider of this geography.

property geom_coverage

Geographical coverage geometry encoded in WKB.

property geom_type

Info about the type of geometry of this geography.

property update_frequency

Frequency in which the geography data is updated.

Example: monthly, yearly, etc.

property version

Internal version info of this geography.

property is_public_data

Allows you to check whether the content of this geography can be accessed with public credentials or whether it is a premium geography that requires a subscription.

Returns

  • True if the geography is public

  • False if the geography is premium

    (a subscription via Geography.subscribe is required)

Return type

A boolean value

property summary

dict with extra metadata that summarizes different properties of the geography content.

classmethod get_all(filters=None, credentials=None)

Get all the Geography instances that comply with the indicated filters (or all of them if no filters are passed). If credentials are given, only the geographies granted for those credentials are returned.

Parameters
  • credentials (Credentials, optional) – credentials of CARTO user account. If provided, only geographies granted for those credentials are returned.

  • filters (dict, optional) – Dict containing pairs of geography properties and its value to be used as filters to query the available geographies. If none is provided, no filters will be applied to the query.

Returns

CatalogList List of Geography instances.

Raises
  • CatalogError – if there’s a problem when connecting to the catalog or no geographies are found.

  • DOError – if DO is not enabled.
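
Example

A sketch of listing only the geographies granted to a set of credentials; the username and API key are placeholders:

>>> credentials = Credentials('johnsmith', 'abcdefg')
>>> geographies = Geography.get_all(credentials=credentials)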

to_csv(file_path, credentials=None, limit=None, order_by=None, sql_query=None)

Download geography data as a local CSV file. You need Data Observatory enabled in your CARTO account; please contact us at support@carto.com for more information.

For premium geographies (those with is_public_data set to False), you need a subscription to the geography. Check the subscription guides for more information.

Parameters
  • file_path (str) – the file path where the dataset will be saved (CSV format).

  • credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

  • limit (int, optional) – The number of rows to download. Default is to download all rows.

  • order_by (str, optional) – Field(s) used to order the rows to download. Default is unordered.

  • sql_query (str, optional) – a query to select, filter or aggregate the content of the geography dataset. For instance, to download just one row: select * from $geography$ limit 1. The placeholder $geography$ is mandatory and it will be replaced by the actual geography dataset before running the query. You can build any arbitrary query.

Raises
  • DOError – if you do not have a valid license for the geography being downloaded, DO is not enabled, or there is an issue downloading the data.

  • ValueError – if the credentials argument is not valid.
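
Example

A sketch, assuming default credentials are set and you have a subscription to the geography (the geography ID is illustrative):

>>> geography = Geography.get('ags_blockgroup_1c63771c')
>>> geography.to_csv('geography.csv', limit=1000)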

to_dataframe(credentials=None, limit=None, order_by=None, sql_query=None)

Download geography data as a pandas.DataFrame. You need Data Observatory enabled in your CARTO account; please contact us at support@carto.com for more information.

For premium geographies (those with is_public_data set to False), you need a subscription to the geography. Check the subscription guides for more information.

Parameters
  • credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

  • limit (int, optional) – The number of rows to download. Default is to download all rows.

  • order_by (str, optional) – Field(s) used to order the rows to download. Default is unordered.

  • sql_query (str, optional) – a query to select, filter or aggregate the content of the geography dataset. For instance, to download just one row: select * from $geography$ limit 1. The placeholder $geography$ is mandatory and it will be replaced by the actual geography dataset before running the query. You can build any arbitrary query.

Returns

pandas.DataFrame

Raises
  • DOError – if you do not have a valid license for the geography being downloaded, DO is not enabled, or there is an issue downloading the data.

  • ValueError – if the credentials argument is not valid.
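
Example

A sketch that uses the sql_query parameter to download only part of the geography; the geography ID is illustrative:

>>> geography = Geography.get('ags_blockgroup_1c63771c')
>>> df = geography.to_dataframe(sql_query='select * from $geography$ limit 100')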

subscribe(credentials=None)

Subscribe to a Geography. You need Data Observatory enabled in your CARTO account; please contact us at support@carto.com for more information.

Geographies with is_public_data set to True do not need a license (i.e. a subscription) to be used, while geographies with is_public_data set to False do. You'll get a license to use this geography depending on the estimated_delivery_days set for this specific geography.

See subscription_info for more information.

Once you Geography.subscribe to a geography, you can download its data with Geography.to_csv or Geography.to_dataframe and use the enrichment functions. See the enrichment guides for more information.

You can check the status of your subscriptions by calling the subscriptions method in the Catalog with your CARTO credentials.

Parameters

credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

Raises
  • CatalogError – if there’s a problem when connecting to the catalog.

  • DOError – if DO is not enabled.
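
Example

A sketch, assuming default credentials are set with set_default_credentials (the geography ID is illustrative):

>>> Geography.get('ags_blockgroup_1c63771c').subscribe()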

subscription_info(credentials=None)

Get the subscription information of a Geography, which includes the license, Terms of Service, rights, price, and estimated time of delivery, among other metadata of interest during the subscription process.

Parameters

credentials (Credentials, optional) – credentials of CARTO user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

Returns

A SubscriptionInfo instance.

Raises
  • CatalogError – if there’s a problem when connecting to the catalog.

  • DOError – if DO is not enabled.

class cartoframes.data.observatory.Provider(data)

Bases: cartoframes.data.observatory.catalog.entity.CatalogEntity

This class represents a Provider of datasets and geographies in the Catalog.

Examples

List the available providers in the Catalog in combination with nested filters (categories, countries, etc.)

>>> providers = Provider.get_all()

Get a Provider from the Catalog given its ID

>>> catalog = Catalog()
>>> provider = catalog.provider('mrli')
property datasets

Get the list of datasets related to this provider.

Returns

CatalogList List of Dataset instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

Examples

>>> provider = Provider.get('mrli')
>>> datasets = provider.datasets

Same example as above but using nested filters:

>>> catalog = Catalog()
>>> datasets = catalog.provider('mrli').datasets
property name

Name of this provider.

class cartoframes.data.observatory.Variable(data)

Bases: cartoframes.data.observatory.catalog.entity.CatalogEntity

This class represents a Variable of datasets in the Catalog.

Variables contain column names, description, data type, aggregation method, and some other metadata that is useful to understand the underlying data inside a Dataset.

Examples

List the variables of a Dataset in combination with nested filters (categories, countries, etc.)

>>> dataset = Dataset.get('mbi_retail_turn_705247a')
>>> dataset.variables
[<Variable.get('RT_CI_95050c10')> #'Retail Turnover: index (country eq.100)', ...]
property datasets

Get the list of datasets related to this variable.

Returns

CatalogList List of Dataset instances.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no datasets are found.

property name

Name of this variable.

property description

Description of this variable.

property column_name

Column name of the actual table related to the variable in the Dataset.

property db_type

Type in the database.

Returns

str

Examples: INTEGER, STRING, FLOAT, GEOGRAPHY, JSON, BOOL, etc.

property dataset

ID of the Dataset to which this variable belongs.

property agg_method

Text representing a description of the aggregation method used to compute the values in this Variable.

property variable_group

If any, ID of the variable group to which this variable belongs.

property summary

JSON object with extra metadata that summarizes different properties of this variable.

property project_name

property schema_name

property dataset_name

describe(autoformat=True)

Shows a summary of the actual stats of the variable (column) of the dataset. Some of the stats provided per variable are: avg, max, min, sum, range, stdev, q1, q3, median and interquartile_range.

Parameters

autoformat (boolean) – set automatic format for values. Default is True.

Example

# avg                    average value
# max                    max value
# min                    min value
# sum                    sum of all values
# range                  difference between the max and min values
# stdev                  standard deviation
# q1                     first quartile
# q3                     third quartile
# median                 median value
# interquartile_range    difference between the third and first quartiles
head()

Returns a sample of the first 10 values of the variable data.

For datasets with fewer than 10 rows (e.g. zip codes of small countries), this method won't return anything.

tail()

Returns a sample of the last 10 values of the variable data.

For datasets with fewer than 10 rows (e.g. zip codes of small countries), this method won't return anything.

counts()

Returns a summary of different counts over the actual variable values.

Example

# all               total number of values
# null              total number of null values
# zero              number of zero-valued entries
# extreme           number of values 3 stdev outside the interquartile range
# distinct          number of distinct (unique) entries
# outliers          number of outliers (outside 1.5 stdev of the interquartile range)
# zero_percent      percent of values that are zero
# distinct_percent  percent of values that are distinct
quantiles()

Returns the quantiles of the variable data.

top_values()

Returns information about the top values of the variable data.

histogram()

Plots a histogram of the variable data.
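
Examples

A sketch of a typical variable exploration flow; it assumes your credentials grant access to the variable, and the variable ID is illustrative:

>>> variable = Variable.get('RT_CI_95050c10')
>>> variable.describe()
>>> variable.head()
>>> variable.counts()
>>> variable.histogram()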

class cartoframes.data.observatory.Enrichment(credentials=None)

Bases: cartoframes.data.observatory.enrichment.enrichment_service.EnrichmentService

This is the main class to enrich your own data with data from the Data Observatory

To be able to use the Enrichment functions you need a CARTO account with Data Observatory v2 enabled. Contact us at support@carto.com for more information.

Please, see the Catalog discovery and subscription guides, to understand how to explore the Data Observatory repository and subscribe to premium datasets to be used in your enrichment workflows.

Parameters

credentials (Credentials, optional) – credentials of the user account. If not provided, the default credentials (if set with set_default_credentials) will be used.

enrich_points(dataframe, variables, geom_col=None, filters=None)

Enrich your points DataFrame with columns (Variable) from one or more Dataset in the Data Observatory, intersecting the points in the source DataFrame with the geographies in the Data Observatory.

Extra columns such as area and population will be provided in the resulting DataFrame for normalization purposes.

Parameters
  • dataframe (pandas.DataFrame, geopandas.GeoDataFrame) – a DataFrame instance to be enriched.

  • variables (Variable, list, str) – variable ID, slug or Variable instance or list of variable IDs, slugs or Variable instances taken from the Data Observatory Catalog.

  • geom_col (str, optional) – string indicating the geometry column name in the source DataFrame.

  • filters (dict, optional) – dictionary to filter results by variable values. As a key it receives the variable id, and as value receives a SQL operator, for example: {variable1.id: “> 30”}. It works by appending the filter SQL operators to the WHERE clause of the resulting enrichment SQL with the AND operator (in the example: WHERE {variable1.column_name} > 30). If you want to filter the same variable several times you can use a list as a dict value: {variable1.id: [”> 30”, “< 100”]}. The variables used to filter results should exist in variables property list.

Returns

A geopandas.GeoDataFrame enriched with the variables passed as argument.

Raises

EnrichmentError – if there is an error in the enrichment process.

Note that if the points of the `dataframe` you provide are contained in more than one geometry in the enrichment dataset, the number of rows of the returned `GeoDataFrame` could differ from the number of rows of the `dataframe` argument.

Examples

Enrich a points DataFrame with Catalog classes:

>>> df = pandas.read_csv('path/to/local/csv')
>>> variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> gdf_enrich = Enrichment().enrich_points(df, variables, geom_col='the_geom')

Enrich a points dataframe with several Variables using their ids:

>>> df = pandas.read_csv('path/to/local/csv')
>>> all_variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> variables = all_variables[:2]
>>> gdf_enrich = Enrichment().enrich_points(df, variables, geom_col='the_geom')

Enrich a points dataframe with filters:

>>> df = pandas.read_csv('path/to/local/csv')
>>> variable = Catalog().country('usa').category('demographics').datasets[0].variables[0]
>>> filters = {variable.id: "= '2019-09-01'"}
>>> gdf_enrich = Enrichment().enrich_points(
...     df,
...     variables=[variable],
...     filters=filters,
...     geom_col='the_geom')
enrich_polygons(dataframe, variables, geom_col=None, filters=None, aggregation='default')

Enrich your polygons DataFrame with columns (Variable) from one or more Dataset in the Data Observatory by intersecting the polygons in the source DataFrame with geographies in the Data Observatory.

When a polygon intersects with multiple geographies, the proportional part of the intersection will be used to interpolate the quantity of the polygon value intersected, aggregating them. Most Variable instances have a Variable.agg_method property, which is used by default as the aggregation function, but you can overwrite it using the aggregation parameter (or even skip the aggregation entirely). If a variable does not have the agg_method property set and you do not overwrite it (with the aggregation parameter), the variable column will be skipped from the enrichment.

Parameters
  • dataframe (pandas.DataFrame, geopandas.GeoDataFrame) – a DataFrame instance to be enriched.

  • variables (Variable, list, str) – variable ID, slug or Variable instance or list of variable IDs, slugs or Variable instances taken from the Data Observatory Catalog.

  • geom_col (str, optional) – string indicating the geometry column name in the source DataFrame.

  • filters (dict, optional) – dictionary to filter results by variable values. As a key it receives the variable id, and as value receives a SQL operator, for example: {variable1.id: “> 30”}. It works by appending the filter SQL operators to the WHERE clause of the resulting enrichment SQL with the AND operator (in the example: WHERE {variable1.column_name} > 30). If you want to filter the same variable several times you can use a list as a dict value: {variable1.id: [”> 30”, “< 100”]}. The variables used to filter results should exist in variables property list.

  • aggregation (None, str, list, optional) –

    sets the data aggregation. The polygons in the source DataFrame can intersect with one or more polygons from the Data Observatory. With this method you can select how to aggregate the resulting data.

    An aggregation method can be one of these values: ‘MIN’, ‘MAX’, ‘SUM’, ‘AVG’, ‘COUNT’, ‘ARRAY_AGG’, ‘ARRAY_CONCAT_AGG’, ‘STRING_AGG’ but check this documentation for a complete list of aggregate functions.

    The options are:

    • ‘default’ (str, default): most Variable instances have a default aggregation method in the Variable.agg_method property and it will be used to aggregate the data (a variable may not have agg_method defined; in that case, the variable will be skipped).

    • None: use this option to do the aggregation locally by yourself. You will receive a row of data from each polygon intersected, together with the areas of the intersections and the intersected polygons.

    • str: if you want to overwrite every default aggregation method, you can pass a string with the aggregation method to use.

    • dict: if you want to overwrite some default aggregation methods of your selected variables, use a dict of Variable.id: aggregation method pairs, for example: {variable1.id: ‘SUM’, variable3.id: ‘AVG’}. If you want to use several aggregation methods for one variable, you can use a list as the dict value: {variable1.id: [‘SUM’, ‘AVG’], variable3.id: ‘AVG’}.

Returns

A geopandas.GeoDataFrame enriched with the variables passed as argument.

Raises

EnrichmentError – if there is an error in the enrichment process.

Note that if the geometry of the `dataframe` you provide intersects with more than one geometry in the enrichment dataset, the number of rows of the returned `GeoDataFrame` could differ from the number of rows of the `dataframe` argument.

Examples

Enrich a polygons dataframe with one Variable:

>>> df = pandas.read_csv('path/to/local/csv')
>>> variable = Catalog().country('usa').category('demographics').datasets[0].variables[0]
>>> variables = [variable]
>>> gdf_enrich = Enrichment().enrich_polygons(df, variables, geom_col='the_geom')

Enrich a polygons dataframe with all Variables from a Catalog Dataset:

>>> df = pandas.read_csv('path/to/local/csv')
>>> variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> gdf_enrich = Enrichment().enrich_polygons(df, variables, geom_col='the_geom')

Enrich a polygons dataframe with several Variables using their ids:

>>> df = pandas.read_csv('path/to/local/csv')
>>> all_variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> variables = [all_variables[0].id, all_variables[1].id]
>>> gdf_enrich = Enrichment().enrich_polygons(df, variables, geom_col='the_geom')

Enrich a polygons dataframe with filters:

>>> df = pandas.read_csv('path/to/local/csv')
>>> variable = Catalog().country('usa').category('demographics').datasets[0].variables[0]
>>> filters = {variable.id: "= '2019-09-01'"}
>>> gdf_enrich = Enrichment().enrich_polygons(
...     df,
...     variables=[variable],
...     filters=filters,
...     geom_col='the_geom')

Enrich a polygons dataframe overwriting every variables aggregation method to use SUM function:

>>> df = pandas.read_csv('path/to/local/csv')
>>> all_variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> variables = all_variables[:3]
>>> gdf_enrich = Enrichment().enrich_polygons(
...     df,
...     variables,
...     aggregation='SUM',
...     geom_col='the_geom')

Enrich a polygons dataframe overwriting some of the variables aggregation methods:

>>> df = pandas.read_csv('path/to/local/csv')
>>> all_variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> variable1 = all_variables[0]  # variable1.agg_method is 'AVG' but you want 'SUM'
>>> variable2 = all_variables[1]  # variable2.agg_method is 'AVG' and it is what you want
>>> variable3 = all_variables[2]  # variable3.agg_method is 'SUM' but you want 'AVG'
>>> variables = [variable1, variable2, variable3]
>>> aggregation = {
...     variable1.id: 'SUM',
...     variable3.id: 'AVG'
... }
>>> gdf_enrich = Enrichment().enrich_polygons(
...     df,
...     variables,
...     aggregation=aggregation,
...     geom_col='the_geom')

Enrich a polygons dataframe using several aggregation methods for a variable:

>>> df = pandas.read_csv('path/to/local/csv')
>>> all_variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> variable1 = all_variables[0]  # variable1.agg_method is 'AVG' but you want 'SUM' and 'AVG'
>>> variable2 = all_variables[1]  # variable2.agg_method is 'AVG' and it is what you want
>>> variable3 = all_variables[2]  # variable3.agg_method is 'SUM' but you want 'AVG'
>>> variables = [variable1, variable2, variable3]
>>> aggregation = {
...     variable1.id: ['SUM', 'AVG'],
...     variable3.id: 'AVG'
... }
>>> gdf_enrich = Enrichment().enrich_polygons(df, variables, aggregation=aggregation)

Enrich a polygons dataframe without aggregating variables (because you want to do it yourself, for example, using your own custom function for aggregating the data):

>>> df = pandas.read_csv('path/to/local/csv')
>>> all_variables = Catalog().country('usa').category('demographics').datasets[0].variables
>>> variables = all_variables[:3]
>>> gdf_enrich = Enrichment().enrich_polygons(
...     df,
...     variables,
...     aggregation=None,
...     geom_col='the_geom')

The next example uses filters to calculate the SUM of the car-free households Variable of the Catalog for each polygon of the my_local_dataframe pandas DataFrame, only for areas with more than 100 car-free households:

>>> variable = Variable.get('no_cars_d19dfd10')
>>> gdf_enrich = Enrichment().enrich_polygons(
...     my_local_dataframe,
...     variables=[variable],
...     aggregation={variable.id: 'SUM'},
...     filters={variable.id: '> 100'},
...     geom_col='the_geom')
class cartoframes.data.observatory.Subscriptions(credentials)

Bases: object

This class is used to list the datasets and geographies you have acquired a subscription (or valid license) for.

This class won’t show any dataset or geography tagged in the catalog as is_public_data since those data do not require a subscription.

property datasets

List of Dataset you have a subscription for.

Raises

CatalogError – if there’s a problem when connecting to the catalog.

property geographies

List of Geography you have a subscription for.

Raises

CatalogError – if there’s a problem when connecting to the catalog.
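
Example

A sketch of inspecting your subscriptions through the Catalog, as mentioned in the subscribe methods above; the way credentials are passed to the subscriptions method is assumed here, and the username and API key are placeholders:

>>> credentials = Credentials('johnsmith', 'abcdefg')
>>> subscriptions = Catalog().subscriptions(credentials)
>>> subscriptions.datasets
>>> subscriptions.geographies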

class cartoframes.data.observatory.SubscriptionInfo(raw_data)

Bases: object

This class represents the SubscriptionInfo of a dataset or geography in the Catalog.

It contains private metadata (you need a CARTO account to query it) that is useful when you want a subscription license for a specific dataset or geography.

property id

The ID of the dataset or geography.

property estimated_delivery_days

Estimated days in which, once you Dataset.subscribe or Geography.subscribe, you’ll get a license.

Your licensed datasets and geographies will be returned by the catalog.subscriptions method.

For the datasets and geographies listed in the catalog.subscriptions method you can:

  • Dataset.download or Geography.download

  • Use their Dataset.variables in the Enrichment functions

property subscription_list_price

Price in $ for a one-year subscription to this dataset.

property tos

Legal Terms Of Service.

property tos_link

Link to additional information for the legal Terms Of Service.

property licenses

Description of the licenses.

property licenses_link

Link to additional information about the available licenses.

property rights

Rights over the dataset or geography when you buy a license by means of a subscription.
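
Example

A sketch of reviewing the legal terms and licenses described by the properties above before subscribing; the dataset ID is illustrative:

>>> info = Dataset.get('mbi_retail_turn_705247a').subscription_info()
>>> info.tos
>>> info.licenses
>>> info.rights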

class cartoframes.data.observatory.CatalogEntity(data)

Bases: abc.ABC

This is an internal class that the rest of the catalog discovery classes extend.

It contains:
  • Properties: id, slug (a shorter ID).

  • Static methods: get, get_all, get_list to retrieve elements or lists of objects in the catalog such as datasets, categories, variables, etc.

  • Instance methods to convert to pandas Series, Python dict, compare instances, etc.

As a rule of thumb you don't use this class directly; it is documented for inheritance purposes.

id_field = 'id'

export_excluded_fields = ['summary_json', 'geom_coverage']

property id

The ID of the entity.

property slug

The slug (short ID) of the entity.

classmethod get(id_)

Get an instance of an entity by ID or slug.

Parameters

id (str) – ID or slug of a catalog entity.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no entities are found.

classmethod get_all(filters=None)

List all instances of an entity.

Parameters

filters (dict, optional) – Dict containing pairs of entity properties and its value to be used as filters to query the available entities. If none is provided, no filters will be applied to the query.

classmethod get_list(id_list)

Get a list of instances of an entity by a list of IDs or slugs.

Parameters

id_list (list) – List of IDs or slugs of the catalog entities to retrieve.

Raises

CatalogError – if there’s a problem when connecting to the catalog or no entities are found.

to_series()

Converts the entity instance to a pandas Series.

to_dict()

Converts the entity instance to a Python dict.

is_subscribed(credentials, entity_type)

Check if the entity is subscribed.

class cartoframes.data.observatory.CatalogList(data)

Bases: list

This is an internal class that represents a list of entities in the catalog of the same type.

It contains:
  • Instance methods to get an instance of the entity by ID and to convert the list to a pandas DataFrame for further filtering and exploration.

As a rule of thumb you don't use this class directly; it is documented for inheritance purposes.

to_dataframe()

Converts a list to a pandas DataFrame.

Examples

>>> catalog = Catalog()
>>> catalog.categories.to_dataframe()

Data Clients

class cartoframes.data.clients.SQLClient(credentials=None)

Bases: object

SQLClient class is a client to run SQL queries in a CARTO account. It also provides basic SQL utilities for analyzing and managing tables.

Parameters

credentials (Credentials) – A Credentials instance can be used in place of a username|base_url / api_key combination.

Example

>>> sql = SQLClient(credentials)
query(query, verbose=False)

Run a SQL query. It returns a list with the content of the response. If the verbose param is True, it returns the full SQL response in a dict. For more information, check the SQL API documentation: https://carto.com/developers/sql-api/reference/#tag/Single-SQL-Statement.

Parameters
  • query (str) – SQL query.

  • verbose (bool, optional) – flag to return all the response. Default False.

Example

>>> sql.query('SELECT * FROM table_name')
execute(query)

Run a long-running query. It returns an object with the status and information of the job. For more information, check the Batch API documentation: https://carto.com/developers/sql-api/reference/#tag/Batch-Queries.

Parameters

query (str) – SQL query.

Example

>>> sql.execute('DROP TABLE table_name')
distinct(table_name, column_name)

Get the distinct values and their counts in a table for a specific column.

Parameters
  • table_name (str) – name of the table.

  • column_name (str) – name of the column.

Example

>>> sql.distinct('table_name', 'column_name')
[('value1', 10), ('value2', 5)]
count(table_name)

Get the number of elements of a table.

Parameters

table_name (str) – name of the table.

Example

>>> sql.count('table_name')
15
bounds(table_name)

Get the bounds of the geometries in a table.

Parameters

table_name (str) – name of the table containing a “the_geom” column.

Example

>>> sql.bounds('table_name')
[[-1,-1], [1,1]]
schema(table_name, raw=False)

Show information about the schema of a table.

Parameters
  • table_name (str) – name of the table.

  • raw (bool, optional) – return raw dict data if set to True. Default False.

Example

>>> sql.schema('table_name')
Column name          Column type
-------------------------------------
cartodb_id           number
the_geom             geometry
the_geom_webmercator geometry
column1              string
column2              number
describe(table_name, column_name)

Show information about a column in a specific table. It returns the COUNT of the table. If the column type is number it also returns the AVG, MIN and MAX.

Parameters
  • table_name (str) – name of the table.

  • column_name (str) – name of the column.

Example

>>> sql.describe('table_name', 'column_name')
count     1.00e+03
avg       2.00e+01
min       0.00e+00
max       5.00e+01
type: number
create_table(table_name, columns_types, if_exists='fail', cartodbfy=True)

Create a table with a specific table name and columns.

Parameters
  • table_name (str) – name of the table.

  • columns_types (dict) – dictionary with the column names and types.

  • if_exists (str, optional) – collision strategy if the table already exists in CARTO. Options are ‘fail’ or ‘replace’. Default ‘fail’.

  • cartodbfy (bool, optional) – convert the table to CARTO format. Default True. More info here: https://carto.com/developers/sql-api/guides/creating-tables/#create-tables.

Example

>>> sql.create_table('table_name', {'column1': 'text', 'column2': 'integer'})
insert_table(table_name, columns_values)

Insert a row to the table.

Parameters
  • table_name (str) – name of the table.

  • columns_values (dict) – dictionary with the column names and values.

Example

>>> sql.insert_table('table_name', {'column1': ['value1', 'value2'], 'column2': [1, 2]})
update_table(table_name, column_name, column_value, condition)

Update the column’s value for the rows that match the condition.

Parameters
  • table_name (str) – name of the table.

  • column_name (str) – name of the column.

  • column_value (str) – value of the column.

  • condition (str) – “where” condition of the request.

Example

>>> sql.update_table('table_name', 'column1', 'VALUE1', "column1='value1'")
rename_table(table_name, new_table_name)

Rename a table from its table name.

Parameters
  • table_name (str) – name of the original table.

  • new_table_name (str) – name of the new table.

Example

>>> sql.rename_table('table_name', 'table_name2')
drop_table(table_name)

Remove a table from its table name.

Parameters

table_name (str) – name of the table.

Example

>>> sql.drop_table('table_name')

Viz

Viz namespace contains all the classes to create visualizations based on data.

Map

class cartoframes.viz.Map(layers=None, basemap='Positron', bounds=None, size=None, viewport=None, show_info=None, theme=None, title=None, description=None, is_static=None, layer_selector=False, **kwargs)

Bases: object

Map to display a data visualization. It can contain one or multiple Layer instances. It provides control of the basemap, bounds and properties of the visualization.

Parameters
  • layers (list of Layer) – List of layers. Zero or more of Layer.

  • basemap (str, optional) –

    • if a str, name of a CARTO vector basemap. One of positron, voyager, or darkmatter from the BaseMaps class, or a hex, rgb or named color value.

    • if a dict, Mapbox or other style as the value of the style key. If a Mapbox style, the access token is the value of the token key.

  • bounds (dict or list, optional) – a dict with west, south, east, north keys, or an array of floats in the following structure: [[west, south], [east, north]]. If not provided the bounds will be automatically calculated to fit all features.

  • size (tuple, optional) – a (width, height) pair for the size of the map. Default is (1024, 632).

  • viewport (dict, optional) – Properties for display of the map viewport. Keys can be bearing or pitch.

  • show_info (bool, optional) – Whether to display center and zoom information in the map or not. It is False by default.

  • is_static (bool, optional) – Default False. If True, instead of showing an interactive map, a PNG image will be displayed. Warning: UI components are not properly rendered in the static view; we recommend removing legends and widgets before rendering a static map.

  • theme (string, optional) – Use a different UI theme (legends, widgets, popups). Available themes are dark and light. By default, it is light for the Positron and Voyager basemaps and dark for the DarkMatter basemap.

  • title (string, optional) – Title to label the map; it will be displayed in the default legend.

  • description (string, optional) – Text that describes the map and will be displayed in the default legend after the title.

Raises

ValueError – if input parameters are not valid.

Examples

Basic usage.

>>> Map(Layer('table in your account'))

Display more than one layer on a map.

>>> Map(layers=[
...     Layer('table1'),
...     Layer('table2')
... ])

Change the CARTO basemap style.

>>> Map(Layer('table in your account'), basemap=basemaps.darkmatter)

Choose a custom basemap style. Here we use the Mapbox streets style, which requires an access token.

>>> basemap = {
...     'style': 'mapbox://styles/mapbox/streets-v9',
...     'token': 'your Mapbox token'
... }
>>> Map(Layer('table in your account'), basemap=basemap)

Remove basemap and show a custom color.

>>> Map(Layer('table in your account'), basemap='yellow')  # None, False, 'white', 'rgb(255, 255, 0)'

Set custom bounds.

>>> bounds = {
...     'west': -10,
...     'east': 10,
...     'north': 10,
...     'south': -10
... } # or bounds = [[-10, -10], [10, 10]]
>>> Map(Layer('table in your account'), bounds=bounds)

Show the map center and zoom value on the map (lower left-hand corner).

>>> Map(Layer('table in your account'), show_info=True)
publish(name, password, credentials=None, if_exists='fail', maps_api_key=None)

Publish the map visualization as a CARTO custom visualization.

Parameters
  • name (str) – The visualization name on CARTO.

  • password (str) – By setting it, your visualization will be protected by password. When someone tries to show the visualization, the password will be requested. To disable password you must set it to None.

  • credentials (Credentials, optional) – A Credentials instance. If not provided, the credentials will be automatically obtained from the default credentials if available. It is used to create the publication and also to save local data (if exists) into your CARTO account.

  • if_exists (str, optional) – ‘fail’ or ‘replace’. Behavior in case a publication with the same name already exists in your account. Default is ‘fail’.

  • maps_api_key (str, optional) – The Maps API key used for private datasets.

Example

Publishing the map visualization.

>>> tmap = Map(Layer('tablename'))
>>> tmap.publish('Custom Map Title', password=None)
update_publication(name, password, if_exists='fail')

Update the published map visualization.

Parameters
  • name (str) – The visualization name on CARTO.

  • password (str) – by setting it, your visualization will be protected by password; with None, the visualization will be public.

  • if_exists (str, optional) – ‘fail’ or ‘replace’. Behavior in case a publication with the same name already exists in your account. Default is ‘fail’.

Raises

PublishError – if the map has not been published yet.
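
Example

Updating a previously published map visualization, as a sketch that reuses the publish example above:

>>> tmap = Map(Layer('tablename'))
>>> tmap.publish('Custom Map Title', password=None)
>>> tmap.update_publication('Custom Map Title', password='new password')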

Layer

class cartoframes.viz.Layer(source, style=None, legends=None, widgets=None, popup_hover=None, popup_click=None, credentials=None, bounds=None, geom_col=None, default_legend=True, default_widget=False, default_popup_hover=True, default_popup_click=False, title=None, parent_map=None, encode_data=True)

Bases: object

Layer to display data on a map. This class can be used as one or more layers in Map or on its own in a Jupyter notebook to get a preview of a Layer.

Note: in a Jupyter notebook, it is not required to explicitly add a Layer to a Map if only visualizing data as a single layer.

Parameters
  • source (str, pandas.DataFrame, geopandas.GeoDataFrame) – The source data: table name, SQL query or a dataframe. If dataframe, the geometry’s CRS must be WGS 84 (EPSG:4326).

  • style (dict, or Style, optional) – The style of the visualization.

  • legends (bool, Legend list, optional) – The legends definition for a layer. It contains a list of legend helpers. See Legend for more information.

  • widgets (bool, list, or WidgetList, optional) – Widget or list of widgets for a layer. It contains the information to display different widget types on the top right of the map. See WidgetList for more information.

  • popup_click (list of popup_element, optional) – Set up a popup to be displayed on a click event.

  • popup_hover (bool, list of popup_element, optional) – Set up a popup to be displayed on a hover event. Style helpers include a default hover popup; set popup_hover=False to remove it.

  • credentials (Credentials, optional) – A Credentials instance. This is only used for the simplified Source API. When a Source is passed as source, these credentials are simply ignored. If not provided, the credentials will be automatically obtained from the default credentials.

  • bounds (dict or list, optional) – a dict with west, south, east, north keys, or an array of floats in the following structure: [[west, south], [east, north]]. If not provided the bounds will be automatically calculated to fit all features.

  • geom_col (str, optional) – string indicating the geometry column name in the source DataFrame.

  • default_legend (bool, optional) – flag to set the default legend. This only works when using a style helper. Default True.

  • default_widget (bool, optional) – flag to set the default widget. This only works when using a style helper. Default False.

  • default_popup_hover (bool, optional) – flag to set the default popup hover. This only works when using a style helper. Default True.

  • default_popup_click (bool, optional) – flag to set the default popup click. This only works when using a style helper. Default False.

  • title (str, optional) – title for the default legend, widget and popups.

  • encode_data (bool, optional) – By default, local data is encoded in order to save local space. However, when using very large files, it might not be possible to encode all the data. By disabling this parameter with encode_data=False the resulting notebook will be large, but there will be no encoding issues.

Raises

ValueError – if the source is not valid.

Examples

Create a layer with the defaults (style, legend).

>>> Layer('table_name')  # or Layer(gdf)

Create a layer with a custom style, legend, widget and popups.

>>> Layer(
...     'table_name',
...     style=color_bins_style('column_name'),
...     legends=color_bins_legend(title='Legend title'),
...     widgets=histogram_widget('column_name', title='Widget title'),
...     popup_click=popup_element('column_name', title='Popup title'),
...     popup_hover=popup_element('column_name', title='Popup title'))

Create a layer specifically tied to a Credentials.

>>> Layer(
...     'table_name',
...     credentials=Credentials.from_file('creds.json'))
property map_index

Layer map index

Source

class cartoframes.viz.Source(source, credentials=None, geom_col=None, encode_data=True)

Bases: object

Parameters
  • source (str, pandas.DataFrame, geopandas.GeoDataFrame) – a table name, SQL query, DataFrame, or GeoDataFrame instance.

  • credentials (Credentials, optional) – A Credentials instance. If not provided, the credentials will be automatically obtained from the default credentials if available.

  • geom_col (str, optional) – string indicating the geometry column name in the source DataFrame.

  • encode_data (bool, optional) – Indicates whether the data needs to be encoded. Default is True.

Example

Table name.

>>> Source('table_name')

SQL query.

>>> Source('SELECT * FROM table_name')

DataFrame object.

>>> Source(df, geom_col='my_geom')

GeoDataFrame object.

>>> Source(gdf)

Setting the credentials.

>>> Source('table_name', credentials)

Layout

class cartoframes.viz.Layout(maps, n_size=None, m_size=None, viewport=None, map_height=250, full_height=True, is_static=False, **kwargs)

Bases: object

Create a layout of visualizations in order to compare them.

Parameters
  • maps (list of Map) – List of maps. Zero or more of Map.

  • n_size (number, optional) – Number of columns of the layout.

  • m_size (number, optional) – Number of rows of the layout.

  • viewport (dict, optional) – Properties for display of the maps viewport. Keys can be bearing or pitch.

  • is_static (boolean, optional) – By default it is False and all the maps in the layout are interactive. To render them as static images for performance reasons, set is_static to True.

  • map_height (number, optional) – Height in pixels for each visualization. Default is 250.

  • full_height (boolean, optional) – When a layout visualization is published, it will fit the screen height. Otherwise, each visualization height will be map_height. Default True.

Raises

ValueError – if the input elements are not instances of Map.

Examples

Basic usage.

>>> Layout([
...    Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account')),
...    Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account'))
... ])

Display a 2x2 layout.

>>> Layout([
...     Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account')),
...     Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account'))
... ], 2, 2)

Custom Titles.

>>> Layout([
...     Map(Layer('table_in_your_account'), title="Visualization 1 custom title"),
...     Map(Layer('table_in_your_account'), title="Visualization 2 custom title")
... ])

Viewport.

>>> Layout([
...     Map(Layer('table_in_your_account')),
...     Map(Layer('table_in_your_account')),
...     Map(Layer('table_in_your_account')),
...     Map(Layer('table_in_your_account'))
... ], viewport={ 'zoom': 2 })
>>> Layout([
...     Map(Layer('table_in_your_account'), viewport={ 'zoom': 0.5 }),
...     Map(Layer('table_in_your_account')),
...     Map(Layer('table_in_your_account')),
...     Map(Layer('table_in_your_account'))
... ], viewport={ 'zoom': 2 })

Create a static layout.

>>> Layout([
...    Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account')),
...    Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account'))
... ], is_static=True)
publish(name, password, credentials=None, if_exists='fail', maps_api_key=None)

Publish the layout visualization as a CARTO custom visualization.

Parameters
  • name (str) – The visualization name on CARTO.

  • password (str) – By setting it, your visualization will be protected by password. When someone tries to show the visualization, the password will be requested. To disable password you must set it to None.

  • credentials (Credentials, optional) – A Credentials instance. If not provided, the credentials will be automatically obtained from the default credentials if available. It is used to create the publication and also to save local data (if exists) into your CARTO account.

  • if_exists (str, optional) – ‘fail’ or ‘replace’. Behavior in case a publication with the same name already exists in your account. Default is ‘fail’.

  • maps_api_key (str, optional) – The Maps API key used for private datasets.

Example

Publishing the layout visualization.

>>> tlayout = Layout([
...    Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account')),
...    Map(Layer('table_in_your_account')), Map(Layer('table_in_your_account'))
... ])
>>> tlayout.publish('Custom Map Title', password=None)
update_publication(name, password, if_exists='fail')

Update the published layout visualization.

Parameters
  • name (str) – The visualization name on CARTO.

  • password (str) – by setting it, your visualization will be protected by password; with None, the visualization will be public.

  • if_exists (str, optional) – ‘fail’ or ‘replace’. Behavior in case a publication with the same name already exists in your account. Default is ‘fail’.

Raises

PublishError – if the map has not been published yet.

Styles

cartoframes.viz.animation_style(value, duration=20, fade_in=1, fade_out=1, color=None, size=None, opacity=None, stroke_color=None, stroke_width=None)

Helper function for quickly creating an animated style.

Parameters
  • value (str) – Column to symbolize by.

  • duration (float, optional) – Time of the animation in seconds. Default is 20s.

  • fade_in (float, optional) – Time of fade in transitions in seconds. Default is 1s.

  • fade_out (float, optional) – Time of fade out transitions in seconds. Default is 1s.

  • color (str, optional) – Hex, rgb or named color value. Default is ‘#EE5D5A’ for points, ‘#4CC8A3’ for lines and ‘#826DBA’ for polygons.

  • size (int, optional) – Size of point or line features.

  • opacity (float, optional) – Opacity value. Default is 1 for points and lines and 0.9 for polygons.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

Returns

cartoframes.viz.style.Style
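
Example

A sketch of using the style in a Layer; 'timestamp_column' is a placeholder for a date/time or other numeric column in your table:

>>> Layer('table_name', style=animation_style('timestamp_column', duration=30))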

cartoframes.viz.basic_style(color=None, size=None, opacity=None, stroke_color=None, stroke_width=None)

Helper function for quickly creating a basic style.

Parameters
  • color (str, optional) – hex, rgb or named color value. Default is ‘#FFB927’ for point geometries and ‘#4CC8A3’ for lines.

  • size (int, optional) – Size of point or line features.

  • opacity (float, optional) – Opacity value. Default is 1 for points and lines and 0.9 for polygons.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

Returns

cartoframes.viz.style.Style
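
Example

A sketch of applying the style to a Layer with a custom color and size; 'table_name' is a placeholder:

>>> Layer('table_name', style=basic_style(color='#4CC8A3', size=6))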

cartoframes.viz.cluster_size_style(value, operation='count', resolution=32, color=None, opacity=None, stroke_color=None, stroke_width=None, animate=None)

Helper function for quickly creating a cluster map with continuously sized points. Cluster operations are performed in the back-end, so this helper can be used only with CARTO tables or SQL queries. It cannot be used with GeoDataFrames.

Parameters
  • value (str) – Numeric column to aggregate.

  • operation (str, optional) – Cluster operation, defaults to ‘count’. Other options available are ‘avg’, ‘min’, ‘max’, and ‘sum’.

  • resolution (int, optional) – Resolution of aggregation grid cell. Set to 32 by default.

  • color (str, optional) – Hex, rgb or named color value. Default is ‘#FFB927’ for point geometries.

  • opacity (float, optional) – Opacity value. Default is 0.8.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style

cartoframes.viz.color_bins_style(value, method='quantiles', bins=5, breaks=None, palette=None, size=None, opacity=None, stroke_color=None, stroke_width=None, animate=None)

Helper function for quickly creating a color bins style.

Parameters
  • value (str) – Column to symbolize by.

  • method (str, optional) – Classification method of data: “quantiles”, “equal”, “stdev”. Default is “quantiles”.

  • bins (int, optional) – Number of size classes (bins) for map. Default is 5.

  • breaks (list<int>, optional) – Assign manual class break values.

  • palette (str, optional) – Palette that can be a named cartocolor palette or other valid color palette. Use help(cartoframes.viz.palettes) to get more information. Default is “purpor”.

  • size (int, optional) – Size of point or line features.

  • opacity (float, optional) – Opacity value. Default is 1 for points and lines and 0.9 for polygons.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style
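
Example

A sketch of symbolizing a numeric column in 7 bins; 'table_name' and 'column_name' are placeholders:

>>> Layer('table_name', style=color_bins_style('column_name', method='quantiles', bins=7))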

cartoframes.viz.color_category_style(value, top=11, cat=None, palette=None, size=None, opacity=None, stroke_color=None, stroke_width=None, animate=None)

Helper function for quickly creating a color category style.

Parameters
  • value (str) – Column to symbolize by.

  • top (int, optional) – Number of categories. Default is 11. Values can range from 1 to 16.

  • cat (list<str>, optional) – Category list. Must be a valid list of categories.

  • palette (str, optional) – Palette that can be a named cartocolor palette or other valid color palette. Use help(cartoframes.viz.palettes) to get more information. Default is “bold”.

  • size (int, optional) – Size of point or line features.

  • opacity (float, optional) – Opacity value. Default is 1 for points and lines and 0.9 for polygons.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style

cartoframes.viz.color_continuous_style(value, size=None, range_min=None, range_max=None, palette=None, opacity=None, stroke_color=None, stroke_width=None, animate=None)

Helper function for quickly creating a color continuous style.

Parameters
  • value (str) – Column to symbolize by.

  • range_min (int, optional) – The minimum value of the data range for the continuous color ramp. Defaults to the global MIN of the dataset.

  • range_max (int, optional) – The maximum value of the data range for the continuous color ramp. Defaults to the global MAX of the dataset.

  • palette (str, optional) – Palette that can be a named cartocolor palette or other valid color palette. Use help(cartoframes.viz.palettes) to get more information. Default is “bluyl”.

  • opacity (float, optional) – Opacity value. Default is 1 for points and lines and 0.9 for polygons.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style

cartoframes.viz.isolines_style(value='range_label', top=11, cat=None, palette='pinkyl', size=None, opacity=0.8, stroke_color='rgba(150,150,150,0.4)', stroke_width=None)

Helper function for quickly creating an isolines style. Based on the color category style.

Parameters
  • value (str, optional) – Column to symbolize by. Default is “range_label”.

  • top (int, optional) – Number of categories. Default is 11. Values can range from 1 to 16.

  • cat (list<str>, optional) – Category list. Must be a valid list of categories.

  • palette (str, optional) – Palette that can be a named cartocolor palette or other valid color palette. Use help(cartoframes.viz.palettes) to get more information. Default is “pinkyl”.

  • size (int, optional) – Size of point or line features.

  • opacity (float, optional) – Opacity value for point color and line features. Default is 0.8.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘rgba(150,150,150,0.4)’.

  • stroke_width (int, optional) – Size of the stroke on point features.

Returns

cartoframes.viz.style.Style

cartoframes.viz.size_bins_style(value, method='quantiles', bins=5, breaks=None, size_range=None, color=None, opacity=None, stroke_width=None, stroke_color=None, animate=None)

Helper function for quickly creating a size bins style with classification method/buckets.

Parameters
  • value (str) – Column to symbolize by.

  • method (str, optional) – Classification method of data: “quantiles”, “equal”, “stdev”. Default is “quantiles”.

  • bins (int, optional) – Number of size classes (bins) for map. Default is 5.

  • breaks (list<int>, optional) – Assign manual class break values.

  • size_range (list<int>, optional) – Min/max size array. Default is [2, 14] for point geometries and [1, 10] for lines.

  • color (str, optional) – Hex, rgb or named color value. Default is ‘#EE5D5A’ for point geometries and ‘#4CC8A3’ for lines.

  • opacity (float, optional) – Opacity value for point color and line features. Default is 0.8.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style

cartoframes.viz.size_category_style(value, top=5, cat=None, size_range=None, color=None, opacity=None, stroke_color=None, stroke_width=None, animate=None)

Helper function for quickly creating a size category style.

Parameters
  • value (str) – Column to symbolize by.

  • top (int, optional) – Number of size categories. Default is 5. Values can range from 1 to 16.

  • cat (list<str>, optional) – Category list. Must be a valid list of categories.

  • size_range (list<int>, optional) – Min/max size array. Default is [2, 20] for point geometries and [1, 10] for lines.

  • color (str, optional) – hex, rgb or named color value. Default is ‘#F46D43’ for point geometries and ‘#4CC8A3’ for lines.

  • opacity (float, optional) – Opacity value for point color and line features. Default is 0.8.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style

cartoframes.viz.size_continuous_style(value, size_range=None, range_min=None, range_max=None, color=None, opacity=None, stroke_color=None, stroke_width=None, animate=None)

Helper function for quickly creating a size continuous style.

Parameters
  • value (str) – Column to symbolize by.

  • size_range (list<int>, optional) – Min/max size array. Default is [2, 40] for point geometries and [1, 10] for lines.

  • range_min (int, optional) – The minimum value of the data range for the continuous size ramp. Defaults to the global MIN of the dataset.

  • range_max (int, optional) – The maximum value of the data range for the continuous size ramp. Defaults to the global MAX of the dataset.

  • color (str, optional) – hex, rgb or named color value. Default is ‘#FFB927’ for point geometries and ‘#4CC8A3’ for lines.

  • opacity (float, optional) – Opacity value for point color and line features. Default is 0.8.

  • stroke_color (str, optional) – Color of the stroke on point features. Default is ‘#222’.

  • stroke_width (int, optional) – Size of the stroke on point features.

  • animate (str, optional) – Animate features by date/time or other numeric field.

Returns

cartoframes.viz.style.Style
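
Example

A sketch of sizing point features by a numeric column; 'table_name' and 'column_name' are placeholders:

>>> Layer('table_name', style=size_continuous_style('column_name', size_range=[2, 40]))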

Legends

cartoframes.viz.basic_legend(title=None, description=None, footer=None)

Helper function for quickly creating a basic legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

Returns

cartoframes.viz.legend.Legend

Example

>>> basic_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer')
cartoframes.viz.color_bins_legend(title=None, description=None, footer=None, prop='color', variable=None, dynamic=True, ascending=False, format=None)

Helper function for quickly creating a color bins legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • prop (str, optional) – Allowed properties are ‘color’ and ‘stroke_color’. It is ‘color’ by default.

  • variable (str, optional) – If the information in the legend depends on a different value than the information set to the style property, it is possible to set an independent variable.

  • dynamic (boolean, optional) – Update and render the legend depending on viewport changes. Defaults to True.

  • ascending (boolean, optional) – If set to True the values are sorted in ascending order. Defaults to False.

  • format (str, optional) – Format to apply to number values in the widget, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Returns

cartoframes.viz.legend.Legend

Example

>>> color_bins_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     dynamic=False,
...     format='.2~s')
cartoframes.viz.color_category_legend(title=None, description=None, footer=None, prop='color', variable=None, dynamic=True)

Helper function for quickly creating a color category legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • prop (str, optional) – Allowed properties are ‘color’ and ‘stroke_color’. It is ‘color’ by default.

  • variable (str, optional) – If the information in the legend depends on a different value than the information set to the style property, it is possible to set an independent variable.

  • dynamic (boolean, optional) – Update and render the legend depending on viewport changes. Defaults to True.

Returns

cartoframes.viz.legend.Legend

Example

>>> color_category_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     dynamic=False)
cartoframes.viz.color_continuous_legend(title=None, description=None, footer=None, prop='color', variable=None, dynamic=True, ascending=False, format=None)

Helper function for quickly creating a color continuous legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • prop (str, optional) – Allowed properties are ‘color’ and ‘stroke_color’. It is ‘color’ by default.

  • variable (str, optional) – If the information in the legend depends on a different value than the information set to the style property, it is possible to set an independent variable.

  • dynamic (boolean, optional) – Update and render the legend depending on viewport changes. Defaults to True.

  • ascending (boolean, optional) – If set to True the values are sorted in ascending order. Defaults to False.

  • format (str, optional) – Format to apply to number values in the legend, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Returns

cartoframes.viz.legend.Legend

Example

>>> color_continuous_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     dynamic=False,
...     format='.2~s')
cartoframes.viz.default_legend(title=None, description=None, footer=None, format=None, **kwargs)

Helper function for quickly creating a default legend based on the style. A style helper is required.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • format (str, optional) – Format to apply to number values in the legend, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Returns

cartoframes.viz.legend.Legend

Example

>>> default_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     format='.2~s')
cartoframes.viz.size_bins_legend(title=None, description=None, footer=None, prop='size', variable=None, dynamic=True, ascending=False, format=None)

Helper function for quickly creating a size bins legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • prop (str, optional) – Allowed properties are ‘size’ and ‘stroke_width’. It is ‘size’ by default.

  • variable (str, optional) – If the information in the legend depends on a different value than the information set to the style property, it is possible to set an independent variable.

  • dynamic (boolean, optional) – Update and render the legend depending on viewport changes. Defaults to True.

  • ascending (boolean, optional) – If set to True the values are sorted in ascending order. Defaults to False.

  • format (str, optional) – Format to apply to number values in the legend, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Returns

cartoframes.viz.legend.Legend

Example

>>> size_bins_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     dynamic=False,
...     format='.2~s')
cartoframes.viz.size_category_legend(title=None, description=None, footer=None, prop='size', variable=None, dynamic=True)

Helper function for quickly creating a size category legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • prop (str, optional) – Allowed properties are ‘size’ and ‘stroke_width’. It is ‘size’ by default.

  • variable (str, optional) – If the information in the legend depends on a different value than the information set to the style property, it is possible to set an independent variable.

  • dynamic (boolean, optional) – Update and render the legend depending on viewport changes. Defaults to True.

Returns

cartoframes.viz.legend.Legend

Example

>>> size_category_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     dynamic=False)
cartoframes.viz.size_continuous_legend(title=None, description=None, footer=None, prop='size', variable='size_value', dynamic=True, ascending=False, format=None)

Helper function for quickly creating a size continuous legend.

Parameters
  • title (str, optional) – Title of legend.

  • description (str, optional) – Description in legend.

  • footer (str, optional) – Footer of legend. This is often used to attribute data sources.

  • prop (str, optional) – Allowed properties are ‘size’ and ‘stroke_width’. It is ‘size’ by default.

  • variable (str, optional) – If the information in the legend depends on a different value than the information set to the style property, it is possible to set an independent variable.

  • dynamic (boolean, optional) – Update and render the legend depending on viewport changes. Defaults to True.

  • ascending (boolean, optional) – If set to True the values are sorted in ascending order. Defaults to False.

  • format (str, optional) – Format to apply to number values in the legend, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Returns

cartoframes.viz.legend.Legend

Example

>>> size_continuous_legend(
...     title='Legend title',
...     description='Legend description',
...     footer='Legend footer',
...     dynamic=False,
...     format='.2~s')

Widgets

cartoframes.viz.animation_widget(title=None, description=None, footer=None, prop='filter')

Helper function for quickly creating an animated widget.

The animation widget includes an animation status bar as well as controls to play or pause animated data. The filter property of your map’s style, applied to either a date or numeric field, drives both the animation and the widget. Only one animation can be controlled per layer.

Parameters
  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

  • prop (str, optional) – Property of the style to get the animation. Default “filter”.

Returns

cartoframes.viz.widget.Widget

Example

>>> animation_widget(
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer')
cartoframes.viz.basic_widget(title=None, description=None, footer=None)

Helper function for quickly creating a default widget.

The default widget is a general purpose widget that can be used to provide additional information about your map.

Parameters
  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

Returns

cartoframes.viz.widget.Widget

Example

>>> basic_widget(
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer')
cartoframes.viz.category_widget(value, title=None, description=None, footer=None, read_only=False, weight=1)

Helper function for quickly creating a category widget.

Parameters
  • value (str) – Column name of the category value.

  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

  • read_only (boolean, optional) – When False (the default), selecting a category in the widget interactively filters the map; set to True to disable this filtering.

  • weight (int, optional) – Weight of the category widget. Default value is 1.

Returns

cartoframes.viz.widget.Widget

Example

>>> category_widget(
...     'column_name',
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer')
cartoframes.viz.default_widget(title=None, description=None, footer=None, **kwargs)

Helper function for quickly creating a default widget based on the style. A style helper is required.

Parameters
  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

Returns

cartoframes.viz.widget.Widget

Example

>>> default_widget(
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer')
cartoframes.viz.formula_widget(value, operation=None, title=None, description=None, footer=None, is_global=False, format=None)

Helper function for quickly creating a formula widget.

Formula widgets calculate aggregated values (‘avg’, ‘max’, ‘min’, ‘sum’) from numeric columns or counts of features (‘count’) in a dataset.

A formula widget’s aggregations can be calculated over the entire dataset (‘global’) or over the features in the current viewport (‘viewport’). If you want the values in a formula widget to update on zoom and/or pan, use viewport-based aggregations.

Parameters
  • value (str) – Column name of the numeric value.

  • operation (str) – Aggregation used to compute the widget’s value (‘count’, ‘avg’, ‘max’, ‘min’, ‘sum’).

  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

  • is_global (boolean, optional) – If True, calculations are based on the entire dataset (‘global’); by default they are based on the features in the current viewport.

  • format (str, optional) – Format to apply to number values in the widget, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Returns

cartoframes.viz.widget.Widget

Example

>>> formula_widget(
...     'column_name',
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer')
>>> formula_widget(
...     'column_name',
...     operation='sum',
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer',
...     format='.2~s')
cartoframes.viz.histogram_widget(value, title=None, description=None, footer=None, read_only=False, buckets=20, weight=1)

Helper function for quickly creating a histogram widget.

Histogram widgets display the distribution of a numeric attribute, in buckets, to group ranges of values in your data.

By default, you can hover over each bar to see each bucket’s values and count, and also filter your map’s data within a given range.

Parameters
  • value (str) – Column name of the numeric or date value.

  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

  • read_only (boolean, optional) – When False (the default), selecting a range of values in the widget interactively filters the map; set to True to disable this filtering.

  • buckets (number, optional) – Number of histogram buckets. Set to 20 by default.

  • weight (int, optional) – Weight of the widget. Default value is 1.

Returns

cartoframes.viz.widget.Widget

Example

>>> histogram_widget(
...     'column_name',
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer',
...     buckets=9)
cartoframes.viz.time_series_widget(value, title=None, description=None, footer=None, read_only=False, buckets=20, prop='filter', weight=1)

Helper function for quickly creating a time series widget.

The time series widget enables you to display animated data (by aggregation) over a specified date or numeric field. Time series widgets provide a status bar of the animation, controls to play or pause, and the ability to filter on a range of values.

Parameters
  • value (str) – Column name of the numeric or date value.

  • title (str, optional) – Title of widget.

  • description (str, optional) – Description text widget placed under widget title.

  • footer (str, optional) – Footer text placed on the widget bottom.

  • read_only (boolean, optional) – When False (the default), selecting a range of values in the widget interactively filters the map; set to True to disable this filtering.

  • buckets (number, optional) – Number of histogram buckets. Set to 20 by default.

  • weight (int, optional) – Weight of the widget. Default value is 1.

Returns

cartoframes.viz.widget.Widget

Example

>>> time_series_widget(
...     'column_name',
...     title='Widget title',
...     description='Widget description',
...     footer='Widget footer',
...     buckets=10)

Popups

cartoframes.viz.default_popup_element(title=None, operation=None, format=None)

Helper function for quickly adding a default popup element based on the style. A style helper is required.

Parameters
  • title (str, optional) – Title for the given value. By default, it’s the name of the value.

  • operation (str, optional) – Cluster operation, defaults to ‘count’. Other options available are ‘avg’, ‘min’, ‘max’, and ‘sum’.

  • format (str, optional) – Format to apply to number values in the popup, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Example

>>> default_popup_element(title='Popup title', format='.2~s')
cartoframes.viz.popup_element(value, title=None, format=None)

Helper function for quickly adding a popup element to a layer.

Parameters
  • value (str) – Column name to display the value for each feature.

  • title (str, optional) – Title for the given value. By default, it’s the name of the value.

  • format (str, optional) – Format to apply to number values in the popup, based on d3-format specifier (https://github.com/d3/d3-format#locale_format).

Example

>>> popup_element('column_name', title='Popup title', format='.2~s')

Publications

cartoframes.viz.all_publications(credentials=None)

Get all map visualizations published by the current user.

Parameters

credentials (Credentials, optional) – A Credentials instance. If not provided, the credentials will be automatically obtained from the default credentials if available.
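
Example

A minimal usage sketch, assuming default credentials have already been set (for example with set_default_credentials):

>>> all_publications()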

cartoframes.viz.delete_publication(name, credentials=None)

Delete a map visualization published by name.

Parameters
  • name (str) – Name of the publication to be deleted.

  • credentials (Credentials, optional) – A Credentials instance. If not provided, the credentials will be automatically obtained from the default credentials if available.
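
Example

A minimal sketch, also assuming default credentials; ‘Publication name’ is an illustrative name:

>>> delete_publication('Publication name')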

Utils

cartoframes.utils.setup_metrics(enabled)

Update the metrics configuration.

Parameters

enabled (bool) – Flag to enable/disable metrics.
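
Example

A minimal sketch that disables metrics:

>>> from cartoframes.utils import setup_metrics
>>> setup_metrics(False)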

cartoframes.utils.set_log_level(level)

Set the log level used by the library.

Parameters
  • level (str) – log level name. By default it’s set to “info”. Valid log levels are:

  • "critical"

  • "error"

  • "warning"

  • "info"

  • "debug"

  • "notset".

cartoframes.utils.decode_geometry(geom_col)

Decodes a DataFrame column. It detects the geometry encoding and decodes the column if required. Supported geometry encodings are:

  • WKB (Bytes, Hexadecimal String, Hexadecimal Bytestring)

  • Extended WKB (Bytes, Hexadecimal String, Hexadecimal Bytestring)

  • WKT (String)

  • Extended WKT (String)

Parameters

geom_col (array) – Column containing the encoded geometry.

Example

>>> decode_geometry(df['the_geom'])

Exceptions

exception cartoframes.exceptions.DOError(message)

Bases: Exception

This exception is raised when a problem is encountered while using DO functions.

exception cartoframes.exceptions.CatalogError(message)

Bases: cartoframes.exceptions.DOError

This exception is raised when a problem is encountered while using catalog functions.

exception cartoframes.exceptions.EnrichmentError(message)

Bases: cartoframes.exceptions.DOError

This exception is raised when a problem is encountered while using enrichment functions.

exception cartoframes.exceptions.PublishError(message)

Bases: Exception

This exception is raised when a problem is encountered while publishing visualizations.
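
Example

A sketch of handling these exceptions in user code; whether this particular call raises PublishError depends on the situation, so the call inside the try block is purely illustrative:

>>> from cartoframes.exceptions import PublishError
>>> from cartoframes.viz import delete_publication
>>> try:
...     delete_publication('Publication name')
... except PublishError as error:
...     print(str(error))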