CARTOframes

A Python package for integrating CARTO maps, analysis, and data services into data science workflows.

CARTOframes v1.0.4 includes breaking changes from the beta releases and from version 0.10. Check the migration guide to learn how to update your code.

What is CARTOframes?

CARTOframes is a Python package for integrating CARTO maps, analysis, and data services into data science workflows.

Python data analysis workflows often rely on pandas and Jupyter notebooks, the de facto standards. Integrating CARTO into this workflow saves data scientists time and energy because they do not have to export datasets as files or keep multiple copies of the data. To understand the fundamentals of CARTOframes, read the guides. To view the source code or contribute, browse the open-source repository on GitHub. You can also read the full API reference or explore the support options.

Guides

Quick reference guides for learning how to use CARTOframes features.

Reference

Browse the interactive API documentation to search for specific CARTOframes methods, arguments, and sample code that can be used to build your applications.

Check Full Reference API
from cartoframes.auth import set_default_credentials
from cartoframes.data.observatory import Enrichment

set_default_credentials('creds.json')
enrichment = Enrichment()

# `df` is assumed to be an existing DataFrame or GeoDataFrame with point geometries
enriched_df = enrichment.enrich_points(
    df,
    variables=['total_pop_3cf008b3']
)

Examples

Play with real examples and learn by doing.

Load data from a CSV file

These examples illustrate how to load data from a CSV file using the pandas and GeoPandas libraries.

From latitude and longitude columns

In [1]:
from pandas import read_csv
from geopandas import GeoDataFrame, points_from_xy

remote_file_path = 'http://data.sfgov.org/resource/wg3w-h783.csv'

df = read_csv(remote_file_path)

# Drop rows where the `longitude` column is null
df = df[df['longitude'].notna()]

gdf = GeoDataFrame(df, geometry=points_from_xy(df['longitude'], df['latitude']))
gdf.head()
Out[1]:
incident_datetime incident_date incident_time incident_year incident_day_of_week report_datetime row_id incident_id incident_number cad_number ... :@computed_region_qgnn_b9vv :@computed_region_26cr_cadq :@computed_region_ajp5_b2md :@computed_region_nqbw_i6c3 :@computed_region_2dwj_jsy4 :@computed_region_h4ep_8xdi :@computed_region_y6ts_4iup :@computed_region_jg9y_a9du :@computed_region_6pnf_4xz7 geometry
0 2019-05-01T01:00:00.000 2019-05-01T00:00:00.000 01:00 2019 Wednesday 2019-06-12T20:27:00.000 81097515200 810975 190424067 191634131.0 ... 10.0 7.0 35.0 NaN NaN NaN NaN NaN 1.0 POINT (-122.49963 37.76257)
1 2019-06-22T07:45:00.000 2019-06-22T00:00:00.000 07:45 2019 Saturday 2019-06-22T08:05:00.000 81465564020 814655 190450880 191730737.0 ... 1.0 10.0 34.0 1.0 NaN 1.0 NaN NaN 2.0 POINT (-122.40816 37.78054)
2 2019-06-03T16:16:00.000 2019-06-03T00:00:00.000 16:16 2019 Monday 2019-06-03T16:16:00.000 80769875000 807698 190397016 191533509.0 ... 2.0 9.0 1.0 NaN NaN NaN NaN NaN 2.0 POINT (-122.39075 37.72160)
3 2018-11-16T16:34:00.000 2018-11-16T00:00:00.000 16:34 2018 Friday 2018-11-16T16:34:00.000 73857915041 738579 180870806 183202539.0 ... 6.0 3.0 6.0 NaN 18.0 NaN NaN NaN 2.0 POINT (-122.40488 37.79486)
4 2019-05-27T02:25:00.000 2019-05-27T00:00:00.000 02:25 2019 Monday 2019-05-27T02:55:00.000 80509204134 805092 190378555 191470256.0 ... 4.0 6.0 13.0 NaN NaN NaN NaN NaN 1.0 POINT (-122.43056 37.79772)

5 rows × 37 columns
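
The cell above depends on two details that are easy to miss: rows without coordinates are dropped before any geometry is built, and `points_from_xy` takes x (longitude) first and y (latitude) second. The following is a minimal standard-library sketch of that same logic. The inline sample and the `rows_to_points` helper are hypothetical and are not part of CARTOframes, pandas, or the remote dataset; the row with id `999999` is invented to stand in for a record with missing coordinates.

```python
import csv
import io

# Hypothetical inline sample standing in for the remote CSV. The second row
# has no coordinates, like the null rows filtered out in the cell above.
SAMPLE_CSV = """incident_id,latitude,longitude
810975,37.76257,-122.49963
999999,,
807698,37.72160,-122.39075
"""


def rows_to_points(csv_text):
    """Return (incident_id, x, y) tuples, skipping rows without a longitude."""
    points = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row['longitude']:
            continue  # same effect as df[df['longitude'].notna()]
        # x is the longitude and y is the latitude, as in points_from_xy(x, y)
        x = float(row['longitude'])
        y = float(row['latitude'])
        points.append((row['incident_id'], x, y))
    return points


points = rows_to_points(SAMPLE_CSV)
print(points)
```

Swapping the two arguments is a common mistake and usually shows up as points plotted far from where they belong.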

In [2]:
from cartoframes.viz import Layer

Layer(gdf)
Out[2]:

From a WKT column

In [3]:
from cartoframes.utils import decode_geometry

remote_file_path = 'http://libs.cartocdn.com/cartoframes/files/starbucks_brooklyn_geocoded.csv'

df = read_csv(remote_file_path)

gdf = GeoDataFrame(df, geometry=decode_geometry(df['the_geom']))
gdf.head()
Out[3]:
the_geom cartodb_id field_1 name address revenue id_store geometry
0 0101000020E61000005EA27A6B607D52C01956F146E655... 1 0 Franklin Ave & Eastern Pkwy 341 Eastern Pkwy,Brooklyn, NY 11238 1321040.772 A POINT (-73.95901 40.67109)
1 0101000020E6100000B610E4A0847D52C0B532E197FA49... 2 1 607 Brighton Beach Ave 607 Brighton Beach Avenue,Brooklyn, NY 11235 1268080.418 B POINT (-73.96122 40.57796)
2 0101000020E6100000E5B8533A587F52C05726FC523F4F... 3 2 65th St & 18th Ave 6423 18th Avenue,Brooklyn, NY 11204 1248133.699 C POINT (-73.98976 40.61912)
3 0101000020E61000008BA6B393C18152C08D62B9A5D550... 4 3 Bay Ridge Pkwy & 3rd Ave 7419 3rd Avenue,Brooklyn, NY 11209 1185702.676 D POINT (-74.02744 40.63152)
4 0101000020E6100000CEFC6A0E108052C080D4264EEE4B... 5 4 Caesar's Bay Shopping Center 8973 Bay Parkway,Brooklyn, NY 11214 1148427.411 E POINT (-74.00098 40.59321)
In [4]:
Layer(gdf)
Out[4]:
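
The `the_geom` values in the Out[3] table start with `0101000020E6100000`, which looks like the prefix of a hex-encoded EWKB point with SRID 4326 (the binary format PostGIS uses) rather than WKT text. As the `geometry` column in that output shows, `decode_geometry` handled these values. To illustrate what such a string contains, here is a minimal standard-library sketch. The `decode_ewkb_point` helper is hypothetical, is not the CARTOframes implementation, and only handles point geometries. The sample value is built in the code from the rounded coordinates of the first row, because the strings in the output above are truncated.

```python
import struct

EWKB_SRID_FLAG = 0x20000000  # flag bit that marks an embedded SRID
WKB_POINT = 1                # WKB geometry type code for a point


def decode_ewkb_point(hex_str):
    """Decode a hex-encoded (E)WKB point into a (srid, x, y) tuple."""
    raw = bytes.fromhex(hex_str)
    order = '<' if raw[0] == 1 else '>'  # 1 = little endian, 0 = big endian
    (geom_type,) = struct.unpack_from(order + 'I', raw, 1)
    offset = 5
    srid = None
    if geom_type & EWKB_SRID_FLAG:
        (srid,) = struct.unpack_from(order + 'I', raw, offset)
        offset += 4
    if (geom_type & 0xFF) != WKB_POINT:
        raise ValueError('only point geometries are handled in this sketch')
    x, y = struct.unpack_from(order + 'dd', raw, offset)
    return srid, x, y


# Build a sample value: byte order, type with SRID flag, SRID, then x and y
sample = struct.pack(
    '<BIIdd', 1, EWKB_SRID_FLAG | WKB_POINT, 4326, -73.95901, 40.67109
).hex().upper()

print(sample[:18])
print(decode_ewkb_point(sample))
```

In practice a geometry library is the safer choice, since it also covers lines, polygons, and extra dimensions.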

Load data from a JSON file

This example illustrates how to load data from a remote JSON file using pandas and the process of preparing the data for spatial operations.

In [1]:
import requests

# Download the JSON file
remote_file_path = 'http://opendata.paris.fr/api/records/1.0/search/?dataset=arbresremarquablesparis&rows=200'
data_json = requests.get(remote_file_path).json()['records']
data_json[0].keys()
Out[1]:
dict_keys(['datasetid', 'recordid', 'fields', 'geometry', 'record_timestamp'])
In [2]:
from pandas import json_normalize

# Normalize the data
df = json_normalize(data_json)
df.head()
Out[2]:
datasetid recordid record_timestamp fields.geom_x_y fields.libellefrancais fields.objectid fields.idemplacement fields.arrondissement fields.circonferenceencm fields.hauteurenm ... fields.stadedeveloppement fields.remarquable fields.idbase fields.genre fields.typeemplacement fields.dateplantation geometry.type geometry.coordinates fields.varieteoucultivar fields.complementadresse
0 arbresremarquablesparis 29a0ea77b5379a1e15b5c34d0ed8bbadb9f5789c 2020-05-08T10:45:25.056000+00:00 [48.8542732882, 2.33573525468] Paulownia 7493 000101001 PARIS 6E ARRDT 295.0 20.0 ... M 1 216766.0 Paulownia Arbre 1999-01-25T01:00:00+00:00 Point [2.33573525468, 48.8542732882] NaN NaN
1 arbresremarquablesparis 24ab994e9b2b51c4a27394ec52f038c4faa5fc1c 2020-05-08T10:45:25.056000+00:00 [48.8217882346, 2.3228497157] Hêtre 57013 00000174 PARIS 14E ARRDT 310.0 15.0 ... M 1 121632.0 Fagus Arbre 1700-01-01T00:09:21+00:00 Point [2.3228497157, 48.8217882346] ''Atropunicea'' 14-09
2 arbresremarquablesparis 2138bdaedd6c46f681ff30904f8017a292fc2b3b 2020-05-08T10:45:25.056000+00:00 [48.8648326277, 2.25217460792] Plaqueminier 58739 000501002 BOIS DE BOULOGNE 146.0 15.0 ... A 1 2002388.0 Diospyros Arbre 1897-01-01T00:09:21+00:00 Point [2.25217460792, 48.8648326277] NaN 16-13
3 arbresremarquablesparis 63bc7232fe659c4585923d9e65e5275418c359b9 2020-05-08T10:45:25.056000+00:00 [48.8564741998, 2.39461847904] Platane 70026 001802036 PARIS 20E ARRDT 407.0 23.0 ... M 1 223748.0 Platanus Arbre 1700-01-01T00:09:21+00:00 Point [2.39461847904, 48.8564741998] NaN 148
4 arbresremarquablesparis 602c7b2fd878c56e5efc532739abb28d86d9f365 2020-05-08T10:45:25.056000+00:00 [48.831216573, 2.41167739693] Cryptomeria 85651 12-13 BOIS DE VINCENNES 122.0 13.0 ... M 1 2002359.0 Cryptomeria Arbre 1893-01-01T00:09:21+00:00 Point [2.41167739693, 48.831216573] NaN 12-13

5 rows × 24 columns

In [3]:
# Add Latitude and Longitude columns
df['lng'] = df.apply(lambda row: row['geometry.coordinates'][0], axis=1)
df['lat'] = df.apply(lambda row: row['geometry.coordinates'][1], axis=1)
df.head()
Out[3]:
datasetid recordid record_timestamp fields.geom_x_y fields.libellefrancais fields.objectid fields.idemplacement fields.arrondissement fields.circonferenceencm fields.hauteurenm ... fields.idbase fields.genre fields.typeemplacement fields.dateplantation geometry.type geometry.coordinates fields.varieteoucultivar fields.complementadresse lng lat
0 arbresremarquablesparis 29a0ea77b5379a1e15b5c34d0ed8bbadb9f5789c 2020-05-08T10:45:25.056000+00:00 [48.8542732882, 2.33573525468] Paulownia 7493 000101001 PARIS 6E ARRDT 295.0 20.0 ... 216766.0 Paulownia Arbre 1999-01-25T01:00:00+00:00 Point [2.33573525468, 48.8542732882] NaN NaN 2.335735 48.854273
1 arbresremarquablesparis 24ab994e9b2b51c4a27394ec52f038c4faa5fc1c 2020-05-08T10:45:25.056000+00:00 [48.8217882346, 2.3228497157] Hêtre 57013 00000174 PARIS 14E ARRDT 310.0 15.0 ... 121632.0 Fagus Arbre 1700-01-01T00:09:21+00:00 Point [2.3228497157, 48.8217882346] ''Atropunicea'' 14-09 2.322850 48.821788
2 arbresremarquablesparis 2138bdaedd6c46f681ff30904f8017a292fc2b3b 2020-05-08T10:45:25.056000+00:00 [48.8648326277, 2.25217460792] Plaqueminier 58739 000501002 BOIS DE BOULOGNE 146.0 15.0 ... 2002388.0 Diospyros Arbre 1897-01-01T00:09:21+00:00 Point [2.25217460792, 48.8648326277] NaN 16-13 2.252175 48.864833
3 arbresremarquablesparis 63bc7232fe659c4585923d9e65e5275418c359b9 2020-05-08T10:45:25.056000+00:00 [48.8564741998, 2.39461847904] Platane 70026 001802036 PARIS 20E ARRDT 407.0 23.0 ... 223748.0 Platanus Arbre 1700-01-01T00:09:21+00:00 Point [2.39461847904, 48.8564741998] NaN 148 2.394618 48.856474
4 arbresremarquablesparis 602c7b2fd878c56e5efc532739abb28d86d9f365 2020-05-08T10:45:25.056000+00:00 [48.831216573, 2.41167739693] Cryptomeria 85651 12-13 BOIS DE VINCENNES 122.0 13.0 ... 2002359.0 Cryptomeria Arbre 1893-01-01T00:09:21+00:00 Point [2.41167739693, 48.831216573] NaN 12-13 2.411677 48.831217

5 rows × 26 columns
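
Two things in the cells above are worth spelling out. `json_normalize` flattens nested records into columns with dotted names such as `geometry.coordinates`. In the output, `geometry.coordinates` is ordered [longitude, latitude], while `fields.geom_x_y` holds the same pair as [latitude, longitude], which is why index 0 is used for `lng`. The sketch below mimics that behaviour with the standard library only. The `flatten` helper is hypothetical and is not how pandas implements it; the record is a trimmed copy of the first row shown above.

```python
def flatten(record, parent_key='', sep='.'):
    """Flatten nested dicts into one dict with dotted keys (lists are kept as-is)."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Trimmed copy of the first record shown in the output above
record = {
    'datasetid': 'arbresremarquablesparis',
    'fields': {
        'libellefrancais': 'Paulownia',
        'geom_x_y': [48.8542732882, 2.33573525468],
    },
    'geometry': {
        'type': 'Point',
        'coordinates': [2.33573525468, 48.8542732882],
    },
}

row = flatten(record)
# These coordinates are ordered [longitude, latitude]
row['lng'], row['lat'] = row['geometry.coordinates']

print(sorted(row))
print(row['lng'], row['lat'])
```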

In [4]:
from geopandas import GeoDataFrame, points_from_xy

gdf = GeoDataFrame(df, geometry=points_from_xy(df['lng'], df['lat']))
gdf.head()
Out[4]:
datasetid recordid record_timestamp fields.geom_x_y fields.libellefrancais fields.objectid fields.idemplacement fields.arrondissement fields.circonferenceencm fields.hauteurenm ... fields.genre fields.typeemplacement fields.dateplantation geometry.type geometry.coordinates fields.varieteoucultivar fields.complementadresse lng lat geometry
0 arbresremarquablesparis 29a0ea77b5379a1e15b5c34d0ed8bbadb9f5789c 2020-05-08T10:45:25.056000+00:00 [48.8542732882, 2.33573525468] Paulownia 7493 000101001 PARIS 6E ARRDT 295.0 20.0 ... Paulownia Arbre 1999-01-25T01:00:00+00:00 Point [2.33573525468, 48.8542732882] NaN NaN 2.335735 48.854273 POINT (2.33574 48.85427)
1 arbresremarquablesparis 24ab994e9b2b51c4a27394ec52f038c4faa5fc1c 2020-05-08T10:45:25.056000+00:00 [48.8217882346, 2.3228497157] Hêtre 57013 00000174 PARIS 14E ARRDT 310.0 15.0 ... Fagus Arbre 1700-01-01T00:09:21+00:00 Point [2.3228497157, 48.8217882346] ''Atropunicea'' 14-09 2.322850 48.821788 POINT (2.32285 48.82179)
2 arbresremarquablesparis 2138bdaedd6c46f681ff30904f8017a292fc2b3b 2020-05-08T10:45:25.056000+00:00 [48.8648326277, 2.25217460792] Plaqueminier 58739 000501002 BOIS DE BOULOGNE 146.0 15.0 ... Diospyros Arbre 1897-01-01T00:09:21+00:00 Point [2.25217460792, 48.8648326277] NaN 16-13 2.252175 48.864833 POINT (2.25217 48.86483)
3 arbresremarquablesparis 63bc7232fe659c4585923d9e65e5275418c359b9 2020-05-08T10:45:25.056000+00:00 [48.8564741998, 2.39461847904] Platane 70026 001802036 PARIS 20E ARRDT 407.0 23.0 ... Platanus Arbre 1700-01-01T00:09:21+00:00 Point [2.39461847904, 48.8564741998] NaN 148 2.394618 48.856474 POINT (2.39462 48.85647)
4 arbresremarquablesparis 602c7b2fd878c56e5efc532739abb28d86d9f365 2020-05-08T10:45:25.056000+00:00 [48.831216573, 2.41167739693] Cryptomeria 85651 12-13 BOIS DE VINCENNES 122.0 13.0 ... Cryptomeria Arbre 1893-01-01T00:09:21+00:00 Point [2.41167739693, 48.831216573] NaN 12-13 2.411677 48.831217 POINT (2.41168 48.83122)

5 rows × 27 columns

In [5]:
from cartoframes.viz import Layer

Layer(gdf)
Out[5]:

Load data from a GeoJSON file

This example illustrates how to load data from a GeoJSON file using GeoPandas.

In [1]:
from geopandas import read_file

gdf = read_file('http://libs.cartocdn.com/cartoframes/files/sustainable_palm_oil_production_mills.geojson')
gdf.head()
Out[1]:
objectid cartodb_id entity_id latitude longitude audit_stat legal_radi illegal_ra radius_umd radius_for ... carbon_r_3 peat_for_2 peat_for_3 primary_10 primary_11 mill_name parent_com rspo_certi date_updat geometry
0 1 59 ID1822 -1.585833 103.205556 ASA 1 0 0.321764 0.225759 1099 ... 1224 0 29 0.004508 0.135391 Muara Bulian Mill PT Inti Indosawit Subur yes 14-Aug POINT (103.20556 -1.58583)
1 2 153 ID1847 0.077043 102.030838 Renewal Certification 0 0.445960 0.258855 1979 ... 2038 7 523 0.017304 0.321418 Pabrik Kelapa Sawit Batang Kulim POM PT Musim Mas yes 14-Aug POINT (102.03084 0.07704)
2 3 103 ID1720 1.660222 100.590611 Initial Certification 0 0.498531 0.248520 1432 ... 1518 0 476 0.000811 0.193365 Kayangan and Kencana POM PT Salim Ivomas Pratama Tbk yes 14-Aug POINT (100.59061 1.66022)
3 4 216 ID1945 -2.894444 112.543611 ASA 1 0 0.662863 0.186332 226 ... 269 59 66 0.124882 0.189169 PT Sarana Titian Permata POM Wilmar International Ltd yes 14-Aug POINT (112.54361 -2.89444)
4 5 156 ID1553 3.593333 98.947222 Initial Certification 0 0.533668 0.028972 382 ... 412 0 0 0.000216 0.009296 Adolina POM PT Perkebunan Nusantara IV (PERSERO) yes 14-Aug POINT (98.94722 3.59333)

5 rows × 73 columns
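
Conceptually, each GeoJSON feature becomes one row in the table above: its `properties` become columns and its `geometry` fills the `geometry` column. The following standard-library sketch shows that mapping for point features. The `features_to_rows` helper and the inline FeatureCollection are hypothetical; the sample reuses a few values from the first row of the output and is not the actual remote file, and this is not how GeoPandas reads files internally.

```python
import json

# Hypothetical, trimmed FeatureCollection based on the first row shown above
GEOJSON_TEXT = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"entity_id": "ID1822", "mill_name": "Muara Bulian Mill"},
      "geometry": {"type": "Point", "coordinates": [103.205556, -1.585833]}
    }
  ]
}
"""


def features_to_rows(geojson_text):
    """Turn point features into flat dicts with a WKT-style `geometry` value."""
    rows = []
    for feature in json.loads(geojson_text)['features']:
        row = dict(feature['properties'])
        geometry = feature['geometry']
        if geometry['type'] == 'Point':
            # GeoJSON positions are ordered [longitude, latitude]
            lng, lat = geometry['coordinates'][:2]
            row['geometry'] = 'POINT ({} {})'.format(lng, lat)
        else:
            row['geometry'] = None  # other geometry types are out of scope here
        rows.append(row)
    return rows


rows = features_to_rows(GEOJSON_TEXT)
print(rows[0])
```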

In [2]:
from cartoframes.viz import Layer

Layer(gdf)
Out[2]:

Support

Get help or learn about known issues.