Hey! This content applies only to previous CARTO products

Please check if it's relevant to your use case. In October 2021 we released a new version of our platform.
You can learn more and read the latest documentation at docs.carto.com

Preparing and Formatting Data

A guide containing some tips that you can use to prepare and format your data before importing to CARTO.

CARTO automatically includes guessing functionality during the import process, which is useful when files (or fields of data) are missing required upload information.

Why do you need this?

✅ You want to understand how CARTO works

✅ You want to check if your data are ready for CARTO

✅ You want to understand the different steps involved in a standard data workflow

As a best practice, you can also prepare and format data before connecting it to CARTO. This helps avoid errors and performance issues, even if you are using one of the supported geospatial formats.

Preparing Geocoded Data

When you import a file into your account, CARTO checks if the file contains latitude and longitude column names. If detected, those values are used to automatically geocode your data during the import process.

Behind the scenes, CARTO uses PostgreSQL extensions to programmatically convert your data into geometries, based on services from our data providers. :sparkles:

:-1: If longitude and latitude columns are not found, CARTO searches for IP address, city, and country name columns to geocode your data.

:-1: If no geometry coordinates are found, the file is imported with null values in the_geom column of your dataset.

:-1: Street address data is not automatically converted to geometry data during the import process.

Before importing data, it is recommended to rename the column headers that hold your coordinates to latitude and longitude, so that the_geom column is populated with geometry coordinates. There should be one column for latitude, and one column for longitude. Otherwise, once data is imported, you can geocode it to convert it to geometry coordinates. The Geocode analysis is subject to quota limitations, and extra fees may apply.
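As an illustration of the renaming step, here is a minimal Python sketch that rewrites the header row of a CSV before upload. The alias lists (lat, lon, lng, long) and the function name are assumptions made for this example, not names that CARTO requires or checks for; adjust them to match the headers in your own file.

```python
import csv
import io

# Hypothetical alias lists; edit them to match the headers in your own file.
LAT_ALIASES = {"lat"}
LON_ALIASES = {"lon", "lng", "long"}


def rename_coordinate_headers(csv_text):
    """Return CSV text with coordinate headers renamed to latitude/longitude."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text  # nothing to rename in an empty file

    header = []
    for name in rows[0]:
        key = name.strip().lower()
        if key in LAT_ALIASES:
            header.append("latitude")
        elif key in LON_ALIASES:
            header.append("longitude")
        else:
            header.append(name)  # leave every other column untouched

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])
    return out.getvalue()
```

For example, a file whose header row is name,lat,lng comes back with the header name,latitude,longitude and its data rows unchanged.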

Geocoding street address data consumes credits allocated to your account, and is subject to quota limitations. A set number of credits is available per month, based on your account plan. Each geocode match to an indicated street address consumes credits from your account.

Proper Encoding

If you have a .CSV file, it should be saved with UTF-8 encoding so that the data is imported into CARTO properly. This helps if there are any special characters in your data.
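If your spreadsheet tool cannot export UTF-8 directly, a file can be re-encoded before upload. The following is a minimal Python sketch; the function name is hypothetical, and the default source encoding (latin-1) is only an assumption, so replace it with the encoding your file was actually saved in.

```python
def reencode_to_utf8(src, dst, source_encoding="latin-1"):
    """Copy a text file from src to dst, converting it to UTF-8.

    source_encoding is an assumption: pass the encoding the file
    was really saved in, otherwise special characters will be garbled.
    """
    # newline="" keeps the original line endings exactly as they are.
    with open(src, "r", encoding=source_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)
```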

It is also important to confirm that proper formatting is applied to columns and values. View more tips about preparing CSV data, and other spreadsheets, for CARTO.

Map Projections

Confirm that your data uses a known projection. By default, CARTO uses the EPSG 4326 projection to store geospatial data in your dataset. When data is imported, the_geom column is created and stores the latitude and longitude in a single projection, using the WGS84 coordinate system (EPSG 4326 projection).

You can always work with your data in the Web Mercator projection (EPSG 3857) by using the_geom_webmercator column in your dataset.
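To make the relationship between the two projections concrete, here is a small Python sketch of the standard spherical Web Mercator (EPSG 3857) formula applied to a WGS84 longitude/latitude pair. It is an illustration of the math only, not CARTO's implementation, and the function name is hypothetical.

```python
import math

# Sphere radius used by Web Mercator (the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_webmercator(lon, lat):
    """Convert WGS84 degrees (EPSG 4326) to Web Mercator meters (EPSG 3857).

    Valid for latitudes strictly between -90 and 90 degrees; Web Mercator
    maps are usually clipped to roughly +/-85.05 degrees.
    """
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y
```

Longitude scales linearly, while latitude is stretched more and more toward the poles, which is why the same dataset looks different in the two projections.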

Formatting Shapefiles

Shapefiles, a widespread format for transferring spatial data (created by ESRI), can be imported into CARTO. Shapefiles are collections of three or more associated files. To import them into CARTO, make sure all files (SHP, DBF, SHX, and possibly PRJ) have the same name, and are compressed as a ZIP file.
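The packaging step can be scripted. The sketch below, in Python, checks that the SHP, DBF, and SHX parts share the same base name, then compresses them (plus the PRJ file, if present) into one ZIP file. The function name is hypothetical, and the sketch assumes lowercase file extensions.

```python
import zipfile
from pathlib import Path

REQUIRED = (".shp", ".dbf", ".shx")
OPTIONAL = (".prj",)


def zip_shapefile(shp_path):
    """Bundle a shapefile's same-named parts into a ZIP next to the .shp file."""
    shp = Path(shp_path)

    # Every required part must exist with the same base name as the .shp file.
    missing = [ext for ext in REQUIRED if not shp.with_suffix(ext).exists()]
    if missing:
        raise FileNotFoundError("Missing shapefile parts: " + ", ".join(missing))

    zip_path = shp.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for ext in REQUIRED + OPTIONAL:
            part = shp.with_suffix(ext)
            if part.exists():
                # arcname drops the directory so files sit at the ZIP root.
                archive.write(part, arcname=part.name)
    return zip_path
```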

View more tips about preparing Shapefile data for CARTO.

Other Formatting Tips

Some of these other data formatting tips may help with accuracy and performance in CARTO.

  • If you are importing numeric data for thematic maps, are your data values normalized? As a general rule, be sure to normalize numeric data before importing it to CARTO, since using raw numbers may misrepresent data.

  • To avoid performance issues, it is recommended to sample your data in order to work with a smaller, more manageable dataset, for example when importing OSM data.
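As an example of the normalization tip above, raw counts can be converted to a rate before import, so that large areas do not dominate a thematic map simply because more people live there. This is a minimal Python sketch; the column names (cases, population) and the per-1,000 scale are hypothetical choices for illustration.

```python
def normalize_per_capita(rows, count_key, population_key, per=1000):
    """Add a 'rate' field (count per `per` people) to each row dictionary."""
    normalized = []
    for row in rows:
        population = float(row[population_key])
        # Avoid dividing by zero: rows without population get no rate.
        rate = float(row[count_key]) / population * per if population else None
        normalized.append({**row, "rate": rate})
    return normalized


# Hypothetical data: B has more cases in total, but A has the higher rate.
areas = [
    {"name": "A", "cases": "50", "population": "10000"},
    {"name": "B", "cases": "80", "population": "40000"},
]
rates = normalize_per_capita(areas, "cases", "population")
```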

CHEATSHEET: Importing OpenStreetMap Data

OpenStreetMap data contains features that make up our cities, including neighborhoods, streets, roads, and even lampposts. OpenStreetMap data is contributed by a diverse community, is rich with local knowledge, and frequently updated.

  • OpenStreetMap data can be exported directly from the OpenStreetMap website. OSM data can be quite large, so ensure you are zoomed in enough to limit the size of the dataset that you are downloading.
  • From the Connect Dataset options, drag and drop, or Browse to upload the downloaded OSM file. Ensure the downloaded file is in one of CARTO's supported geospatial formats.
  • As part of using OpenStreetMap data, you must give credit to OpenStreetMap, as per their Copyright and License agreement. Include a link to OpenStreetMap in the Edit metadata options, which is accessible from the dataset name menu option.

What’s next?

Now that you have reviewed and prepared your data, import it to CARTO! :tada:

Sampling your data is also recommended when you are applying long-running analyses in Builder. Ideally, your data is sampled before it is imported to CARTO. Otherwise, you can apply the Subsample percent of rows analysis to a map layer as the first analysis in a workflow. This keeps the size of your data manageable and improves performance.
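If you prefer to sample before importing, a CSV can be reduced with a short script. The Python sketch below keeps the header row and then keeps each data row with a given probability. It is a local stand-in for illustration, not the Builder analysis itself, and the function name and parameters are hypothetical.

```python
import csv
import random


def sample_csv_rows(src, dst, fraction, seed=None):
    """Write a random sample of the rows in src to dst; return rows kept.

    fraction is the probability (0.0 to 1.0) of keeping each data row,
    so the output size is approximate rather than exact.
    """
    rng = random.Random(seed)  # pass a seed for a repeatable sample
    kept = 0
    with open(src, newline="", encoding="utf-8") as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        header = next(reader, None)
        if header is None:
            return 0  # empty input file
        writer.writerow(header)  # always keep the header row
        for row in reader:
            if rng.random() < fraction:
                writer.writerow(row)
                kept += 1
    return kept
```

For example, calling sample_csv_rows with a fraction of 0.1 keeps roughly 10 percent of the rows.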