Glossary

Here we have collected all the key terms you need to deepen your understanding of spatial data science and location intelligence.

A

AI Agent

An intelligent system that can autonomously perform tasks, make decisions, or interact with users.

Learn more

Agglomerative Clustering

A type of hierarchical clustering in which clusters are built from the bottom up. The algorithm starts with each object in its own cluster, then recursively merges (agglomerates) clusters according to a 'linkage strategy', such as minimizing the sum of squared distances within a cluster.
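
As a rough illustration of the bottom-up merging described above, the following sketch repeatedly merges the pair of clusters whose union adds the least within-cluster sum of squared distances. The function names `sse` and `agglomerate` and the sample points are hypothetical, and the brute-force search is meant for small inputs only.

```python
def sse(cluster):
    """Sum of squared distances from each point to the cluster centroid."""
    n = len(cluster)
    cx = sum(p[0] for p in cluster) / n
    cy = sum(p[1] for p in cluster) / n
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in cluster)


def agglomerate(points, n_clusters):
    """Merge clusters bottom-up until only n_clusters remain."""
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair whose merge increases the within-cluster SSE the least.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] + clusters[j]
                cost = sse(merged) - sse(clusters[i]) - sse(clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
result = agglomerate(points, n_clusters=3)
```

Stopping at a target number of clusters is one possible design choice; a full hierarchical implementation would usually record every merge so the hierarchy can be cut at any level.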

Learn more

Alteryx

Alteryx provides a user-friendly interface to perform tasks such as data cleansing, transformation, and predictive analytics without requiring extensive coding skills.

Learn more

Analytics Toolbox

CARTO’s Analytics Toolbox is a set of user-defined functions (UDFs) and Stored Procedures to unlock Spatial Analytics directly on top of your cloud data warehouse platform.

Learn more

Artificial Intelligence (AI)

In spatial analytics, AI can be used to uncover trends in location data, automate decision-making, and optimize everything from delivery routes to urban planning.

Learn more

B

Bayes' Theorem

Provides a way to calculate the probability of a hypothesis based on its prior probability, the probabilities of observing various data given the hypothesis, and the observed data itself.
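
A small worked example may help. The sketch below applies the theorem to a binary hypothesis; the function name `posterior` and all of the probabilities are hypothetical values chosen for illustration.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(H | D) for a binary hypothesis H and observed data D.

    prior               -- P(H)
    likelihood          -- P(D | H)
    false_positive_rate -- P(D | not H)
    """
    # Total probability of observing D under both hypotheses.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a rare hypothesis (1%) and a fairly reliable signal.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

With these assumed inputs the posterior is only about 15%, which shows how strongly a low prior can outweigh a seemingly reliable observation.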

Learn more

C

Cloud Data Warehouse

A specialized database system hosted and managed as a service in a cloud computing environment. It serves as a centralized repository for storing and managing structured, semi-structured, and unstructured data from various sources within an organization.

Learn more

Cloud Native

Cloud native refers to a modern approach in data storage and architecture, as well as software development and deployment that harnesses the capabilities of cloud computing environments to build, deploy, and manage applications.

Learn more

Cross-Validation

Model validation technique for assessing how the results of a statistical model will generalize to new data.
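
One common variant is k-fold cross-validation, in which the data is split into k parts and each part is held out once for testing. The sketch below only generates the index splits; the function name `k_fold_indices` is hypothetical, and it assumes the samples are already in random order (no shuffling is done).

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


folds = list(k_fold_indices(n_samples=10, k=3))
```

A model would then be fitted on each `train` split and scored on the matching `test` split, and the k scores averaged to estimate how well it generalizes.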

Learn more

D

DBSCAN

A density-based clustering method that groups data points that are close to each other, based on a distance metric and a minimum number of data points.
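
To make the two parameters concrete, here is a minimal brute-force sketch of the algorithm. The names `dbscan`, `eps` (the distance threshold), and `min_pts` (the minimum number of points) are illustrative, the sample points are made up, and Euclidean distance is assumed as the metric.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# Hypothetical sample data: two dense groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Unlike methods that require a cluster count up front, the number of clusters here falls out of the data, and isolated points are labeled as noise.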

Learn more

Data Lake

A data lake is a collection of data or files, generally stored in cloud file storage or blob infrastructure. The data may be raw or processed, and may or may not be clean.

Learn more

Data Warehouse

A data warehouse is an OLAP (Online Analytical Processing) system that can distribute query workloads across compute infrastructure, and where data is organized in a structured manner to make it ready to analyze.

Learn more

Databricks

The Databricks Data Intelligence Platform simplifies the process of building and deploying big data applications. Databricks was founded by the creators of Apache Spark, Delta Lake, and MLflow.

Learn more

Deck.gl

deck.gl is one of the most popular Open Source map visualization libraries and is the preferred library to use with CARTO.

Learn more

E

Esri

Esri is a Geographic Information System (GIS) technology company. It develops GIS software, provides geospatial solutions, and offers consulting services for various industries such as government, natural resources, utilities, and many others.

Learn more

F

G

GPKG

GeoPackage (GPKG) is an open standard vector file format developed by the Open Geospatial Consortium (OGC).

Learn more

Generative AI

For spatial data science, Generative AI can generate diverse and augmented datasets, improve model training, simulate scenarios, and assist in spatial analysis and modeling by producing realistic spatial information.

Learn more

GeoJSON

GeoJSON is a JSON-based format designed for encoding geographic data structures. It supports various geometry types such as points, lines, polygons, and multi-geometries.
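
As a brief illustration, the sketch below builds a single point Feature as a Python dictionary and round-trips it through the standard `json` module. The place name and coordinates are illustrative values; note that GeoJSON positions are written in longitude, latitude order.

```python
import json

# A GeoJSON Feature pairs a geometry with free-form properties.
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-3.7038, 40.4168],  # [longitude, latitude], illustrative
    },
    "properties": {"name": "Madrid"},
}

# Serialize to text and parse it back, as a file or API exchange would.
text = json.dumps(feature)
parsed = json.loads(text)
lon, lat = parsed["geometry"]["coordinates"]
```

Because the result is ordinary JSON, any JSON-capable tool can read it; only the `type`, `geometry`, and `coordinates` conventions make it geographic.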

Learn more

GeoParquet

GeoParquet is a project aimed at extending the Parquet file format to directly support geometry data, eliminating the need for later translation and reducing computational costs.

Learn more

Geocoding

Geocoding is the process of assigning longitude and latitude values to street addresses, whilst reverse geocoding is the opposite: identifying the addresses of locations. You can use geocoding to position a marker on a map.

Learn more

Geographic Coordinates

Geographic coordinates, also known as unprojected coordinates, use latitude and longitude to specify a location on the Earth’s curved surface. These are stored as a geography field type.
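
Because these coordinates sit on a curved surface, distances between them cannot be measured with plain Cartesian geometry. The sketch below uses the haversine formula as one common approximation; it assumes a spherical Earth with a mean radius of 6371 km, and the function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the way around the equator.
quarter = haversine_km(0.0, 0.0, 90.0, 0.0)
```

Spatial databases typically perform this kind of spherical or ellipsoidal calculation internally when working with geography types, so the formula here is only meant to show the idea.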

Learn more

Geographically Weighted Regression (GWR)

Geographically Weighted Regression (GWR) quantifies how the strength of the relationships between a target variable and its explanatory variables varies across space.

Learn more

Google Cloud's BigQuery

Google Cloud's BigQuery is the serverless, cost-effective, and multi-cloud data warehouse offered by Google Cloud Platform (GCP).

Learn more

H

HEAVY.AI

HEAVY.AI is a San Francisco-based company that specializes in providing advanced analytics solutions for businesses and government organizations.

Learn more

HTML

HTML is used for creating the structure and content of web pages. It plays a crucial role in embedding interactive maps, geospatial visualizations, and geospatial data in web applications.

Learn more

Hotspot Analysis

Hotspot analysis enables the identification of clusters of high or low values of data on a map, going beyond visual representation.

Learn more

I

Isoline Map

An isoline map is a way of presenting numerical data cartographically, by connecting points with similar values with lines or polygons, helping readers recognize geographical patterns and relationships, such as travel time catchments.

Learn more

J

JavaScript

JavaScript is often employed for creating interactive and visually appealing web maps. It allows developers to manipulate and visualize spatial data using libraries such as React and deck.gl.

Learn more

K

KML

KML stands for Keyhole Markup Language and it is an XML-based file format used to display geographic data.

Learn more

Kinetica

Kinetica is a real-time GPU-accelerated analytic database for big data. Kinetica leverages GPUs and modern many-core CPUs to improve performance.

Learn more

Kriging

Spatial interpolation method for predicting missing values, taking into account the distance and spatial correlation of known data points.

Learn more

L

Lake House

The lake house architecture combines the data lake and data warehouse approaches, with the data lake serving as the storage layer for the data warehouse and the warehouse acting as the scalable compute engine. In this architecture, the separation between compute and storage is very clear.

Learn more

Large Language Model (LLM)

A type of generative AI trained on massive datasets to understand and produce human-like language.

Learn more

Location Intelligence

Location Intelligence (LI) is the methodology of deriving insights from location data to answer spatial questions. Location Intelligence goes beyond simple data visualization on maps, to analyzing location data as an integral part of a business or societal problem.

Learn more

M

Machine Learning (ML)

A subset of AI where algorithms learn from data to make predictions or decisions, improving over time without being explicitly programmed.

Learn more

Map Software

Computer programs or applications designed to create, display, and manipulate spatial information in the form of maps.

Learn more

Mapbox

Mapbox enables users to create and integrate custom maps into their applications and websites. It provides a range of tools and APIs for map design, geocoding, search, navigation, and data visualization.

Learn more

Markov Chain Monte Carlo (MCMC)

Class of simulation methods used to approximate the posterior distribution by randomly sampling in a probabilistic space.
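
The random-walk Metropolis algorithm is one of the simplest members of this class. The sketch below samples from a one-dimensional target given only its log-density; the function name `metropolis`, the step size, and the standard normal target are illustrative choices, not a recommendation for real models.

```python
import math
import random


def metropolis(log_density, start, n_samples, step=1.0, seed=0):
    """Draw n_samples from a 1-D target using random-walk Metropolis."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    x = start
    samples = []
    for _ in range(n_samples):
        # Propose a nearby point by adding Gaussian noise.
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_density(proposal) - log_density(x)
        # Always accept uphill moves; accept downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current value
    return samples


# Illustrative target: an unnormalized standard normal log-density.
samples = metropolis(lambda x: -0.5 * x * x, start=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Only ratios of the density are needed, which is why the method suits posterior distributions whose normalizing constant is unknown.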

Learn more

N

Natural Language Processing (NLP)

A branch of AI that enables computers to understand, interpret, and generate human language.

Learn more

O

P

Projected Coordinates

Projected coordinates, also referred to as geometries, use a two-dimensional Cartesian coordinate system to represent locations on a flat surface, such as a map or a plane. These are stored as a geometry field type.

Learn more

Python

Python is a popular programming language that can be used for spatial analysis through libraries such as GeoPandas, Shapely, and PySAL, which provide functionality to manipulate, visualize, and analyze spatial data.

Learn more

Q

R

R (Programming Language)

R is a powerful programming language specifically designed for statistical computing and data analysis.

Learn more

Raster Data

Raster data is represented as a grid of cells or pixels, with each cell containing a value or attribute. This grid-based structure is well suited to continuous values such as elevation or temperature, and to satellite imagery.

Learn more

S

SQL

Structured Query Language (pronounced either as “sequel” or “ess-queue-ell,” depending on who you talk to) is used for managing and querying relational database management systems (RDBMS), as well as running analysis.

Learn more

Snowflake

Snowflake is a single, global platform that powers the Data Cloud. Its architecture separates computing and storage, allowing independent scaling and cost optimization.

Learn more

Spatial Data Catalog

Use the Spatial Data Catalog to browse and filter our collection of public and premium datasets available in the Data Observatory.

Learn more

Spatial Data Science

Spatial data science (SDS) is a subset of Data Science that focuses on the unique characteristics of spatial data, moving beyond simply looking at where things happen to understand why they happen there.

Learn more

Spatial Indexes

Spatial Indexes are global grids; in that sense, they are a lot like raster data. However, they render a lot like vector data: each “cell” in the grid is an individual feature that can be interrogated. They are well suited to storing, analyzing, and rendering big spatial data because each cell's location is encoded as a short string, rather than a large geometry field.
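
To show how a grid cell can be reduced to a short string, the sketch below computes a quadkey, one widely used scheme based on Web Mercator map tiles. This is an assumed example scheme, since the entry above does not name a specific index, and the function names are hypothetical.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int):
    """Return the (x, y) Web Mercator tile containing a lon/lat point at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still fall inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Interleave the bits of x and y into one base-4 digit per zoom level."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


key = tile_to_quadkey(3, 5, 3)
```

A useful property of this encoding is that a cell's key starts with the key of its parent cell, so coarser and finer grid levels can be related with simple string prefix operations.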

Learn more

Spatial SQL

Spatial SQL uses the same elements and structure as normal SQL but allows you to work with geospatial data types such as geometries and geographies.

Learn more

T

Tileset

A tileset is a representation of spatial data optimized for map visualizations. Tilesets are designed to facilitate faster loading, optimized bandwidth usage, and seamless navigation in map visualizations.

Learn more

U

V

Vector Data

Vector data represents geographic features as discrete points, lines, and polygons. It has a geometry-based structure in which each element represents a discrete geographic object, such as a road, building, or administrative boundary.

Learn more