CARTO's Use of Foreign Data Wrappers

Summary

Very different databases can be linked using this PostgreSQL feature.

This post may describe functionality for an old version of CARTO. Find out about the latest and cloud-native version here.
CARTO's Use of Foreign Data Wrappers

PostgreSQL is packed full of under-appreciated features (pg_trgm  full-text search  logical replication  ranges  to name a few) but I'd like to highlight one feature that we use under the covers in CARTO more than you might expect: foreign data wrappers.

Foreign data wrappers (FDW for short) allow developers to expose external sources of data as tables inside PostgreSQL. For data sources that are already tabular  FDW just looks like a mirror of a remote table (Oracle FDW  MySQL FDW  etc)  where local queries are magically fulfilled by the remote server. For other data sources (Twitter FDW  passwd FDW  random number FDW  etc) the FDW developer maps the external source into a tabular shape  where queries can be evaluated against the source and still returned as a PostgreSQL tuple set.

CARTO's use of FDW is pretty conventional:

  • We use postgres_fdw to link data in remote databases to users Carto databases  and
  • We use odbc_fdw to expose data in customer's operational databases to their CARTO instances.

In both cases  the FDW connection allows us to provide data access without incurring the cost and complication of replicating that actual data from one database to another.

Push Queries to Remote Servers Faster with postgres_fdw

The native postgres_fdw ships with PostgreSQL and is naturally the most full-featured FDW implementation -- there is an almost perfect mapping of features from one instance of PostgreSQL to another  so the FDW can be very clever about pushing queries to the remote server.

During development  when we joined local spatial data with remote data we found that PostGIS types and functions weren't being sent to the remote server. That rendered remote data access impossibly slow.

To achieve our ends  we had to extend postgres_fdw to support the PostGIS geometry types and functions  so that spatial queries could be sent to the remote server.

In order to get our enhancements accepted into the PostgreSQL community  we then generalized them to support any PostgreSQL extended type.

When creating a foreign server definition with spatial data  just add an extensions option to let FDW know that it is safe to ship PostGIS objects and functions to the remote server.

CREATE SERVER foreign_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (
        host '192.83.123.89'  
        port '5432'  
        dbname 'foreign_db' 
        extensions 'postgis'
        );

With the "extension shipping"  it's possible to make use of any custom type or function over a postgres_fdw connection as long as it is a part of an extension that is installed in both the local and remote databases.

Schedule Live Data Syncs with odbc_fdw

CARTO Database Connectors allow users to connect their CARTO instances directly to remote databases  with queries that can be refreshed on a schedule.  For folks with live data they want to feed into analysis and display  the database connectors can be very useful!

Under the covers  the connectors are setting up and managing odbc_fdw connections between the CARTO instance and the remote databases. By using the generic ODBC driver as the base for our connectors  we can support connections to

  • MySQL
  • PostgreSQL
  • SQL Server  and
  • Apache Hive.

In order to keep up with new PostgreSQL releases  as we upgrade our infrastructure to the latest version  we have taken on the maintenance of the odbc_fdw.