Site Selection for Retailers Using Commercial Hotspot Analysis


Explore an alternative analytical approach to tackle site selection decision-making based on advanced statistical techniques for retailers.

This post may describe functionality for an old version of CARTO. Find out about the latest and cloud-native version here.
Site Selection for Retailers Using Commercial Hotspot Analysis

One of the most impactful applications of spatial data and analytics is in the area of retail site selection. Retailers continue to evaluate their network of brick-and-mortar assets to be better aligned with their consumers' changing buying behaviors  by opening new stores  testing innovative new formats and reviewing their existing location footprint.

In previous blog posts, we have showcased how retailers can find optimal locations for their stores using a collection of analytical procedures available in our Analytics Toolbox for BigQuery. These procedures solve the revenue prediction and whitespace analysis use cases, ubiquitous in site planning approaches, by letting CARTO users build predictive models - natively in BigQuery - that approximate the revenue a store in a given location will yield.

However, predictive model building is not always the right approach. It requires having enough data points for model training  which is often not the case for smaller retailers or for geographical areas where a retailer has just started their expansion.

In this post, we will present an alternative analytical approach to tackle site selection decision-making based on advanced statistical techniques that can be applied in scenarios where predictive models are not the most suitable solution.

In this example, our area of study is going to be a 30km buffer around the city of Waterloo, Iowa where the retailer Hy-Vee - a successful liquor store chain - is looking to expand past its 5 existing locations. Their existing store locations have been extracted from the publicly available Iowa Liquor Sales dataset. For this use case, all other liquor stores in Waterloo with an annual revenue higher than $35 000 will be considered Hy-Vee's direct competitors. This information, resulting from the following query  can be found in the table

WITH store_locations AS (
 SELECT store_name AS store  SUM(sale_dollars) as sales  ST_GEOGFROMTEXT(ANY_VALUE(store_location)) AS geom
 FROM `bigquery-public-data.iowa_liquor_sales.sales`
 WHERE EXTRACT(year FROM date) = 2019
 GROUP BY store_name ) 

stores AS (
 SELECT store  sales  IF(store like "%Hy-Vee%"  'Hy-Vee' 'Competitor') as store_type 
 geom FROM store_locations
 WHERE ST_INTERSECTS(geom  ST_BUFFER(ST_GEOGPOINT(-92.335358  42.497854)  30000))

SELECT *  FROM stores
WHERE sales > 35000 OR store_type = 'Hy-Vee'

Next, we are going to identify the priority areas of opportunity for Hy-Vee as those areas with a high level of commercial activity, where preferably competitors are still not present. For that, we are going to make use of the COMMERCIAL_HOTSPOTS procedure of the Analytics Toolbox, a component of the recently launched Retail domain-specific module.

The COMMERCIAL_HOTSPOTS procedure locates hotspot areas by calculating a combined Getis-Ord Gi* statistic using a uniform kernel over several variables, which can be optionally weighted. In essence, it finds areas where the value of the input variables are considerably higher than the average. For this particular example, we are going to use the following two variables with equal weights; the Total Population and the number of Points of Interest (POIs) for all categories. This data has been extracted from the CARTO Spatial Features dataset in quadkey resolution 15  and are sensible proxies for identifying areas of commercial activity. This dataset is publicly available from CARTO’s Data Observatory, although a sample for the city of Waterloo can also be found in

Finding these potential areas of expansion is as easy as running the following query. Please note we are specifying a very low p-value threshold to keep only those cells of very high statistical significance.

   ['population'  'total_pois'] 
   [0.5  0.5] 

The result of this analysis can be visualized on the following maps, where the hotspot areas are represented with a pink outline in the context of the total population (left) and number of POIs (right). Existing stores are represented in red, while competitors in yellow. We have filtered the result of the procedure further to discard those cells with a combined_gi below the 15% percentile. As we can observe, there are three defined areas where neither Hy-Vee stores nor its competitors are present, and another three areas where there is only one competitor present. These areas could be potential expansion opportunities for the Hy-Vee chain.

This procedure is now available to CARTO users as part of the Analytics Toolbox for BigQuery. They can be run directly from the BigQuery console or from your SQL or Python Notebooks using the BigQuery client.

Banner promoting CARTO Workflows with the text "Prefer a low-code approach? Try Workflows" and a screenshot of an example workflow

With the addition of this new procedure to the Analytics Toolbox, we continue to open up advanced spatial analytics in the cloud for both retailers and the CPG brands they distribute. Look out for further announcements on the new functions we are working on  as we continue to expand the availability of analytical features. If you’d like to discuss your own retail business challenges with us, or see the CARTO platform in action, then contact our location intelligence experts for a demonstration. You can also try CARTO for yourself - sign up for our Free 14-Day Trial today.

EU Flag This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 960401.