One of the most impactful applications of spatial data and analytics is in the area of retail site selection. Retailers continue to evaluate their network of brick-and-mortar assets to be better aligned with their consumers’ changing buying behaviors, by opening new stores, testing innovative new formats and reviewing their existing location footprint.
In previous blog posts, we have showcased how retailers can find optimal locations for their stores using a collection of analytical procedures available in our Analytics Toolbox for BigQuery. These procedures solve the revenue prediction and whitespace analysis use cases, ubiquitous in site planning approaches, by letting CARTO users build predictive models, natively in BigQuery, that approximate the revenue a store in a given location will yield.
However, predictive model building is not always the right approach. It requires having enough data points for model training, which is often not the case for smaller retailers or for geographical areas where a retailer has just started their expansion.
In this post, we will present an alternative analytical approach to tackle site selection decision-making based on advanced statistical techniques that can be applied in scenarios where predictive models are not the most suitable solution.
In this example, our area of study is going to be a 30km buffer around the city of Waterloo, Iowa where the retailer Hy-Vee -a successful liquor store chain- is looking to expand past its 5 existing locations. Their existing store locations have been extracted from the publicly available Iowa Liquor Sales dataset. For this use case, all other liquor stores in Waterloo with an annual revenue higher than $35,000 will be considered Hy-Vee’s direct competitors. This information, resulting from the following query, can be found in the table
WITH store_locations AS ( SELECT store_name AS store, SUM(sale_dollars) as sales, ST_GEOGFROMTEXT(ANY_VALUE(store_location)) AS geom FROM `bigquery-public-data.iowa_liquor_sales.sales` WHERE EXTRACT(year FROM date) = 2019 GROUP BY store_name ), stores AS ( SELECT store, sales, IF(store like "%Hy-Vee%", 'Hy-Vee','Competitor') as store_type, geom FROM store_locations WHERE ST_INTERSECTS(geom, ST_BUFFER(ST_GEOGPOINT(-92.335358, 42.497854), 30000)) ) SELECT * FROM stores WHERE sales > 35000 OR store_type = 'Hy-Vee'
Next, we are going to identify the priority areas of opportunity for Hy-Vee as those areas with a high level of commercial activity, where preferably competitors are still not present. For that, we are going to make use of the COMMERCIAL_HOTSPOTS procedure of the Analytics Toolbox, a component of the recently launched Retail domain-specific module.
The COMMERCIAL_HOTSPOTS procedure locates hotspot areas by calculating a combined Getis-Ord Gi* statistic using a uniform kernel over several variables, which can be optionally weighted. In essence, it finds areas where the value of the input variables are considerably higher than the average. For this particular example, we are going to use the following two variables with equal weights; the Total Population and the number of Points of Interest (POIs) for all categories. This data has been extracted from the CARTO Spatial Features dataset in quadkey resolution 15, and are sensible proxies for identifying areas of commercial activity. This dataset is publicly available from CARTO’s Data Observatory, although a sample for the city of Waterloo can also be found in
Finding these potential areas of expansion is as easy as running the following query. Please note we are specifying a very low p-value threshold to keep only those cells of very high statistical significance.
CALL `carto-un`.carto.COMMERCIAL_HOTSPOTS( 'cartobq.docs.waterloo_spatial_features_qk15', 'cartobq.docs.waterloo_commercial_hotspots', 'geoid', 'quadkey', ['population', 'total_pois'], [0.5, 0.5], 1, 0.005 );
The result of this analysis can be visualized on the following maps, where the hotspot areas are represented with a pink outline in the context of the total population (left) and number of POIs (right). Existing stores are represented in red, while competitors in yellow. We have filtered the result of the procedure further to discard those cells with a
combined_gi below the 15% percentile. As we can observe, there are three defined areas where neither Hy-Vee stores nor its competitors are present, and another three areas where there is only one competitor present. These areas could be potential expansion opportunities for the Hy-Vee chain.
This procedure is now available to CARTO users as part of the Analytics Toolbox for BigQuery. They can be run directly from the BigQuery console or from your SQL or Python Notebooks using the BigQuery client.
With the addition of this new procedure to the Analytics Toolbox, we continue to open up advanced spatial analytics in the cloud for both retailers and the CPG brands they distribute. Look out for further announcements on the new functions we are working on, as we continue to expand the availability of analytical features. If you’d like to discuss your own retail business challenges with us, or see the CARTO platform in action, then contact our location intelligence experts for a demonstration. You can also try CARTO for yourself - sign up for our Free 14-Day Trial today.
|This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 960401.|
The Data Appeal Company’s Hannah Babineau - International Partnerships and Marketing Manager - guest blogs for CARTO!Spatial Data
Read any news website in 2022 and there’s one phrase you can’t escape; the “cost of living crisis.” In the UK, living costs have been rising at their fastest rate in 30 yea...Spatial Data
Please fill out the below form and we'll be in touch real soon.