SiteSeer Blog

Five Things You Need to Know About Data Providers

Written by Lance Blick | Dec 23, 2019 12:15:00 PM

Attention retailers, restaurateurs, operators, and executives of other chain businesses: have you ever received an email like this?

Hi [Your Name], are you still using yesterday’s data and tools to choose sites?  Did you know that in recent studies, our data has been shown to accurately forecast retail performance 97.4% of the time?  

If so, you’ve probably wondered, “What if I’m missing out? Should I question what I’m doing in the way of site selection and chase the latest software and data trends? What if there is something better out there?”

Here at SiteSeer, we are obsessed with these questions. We work with dozens of high-quality data partners and every week we are reviewing and testing new data sources to see if they live up to the hype.  Let’s talk about where data is as good as its marketing and where it falls short. Here are five things you need to know:

  1. One size does not fit all.

    One of the unfortunate trends in data is the fallacy of one-size-fits-all. The pitch goes like this: data is expensive, so if you can only afford one data source, you need Unicorn Data, LLC! The SiteSeer team has built models for 20+ years and uses the latest artificial intelligence algorithms as well as traditional statistical approaches. The reality remains: there is no single source of truth when it comes to site selection, market analysis or sales forecasting. Good forecasting means understanding how supply (competition/sister stores), demand (customer demographics, psychographics, behavior, spending), and your differentiators (site characteristics, store size, and operations) all affect future performance.
  2. Newer is not always better.

    In the tech sector, which data loosely falls under, disruption is the buzzword of the day. The assumption is that new is always better than old, computers are better than humans, and you’re either advancing or failing. The reality is much more complex. We have data sources that, although updated regularly, have been largely unchanged for decades and still prove immensely predictive. We have also tested many newer data sources that offer a new take on an old problem but fall short when tested against real-world problems.
  3. Historical accuracy and future accuracy are not the same thing.

    When using data to model the future, you first “train” the model on historical and current events, then test it against other (validation) samples to estimate how it will work in the real world. When a data provider touts extremely high forecasting accuracy, they are likely saying that the data did a good job of explaining historical events. This is akin to a basketball player making 98% of their free throws in practice and claiming a 98% accuracy rate. It is highly doubtful that the same player would achieve this level of accuracy in real games, with all of their unknowns.
  4. Data is only useful when used properly.

    Most data sources require some level of effort to make them useful. You can build a better model when you clean the data and then wrangle it into more useful and predictive variables. Techniques such as factor analysis, principal component analysis, and feature engineering, which create new variables from combinations of existing ones, also help. Think of each of your data sources as notes that don’t do much on their own but, when formed into chords and strung together into songs, can make beautiful music.
  5. Be clear what problem you are trying to solve.

    Another trend in data is for the provider to sell a dataset without a clear use case in mind. In other words, the data provides an answer in search of a question. This is akin to a pharmaceutical company marketing a new mystery drug and hoping users will tell it what symptoms the drug cures. Many data providers are launching new and exciting technology that will probably have merit in the near future but just isn’t ready for mass consumption.
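
To make the third point concrete, here is a minimal Python sketch of why historical accuracy can flatter a model. Everything in it is an assumption for illustration: the store data is synthetic, the sales formula is made up, and the `predict_nearest` "model" simply memorizes past stores. It is not SiteSeer's methodology or any data provider's.

```python
import random


def make_store(rng):
    """Synthetic store: sales depend on nearby population plus unexplained noise."""
    population = rng.uniform(5_000, 50_000)
    sales = 200_000 + 20 * population + rng.gauss(0, 40_000)
    return population, sales


def predict_nearest(train, population):
    """A model that memorizes history: return the sales of the most similar training store."""
    nearest = min(train, key=lambda store: abs(store[0] - population))
    return nearest[1]


def mean_abs_pct_error(train, stores):
    """Average absolute forecast error, as a fraction of actual sales."""
    errors = [
        abs(predict_nearest(train, population) - sales) / sales
        for population, sales in stores
    ]
    return sum(errors) / len(errors)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
stores = [make_store(rng) for _ in range(200)]
train, validation = stores[:150], stores[150:]  # hold out 50 stores the model never sees

in_sample_error = mean_abs_pct_error(train, train)
holdout_error = mean_abs_pct_error(train, validation)

print(f"Error on stores the model has already seen: {in_sample_error:.1%}")
print(f"Error on held-out validation stores:        {holdout_error:.1%}")
```

Against the stores it was trained on, this model is "perfect" because every store is its own nearest match. The held-out stores are the honest test, and the error there is no longer zero. When a provider quotes an accuracy figure, it is worth asking which of these two numbers it is.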

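The fourth point, turning raw data into more predictive variables, can be sketched in a few lines of Python. The field names and figures below are hypothetical and do not come from any real data provider; the idea is only that two raw columns combined can say more than either one alone.

```python
def engineer_features(trade_area):
    """Combine raw trade-area fields into derived variables (all names are hypothetical)."""
    # Count the proposed store alongside its competitors; this also avoids dividing by zero.
    stores_sharing_demand = trade_area["competitors"] + 1
    return {
        "people_per_store": trade_area["population"] / stores_sharing_demand,
        "spending_per_household": trade_area["annual_spending"] / trade_area["households"],
    }


# Made-up trade area, for illustration only.
example = {
    "population": 30_000,
    "households": 12_000,
    "competitors": 2,
    "annual_spending": 24_000_000,
}
features = engineer_features(example)
print(features)  # {'people_per_store': 10000.0, 'spending_per_household': 2000.0}
```

A raw population count says little about opportunity until it is set against how many stores are competing for those people, which is the kind of combination the "chords" analogy describes.
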
Data is the most expensive part of any analytics engagement. It's expensive to buy and it's time-consuming to collect. Unless you have an unlimited budget, you will need to make trade-offs.

Perhaps creating a robust competitive intelligence data gathering program isn't worth your time and effort, but having excellent customer behavioral metrics is important enough to fit into your budget. The key for most companies is to have a well-rounded data program and, over time, improve the areas that fall short. This tried-and-true approach provides a better risk vs. reward scenario than banking on one trendy new data source.

On the flip side, it is important that you revisit your approaches often and make sure your data sources continue to provide the answers you seek. Choose a software partner that offers flexible data options and helps you separate value from the hype.

Interested in learning more about SiteSeer and our data partners and options? Contact us for a demo.