How are monthly sales estimates calculated?
Our monthly sales estimates come from a model built on traffic estimates and actual sales data from merchants. We started by training a neural network (a type of machine learning model) on sales figures from public financial disclosures, which are available for large, publicly traded merchants that report their revenue quarterly.
Our model uses traffic estimates (from Alexa) as its primary input and falls back to Common Crawl centrality and PageRank when a store's domain is not covered by Alexa traffic estimates. A few other inputs have a smaller influence on the results.
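To make the fallback concrete, here is a minimal sketch of how an input signal could be chosen for a single store. It is an illustration only, not our production code: the function name, the argument names, the labels, and the rule that both Common Crawl values must be present are all assumptions made for this example.

```python
from typing import Optional, Tuple


def select_popularity_signal(
    alexa_traffic: Optional[float],
    cc_centrality: Optional[float],
    cc_pagerank: Optional[float],
) -> Tuple[str, Tuple[float, ...]]:
    """Return (source label, feature values) for one store domain.

    All names here are hypothetical; None means the source has no
    data for this domain.
    """
    # Primary input: the Alexa traffic estimate, whenever it exists.
    if alexa_traffic is not None:
        return "alexa", (alexa_traffic,)
    # Fallback: Common Crawl centrality and PageRank.
    # Assumption for this sketch: both values must be available.
    if cc_centrality is not None and cc_pagerank is not None:
        return "common_crawl", (cc_centrality, cc_pagerank)
    # No usable popularity signal for this domain.
    return "none", ()
```

For example, select_popularity_signal(None, 0.4, 0.01) would choose the Common Crawl values because no Alexa estimate is available.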
Model parameters were adjusted so that the results are in line with other public sources. For instance, Shopify's 2019 Q3 GPV was approximately $6.2 billion, so we make sure that the sum of our estimates across all Shopify stores is close to that amount.
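One simple way to picture this kind of check is a single scale factor that makes the per-store estimates add up to the published total. This is a simplified stand-in, not a description of how our model parameters were actually tuned. The function name and the example domains are hypothetical, and the sketch assumes the estimates have already been summed over the same period as the published figure.

```python
from typing import Dict


def calibrate_to_total(
    estimates: Dict[str, float], target_total: float
) -> Dict[str, float]:
    """Rescale per-store estimates so that they sum to target_total.

    Simplified illustration: every store is multiplied by the same
    factor, so the relative sizes of the stores do not change.
    """
    raw_total = sum(estimates.values())
    if raw_total <= 0:
        raise ValueError("estimates must sum to a positive amount")
    scale = target_total / raw_total
    return {domain: value * scale for domain, value in estimates.items()}


# Hypothetical raw model outputs for two stores (arbitrary units).
raw = {"a.example": 1.0, "b.example": 3.0}
# Published figure from the text: roughly $6.2 billion.
calibrated = calibrate_to_total(raw, 6.2e9)
```

After rescaling, the two hypothetical stores still differ by a factor of three, but together they add up to the published amount.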
As with any model, the results are only as good as the input. In this case, we're largely at the mercy of Alexa traffic estimates. Alexa's estimates are generally quite good, but there are a few cases where the data is clearly wrong. In the future, we hope to integrate more sources of traffic estimates to improve the results.