Playbook to Assess the Value of your Data
Updated: Jan 22
I did not find much information on how to assess the value of data, so I decided to share a practical way to do it. It mixes top-down and bottom-up approaches. Scoping which data points and segments must be covered is top-down, while putting a monetary value on each data point is bottom-up.
Describe your Segments, Competitors and Alternative Data
The first step is to write down the customer segments to target. For each of them, define at a high level what information is valuable for their business. It is usually best to get this clear before collecting data, processing it and exposing APIs. For instance, John Deere's primary revenue comes from tractors and large agricultural machinery. They now collect large amounts of sensor data, so once the data is aggregated they know which crop categories are going to be produced the most. They could sell yield information by crop category to farmers, but also to traders who want to anticipate stock prices. Farmers would learn which crop category is the most profitable to grow, and traders would get better predictions of the expected stock market price. Your customer segments can include your existing customers, but also new customers outside your traditional segments or customer base.
Then comes the market analysis phase: what alternative data is available on the market, and who provides it? Product managers are usually best placed to work on this. Unlike products, software and services, data is unique (value data as a working skill). Alternative data means data that could fulfill similar customer needs, so this analysis should not be limited to similar data.
After product managers have narrowed down the overall value they could bring to a market segment, they need to understand who the potential competitors are and find the right use cases for the targeted segments. For each customer segment, describe between one and three business cases. I recently did this exercise with an API providing individuals' reviews and ratings of hotels. I found six market segments: hotel associations, tourist authorities, hotel chains, banks, corporate travel management services, and enterprise travel booking tools. For each of them, I found between one and three business cases relevant to the data, thirteen business cases in total. Initially, however, I had five use cases for the Hotel segment, so I had to review that segment and define two sub-segments (hotel chains and hotel associations). If you have more than three business cases per segment, your segment definition may be too wide and need reviewing. It is important to keep this in mind, as it will have a direct impact on your API product and the way you communicate about it later on.
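The "no more than three business cases per segment" check can be sketched in a few lines of Python. This is only an illustration: the function name, the limit constant and the business case labels are placeholders I made up, not the cases from the actual exercise.

```python
# Hypothetical limit taken from the rule of thumb above.
MAX_CASES_PER_SEGMENT = 3


def segments_to_split(cases_by_segment: dict[str, list[str]],
                      limit: int = MAX_CASES_PER_SEGMENT) -> list[str]:
    """Return the segments that carry more business cases than the limit."""
    return [seg for seg, cases in cases_by_segment.items() if len(cases) > limit]


# Placeholder business cases; only the counts matter here.
cases_by_segment = {
    "Hotels": ["case A", "case B", "case C", "case D", "case E"],
    "Banks": ["case F", "case G"],
    "Tourist authorities": ["case H"],
}

print(segments_to_split(cases_by_segment))  # ['Hotels']
```

A segment flagged this way is a candidate for being split into sub-segments, as happened with the Hotel segment.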
The next step is to assess the value of the data points provided by the API for each business case. Quite often, the number of data points and potential use cases is huge, so it is hard to know where to start. Prioritize your work by selecting the one or two use cases that emerged as the most promising during the first analysis. Then dig into the data points.
Apply the Pareto Rule
The Pareto rule is also known as the law of the vital few or the principle of factor sparsity. It states that roughly 80% of the effects come from 20% of the causes.
Another way to look at it is that 80% of the value is provided by 20% of your data.
If the data points making up that 20% number in the hundreds, select a handful of them to simplify the analysis. In my experience, for any API only a few data points are truly valuable for a specific segment and its related use case.
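One way to apply the rule is to rank data points by a rough value estimate and keep the smallest set that covers about 80% of the total. The sketch below assumes you already have such estimates. The data point names and scores are invented for illustration, and `vital_few` is a hypothetical helper, not an established function.

```python
def vital_few(value_by_data_point: dict[str, float], share: float = 0.8) -> list[str]:
    """Return the top-ranked data points whose combined value reaches `share` of the total."""
    total = sum(value_by_data_point.values())
    ranked = sorted(value_by_data_point.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0.0
    for name, value in ranked:
        if running >= share * total:
            break  # the points already selected cover the target share
        selected.append(name)
        running += value
    return selected


# Invented relative value scores for a hotel review API.
scores = {
    "rating": 55,
    "review_text": 30,
    "location": 6,
    "review_date": 4,
    "price_band": 3,
    "reviewer_country": 2,
}

print(vital_few(scores))  # ['rating', 'review_text']
```

The scores themselves remain a judgment call. The code only makes the cut-off explicit.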
The selected data points can be of various types: geolocation, time relevance, interval, string value, text, etc. For each data point that potentially provides the most value, describe at least three results from which you could infer different values for your customers. Below are ideas on how to pick the three sample results for a Number, Location, or Text value. The idea is to take, objectively, two boundary (extreme) results and one average result, to understand how each affects the processes and/or the business of data buyers. The list below is just one example of how to look at it. Other views can be very relevant, depending on how well the person doing the analysis knows the available data.
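For a numeric data point, the three sample results could be picked as sketched here: the lowest observed value, the highest, and the observed value closest to the mean. The function name and the ratings are made up for the example. Location and text values would need their own notion of "boundary" and "average".

```python
from statistics import mean


def three_sample_results(values: list[float]) -> dict[str, float]:
    """Pick the lowest, the highest, and the observed value closest to the mean."""
    if not values:
        raise ValueError("need at least one observed value")
    avg = mean(values)
    return {
        "low": min(values),
        "typical": min(values, key=lambda v: abs(v - avg)),
        "high": max(values),
    }


hotel_ratings = [1.5, 3.8, 4.1, 4.4, 4.9]  # made-up ratings on a 5-point scale
print(three_sample_results(hotel_ratings))
# {'low': 1.5, 'typical': 3.8, 'high': 4.9}
```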
Analyze Data Points Value
Once you have the boundary results, you can start analyzing the value for each customer segment. This matrix helps with the analysis.
Analyzing the five most important data points gives a reliable picture of the value provided to data buyers.
In practice, this means filling in the table fifteen times for each business case (five data points with three results each). It also forces data sellers to think about how data buyers will use their data and what value it brings them.
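The fifteen analyses per business case can be laid out as an empty grid to fill in. This is a minimal sketch: the data point names, the result labels, the business case and the `estimated_value` field are all assumptions for illustration.

```python
from itertools import product

# Hypothetical top five data points and the three sample results per data point.
data_points = ["rating", "review_text", "location", "review_date", "price_band"]
results = ["low boundary", "average", "high boundary"]


def analysis_grid(business_case: str) -> list[dict[str, object]]:
    """One row per (data point, sample result); the value is filled in during the analysis."""
    return [
        {
            "business_case": business_case,
            "data_point": dp,
            "result": r,
            "estimated_value": None,  # monetary value to be estimated by the analyst
        }
        for dp, r in product(data_points, results)
    ]


grid = analysis_grid("benchmark a hotel chain against competitors")
print(len(grid))  # 15
```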
This analysis can be a good preparation or supporting document for exploring data value with initial prospects. However, the monetary value you provide does not translate directly into their willingness to pay, although it can of course help in price discussions with customers. Once you have a clearer idea of the monetary benefit you would provide to each targeted segment, depending on the data results for each business case, the willingness-to-pay discussion begins.
From Value to Willingness to Pay
Now that you have a good assessment of the value you provide, the next step is to assess how much the customers in each segment are ready to pay.
To find the right balance between market share and revenue per customer, the best approach is to ask the four classic willingness-to-pay questions:
At what price would you consider this data to be so expensive that you would not consider buying it? (Too Expensive)
At what price would you consider this data to be priced so low that you would feel the quality could not be very good? (Too cheap)
At what price would you consider this data to be getting expensive, so that it is not out of the question but you would have to give some thought to buying it? (Expensive/High side)
At what price would you consider this data to be a bargain - a great buy for the money? (Cheap/Good value)
Then, once you have enough replies (how many depends heavily on the business case and the potential customers), plot the answers on a graph like the one above to map out your price range. Based on this graph, a price of $150 gives you the highest market share and a price of $500 the highest return per customer.
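These four questions match what is commonly called the Van Westendorp Price Sensitivity Meter. Assuming that method, the sketch below shows one common way to read a price range from the answers: the lower bound is where the rising "expensive" curve meets the falling "too cheap" curve, and the upper bound is where the rising "too expensive" curve meets the falling "bargain" curve. The survey answers are invented, the function names are mine, and other variants of the method define the curves slightly differently, so this is not necessarily how the graph above was built.

```python
def share_at_or_above(answers, price):
    """Share of respondents whose stated price is at or above `price` (falling curve)."""
    return sum(a >= price for a in answers) / len(answers)


def share_at_or_below(answers, price):
    """Share of respondents whose stated price is at or below `price` (rising curve)."""
    return sum(a <= price for a in answers) / len(answers)


def first_crossing(prices, rising, falling):
    """First candidate price where the rising curve meets or exceeds the falling one."""
    for p in prices:
        if share_at_or_below(rising, p) >= share_at_or_above(falling, p):
            return p
    return None  # the curves never cross on this price grid


def acceptable_price_range(too_cheap, bargain, expensive, too_expensive, prices):
    lower = first_crossing(prices, expensive, too_cheap)    # point of marginal cheapness
    upper = first_crossing(prices, too_expensive, bargain)  # point of marginal expensiveness
    return lower, upper


# Invented answers from five respondents, in dollars (one list per question).
too_cheap = [50, 100, 150, 200, 250]
bargain = [150, 250, 350, 450, 600]
expensive = [200, 300, 400, 500, 700]
too_expensive = [400, 450, 600, 800, 1000]

price_range = acceptable_price_range(
    too_cheap, bargain, expensive, too_expensive, prices=range(50, 1001, 50)
)
print(price_range)  # (250, 450)
```

With real data you would want many more respondents per segment and a finer price grid. Pricing near the lower bound favors market share, while pricing near the upper bound favors revenue per customer.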
Next, we will look at how to choose and implement the right subscription business model for an API product.