Enough With Feel Good Data Science

Your SaaS startup reaches its two-year anniversary, and you lock in a new round of funding. Every measure of customer success is strong. Users report high levels of satisfaction. They log in a lot, they like you on Facebook and they read a lot of your emails. In a survey, 90% said they'd recommend your product to a friend. Investors are impressed. Churn is at a high but acceptable level for a young startup, but over the next six months, it fails to improve. Instead, it slowly creeps up to problematic levels and you can't understand why.

Startups get blindsided like this when they rely on feel good data science: big data analytics that mash up qualitative measurements with quantitative science. Being data-driven is the stated goal of most tech executives, but you can't be data-driven just because you wave your magical data science hands in the air. If you want to really understand what your customers think, and whether they are primed for upselling, conversion or churn, you need to strictly separate qualitative and quantitative data. It's time to discover rather than assume what metrics mean, and it's time to stop dicing customers into imaginary groups.

We intuitively know that qualitative metrics are unscientific, but they look good. When you take a number like average log-ins and arbitrarily give it a weight of 20% in your customer success algorithm, you're converting it into a qualitative metric. This kills the data science and lulls you into a fantasy.

Unfortunately, that is how most data science is conducted today. All sorts of measurements (logins, time spent in the product, engagement with marketing emails, etc.) are given subjective weights.
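To see how much a hand-weighted score depends on opinion, here is a minimal Python sketch. Everything in it is made up for illustration: the two customers, the metric values (assumed to be already scaled to 0..1), and both sets of weights. It only shows that two equally defensible weightings can rank the same customers in opposite order.

```python
# Hypothetical usage metrics for two customers, each already scaled to 0..1.
customers = {
    "acme": {"logins": 0.9, "time_in_product": 0.2, "email_engagement": 0.3},
    "globex": {"logins": 0.3, "time_in_product": 0.8, "email_engagement": 0.6},
}


def health_score(metrics, weights):
    """Weighted sum of metrics; the weights are opinion, not measurement."""
    return sum(weights[name] * value for name, value in metrics.items())


def rank(weights):
    """Customer names ordered from highest to lowest score."""
    return sorted(
        customers,
        key=lambda name: health_score(customers[name], weights),
        reverse=True,
    )


# Two arbitrary weightings that someone could plausibly argue for.
weights_a = {"logins": 0.6, "time_in_product": 0.2, "email_engagement": 0.2}
weights_b = {"logins": 0.2, "time_in_product": 0.5, "email_engagement": 0.3}

print(rank(weights_a))  # acme ranks first
print(rank(weights_b))  # globex ranks first
```

Nothing about the customers changed between the two rankings. Only the weights did, and nobody measured those.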

Companies also rely heavily on self-reported data. Customers are often willing to give their satisfaction levels, rate different experiences and declare whether or not they'd recommend the service to a friend. There's nothing wrong with this data, but if you mash it together and weight it alongside data based on user actions, you spoil the quantitative data.

Stop tricking yourself.

When it comes to understanding a customer's probability of upgrading, continuing to pay for your service or unsubscribing, you cannot equate what people say with what they do. Likewise, you can't impose meaning on quantitative data until you establish correlations between actions.

The whole point of big data is to find patterns and trends independent of opinions. However, drive-by data science (occasionally running large-scale data science projects to uncover correlations) is common and misleading, because the conclusions begin to decay immediately as your customer base, onboarding process, marketing campaigns and other variables change.

An even bigger problem is the practice of pre-assigning meaning to data. For instance, you could (smartly) assume that your most active users are most likely to upgrade. And you could be wrong.

One way to test an assumption like that is to routinely take random samples of SaaS users and split them into three groups: a random control group, the most active users (those who log in most) and an algorithmically selected group identified as most likely to upgrade by applying machine learning to a large number of behavioral inputs for each customer. Then observe.
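The three-group comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `pick_groups` and `upgrade_rate` are hypothetical helper names, the login counts and model scores are synthetic, the machine learning model itself is out of scope (its scores are simply taken as input), and the groups are allowed to overlap, which a real experiment might not want.

```python
import random


def pick_groups(logins, predicted, sample_size, group_size, seed=0):
    """Draw a random sample of users and form three groups from it.

    logins:    dict of user_id -> login count (hypothetical input shape)
    predicted: dict of user_id -> model score for likelihood to upgrade
    Groups may overlap in this sketch.
    """
    rng = random.Random(seed)
    sample = rng.sample(sorted(logins), sample_size)
    return {
        # Random control group drawn from the sample.
        "control": rng.sample(sample, group_size),
        # Users who log in most.
        "most_active": sorted(sample, key=lambda u: logins[u], reverse=True)[:group_size],
        # Users the model scores as most likely to upgrade.
        "model_picked": sorted(sample, key=lambda u: predicted[u], reverse=True)[:group_size],
    }


def upgrade_rate(group, upgraded):
    """Fraction of the group that actually upgraded."""
    return sum(1 for user in group if user in upgraded) / len(group)


# Synthetic data: 100 users with made-up login counts and model scores.
logins = {f"u{i}": i for i in range(100)}
predicted = {f"u{i}": ((i * 37) % 100) / 100 for i in range(100)}

groups = pick_groups(logins, predicted, sample_size=100, group_size=10)

# Recorded after the observation window; made up here.
upgraded = {"u99", "u98", "u5"}

for name, members in groups.items():
    print(name, upgrade_rate(members, upgraded))
```

Run routinely on fresh samples, a comparison like this lets the observed upgrade rates say which group mattered, instead of your assumptions.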
