Hey there, it’s Jacob at Retention.Blog 👋
I got tired of reading high-level strategy articles, so I started writing actionable advice I would want to read.
Every Tuesday I share practical learnings you can apply to your business.
What’s a cohort?
A cohort is just a fancy name for a group of users.
Cohort analysis is used to understand how different user groups engage with your product over time.
What types of cohorts are there?
Time-based cohorts
Uses acquisition week/month/year
Behavior-based cohorts
Purchasers vs non-purchases
Types of features used
Engagement level
Campaign recipients
Attribute-based cohorts
Use demographics like geo or age
Time-based cohort analyses are some of the most common.
An example could be looking at groups of users who all joined in the same week.
What users joined in the week of May 1st?
What users joined in the week of May 8th?
And then what did those groups of users do in the subsequent weeks after joining?
Why should I care?
These are often used to understand different acquisition cohorts.
By looking at the behavior of all the users who joined in a specific week, you can understand how the changes you’re making are improving or hurting your retention.
Or you can look back in time to see how campaigns you launched in the past affected the longer-term retention of those users.
Grouping users into cohorts allows you to isolate changes in conversion or behavior. Looking at averages or data in aggregate will usually be misleading and not tell you the whole story.
Let’s look at a simple cohort analysis chart:
This chart looks at new user retention grouped by week.
Column A shows the signup week of the cohort
Column B shows how many new users signed up each week
Columns C - L show what percentage of the initial cohort who joined that week are still active in the product.
My favorite part of this chart is the colors (yes, I’m a child). It’s so easy to glance at the chart and have quick takeaways without even reading the numbers!
What do we see?
The first question we should ask is what happened the week of June 19th?
Looking deeper, we see that week 1 retention was decreasing for the three previous weeks. Maybe a product change was released in the week of May 29th that was fully rolled out by the week of June 19th?
This chart doesn’t tell you specifically what’s wrong or what’s going well, but it shows you where to start looking. It’s a great high-level view of your product retention.
This same chart could be segmented in different ways to understand not only the performance of new users but also by something like geo or customer type.
Can we look differently at the same data?
Plotting this same retention data with a line chart can help us see when our retention stabilizes.
Ideally, these lines start to flatten out. This means we have a stable base of users and over time, our user base will grow.
At a glance, it looks like around week 10 we’re no longer losing a sizable amount of users.
Depending on our product, that could be okay. Or it could mean we’re losing too many users.
For example, for a freemium mobile app, this could be amazing retention. For a higher-priced B2B SaaS, this likely wouldn’t be good enough. A lot depends on your industry and the type of product.
What’s good retention for your product? Check out this masterpiece from Lenny’s Newsletter.
Another type of chart I’ll use is for tracking cohorts of users who’ve been exposed to different product experiments, received specific campaigns, or were in holdout groups. Looking at the behavior of these different groups over time can help decision-making.
It’s common to run A/B tests and look at the results after a week or 30 days, but it’s less common to launch an experiment or marketing campaign and track the cohorts for months after to understand long-term behavior.
We don’t always have the luxury of waiting weeks or months to make a product or marketing decision, but this is something we can set up to look back at in a month or two.
What if we get short-term gains, but at the sacrifice of longer-term retention and we never found out?
Or if a campaign was successful at first, but we launched another campaign that caused it to have a diminished impact.
Tracking user cohorts receiving certain treatments over a longer time than your test period can be valuable. It’s not necessary all the time, but a useful tool to pull out when you need it.
What else should I think about when building cohort analysis?
Depending on your product, you may want to look at different retention windows.
Daily retention: Measure user activity each day.
Count the users active on day 1, day 2, day 3, and so on. This is more strict because a user has to use the app each day. Better for products with higher engagement and activity levels. Social media products are usually measured by daily active users.
Weekly retention: Measure if a user has activity during any week.
If they open the app during week 1 and week 2, they’ll be counted as active for both weeks. Good for products with medium activity levels, or looking at retention over an extended period. Looking at retention for a whole year by each day is a bit cumbersome.
Monthly retention: Measure if a user has activity during any month.
Similar to weekly retention, but for a longer period.
For products that have infrequent use cases, this makes more sense. Something like Airbnb, Zillow, or Zocdoc would likely find this more helpful. You don’t go on vacation every day or usually need a doctor multiple times a month.
Yearly retention: Not used as frequently since it’s such an imprecise view. It’s more common when looking at purchase behavior or subscription renewals vs measuring product usage/activity levels.
Don’t have an analytics stack where these charts are possible?
I’ll show you how to set up a simple one-time analysis in Excel, but ideally, these cohort analyses are built into your data viz or dashboarding tool so they’re constantly updated. Excel can help prototype different data visualizations you’d like to build, but it shouldn’t be your final home for reporting.
If you have hundreds of thousands of users, you won’t be able to do it all in Excel with any efficiency. And you should have better data analysis tools at that point :)
You may need some SQL to get the data formatted. Even better if you get the whole query written so you don’t have to do the final calculations/formatting in Excel, but sometimes a SQL + Excel combo can be faster. Depending on your database/SQL, it can also be resource-intensive and take forever to load.
How to get started:
Write a query to find the user’s first week, and then you can add “0” or “1” in each subsequent week column on whether they had any activity or not. (example above)
In a second sheet in that document, you can use this formula to sum the active users by weekly cohort and by week.
=if({First Sheet Name}!$B3=$A2,sum({First Sheet Name}!C:C,”No Data”)
Use “$” on the column to lock the variable, so you can drag the formula across and down the cells.
Once you have the counts, it’s simple to create a second table of the percentage values.
The counts are interesting, but the percentage retention values are more valuable because you can compare week to week with normalized values.
Looking to write some SQL to create your cohort analysis in your data viz tool? I’m a fan of this article from Terence Shin which walks through the steps better than I could: https://towardsdatascience.com/a-complete-guide-to-revenue-cohort-analysis-in-sql-and-python-9eeecd4c731a
Summary:
Time-based acquisition cohorts are the most commonly used and can be valuable in understanding changes in conversion or user behavior
Cohort analysis doesn’t tell you what specifically is causing problems or going well, but can tell you where to start looking
We want our retention cohorts to “flatten” over time since this means we’re keeping our users or customers
Try tracking marketing cohorts over a longer time period to better understand your users
You’ll want to measure retention with different periods (daily, weekly, monthly) depending on your product
Use Excel to build or test your initial analysis, but use SQL or a real data viz tool as you scale so your data is dynamic
What else would you want to know about cohort analysis? Let me know in the comments.