An old trap
From time to time I get into trouble when designing research and thinking about incidence, sometimes referred to as prevalence or rate of occurrence. The trouble relates to the net percentage of respondents who will ultimately qualify for a study based on the screening criteria set by my client. This is true for both offline and online studies (though the cost implications for offline studies are often greater). For simple projects with general sample specifications, the process is straightforward. Sample providers generally understand a request for a “general population sample”, which typically yields a proportionately representative sample across the four census regions, ages 21 to 64, balanced on race, ethnicity, income, and education. The problem I run into, however, is typically the result of special groups or “boosts” that can confound the overall incidence level and significantly affect costs.
Let’s say I have asked my sample provider for a “gen pop” sample, and they quote me $5 per complete assuming 90% incidence. That means that for every 1,000 people I screen, all things being equal, 900 will qualify. But say that my client has three small brands, A, B, and C, each at a lower incidence level, and they would like a readable sample of 200 per brand. And let’s say that these three brands have past-year purchase incidence rates of 15%, 10%, and 5%, respectively. Assuming a sample of 1,000, I would end up with 150, 100, and 50 respondents, respectively. These respondents will cost me significantly more than my “gen pop” sample. At 15%, my cost per completed interview (CPI) might be $20, at 10% it might be $30, and at 5% it might be $50. Note that I am using dramatic changes in the CPI for example purposes.
So, what should I do in terms of giving sample specs to my providers? I don’t want to receive costs that are too low, and base my proposal on those costs, only to get the project and then have much higher out-of-pocket expenses than foreseen. By the same token, I don’t want to overestimate my sample costs, and risk losing the project because I have not calculated correctly and am noncompetitive.
We also have the added complexity of brand interaction, meaning that some of the respondents who use brand A (15% incidence) may also use brand C (5% incidence). So here are the probable outcomes based on this scenario:
- General population (“gen pop”) sample of 1,000 respondents
- 150 A’s who will fall out of the gen pop sample
- 100 B’s who will fall out of the gen pop sample
- 50 C’s who will fall out of the gen pop sample
Assuming that brand usage is independent across brands (so each brand’s incidence also holds among users of the other brands), we also need to consider interactions between the brand quota groups, for example:
- 15 B’s will fall from the A sample (10% x 150)
- 7-8 C’s will fall from the A sample (5% x 150)
- 5 C’s will fall from the B sample (5% x 100)
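The tally above can be sketched in a few lines of Python. This is only an illustration of the counting used in this example, not a general sampling model: the names (`INCIDENCE_PCT`, `expected`, `fallout`) are made up for the sketch, fractional respondents are rounded half up (so 7.5 C’s becomes 8), and the independence assumption stated above is baked in.

```python
GEN_POP = 1000   # screened general-population respondents
QUOTA = 200      # readable sample wanted per brand
# Past-year purchase incidence in percent, ordered from highest to lowest.
INCIDENCE_PCT = {"A": 15, "B": 10, "C": 5}


def expected(n, pct):
    """Expected qualifiers among n respondents at pct percent, rounded half up."""
    return (n * pct + 50) // 100


# Respondents per brand who fall straight out of the gen pop sample.
direct = {brand: expected(GEN_POP, pct) for brand, pct in INCIDENCE_PCT.items()}

# Add the interactions as counted in the text: users of a lower-incidence
# brand who are found among the users of each higher-incidence brand.
fallout = dict(direct)
brands = list(INCIDENCE_PCT)
for i, higher in enumerate(brands):
    for lower in brands[i + 1:]:
        fallout[lower] += expected(direct[higher], INCIDENCE_PCT[lower])

# What is still missing from each 200-per-brand quota.
shortfall = {brand: QUOTA - fallout[brand] for brand in brands}

print(fallout)                  # {'A': 150, 'B': 115, 'C': 63}
print(sum(fallout.values()))    # 328
print(shortfall)                # {'A': 50, 'B': 85, 'C': 137}
```

Changing `GEN_POP`, `QUOTA`, or the incidence figures shows how quickly the shortfall shifts toward the lowest-incidence brand.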
Cost implications of a miscalculation
So, out of our total quota of 600 (3 groups x 200 per group), approximately 328 will fall from the rep sample: 150 A’s, 115 B’s, and 63 C’s. This means that we will be short approximately 272 respondents at a mix of incidences (50 A’s, 85 B’s, and 137 C’s). To fill these remaining quotas, and assuming they were filled independently, we might think that we need to screen (333 + 850 + 2,740) = 3,923 people to find our 272, or 6.9% incidence. But, in reality, we need only the 2,740, because the missing A’s and B’s will fall out while we screen for the C’s; hence 272/2,740, or almost exactly 10%. That corresponds roughly to the $30 CPI figure, for a total cost of $8,160. Had we calculated independently, costs would have been (50 x $20) + (85 x $30) + (137 x $50) = $10,400, about 27% higher than the overlap method. That is pretty meaningful, especially when added up over many studies during the year.
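For anyone who wants to rerun the comparison with their own numbers, here is a minimal sketch of the two costing approaches. The variable names are illustrative, the CPI figures are the deliberately dramatic ones from this example, and pricing the overlap method at the 10% CPI is the same simplification made above. Computed directly from the shortfalls and incidences, the B quota on its own needs 85 / 0.10 = 850 screens.

```python
# Remaining quota per brand after the gen pop sample has been screened.
SHORTFALL = {"A": 50, "B": 85, "C": 137}
INCIDENCE_PCT = {"A": 15, "B": 10, "C": 5}   # incidence in percent
CPI = {"A": 20, "B": 30, "C": 50}            # example dollars per complete

needed = sum(SHORTFALL.values())

# Screens required to fill each shortfall if it were fielded on its own.
screens = {b: round(SHORTFALL[b] * 100 / INCIDENCE_PCT[b]) for b in SHORTFALL}

# Independent method: every quota is screened and priced separately.
independent_screens = sum(screens.values())
independent_cost = sum(SHORTFALL[b] * CPI[b] for b in SHORTFALL)

# Overlap method: the hardest quota sets the screening volume, and the
# easier quotas fill along the way from the same screened respondents.
overlap_screens = max(screens.values())
blended_incidence = needed / overlap_screens
overlap_cost = needed * CPI["B"]   # about 10% incidence, so the $30 CPI

premium = independent_cost / overlap_cost - 1

print(screens)                                 # {'A': 333, 'B': 850, 'C': 2740}
print(independent_screens, overlap_screens)    # 3923 2740
print(f"{blended_incidence:.1%}")              # 9.9%
print(independent_cost, overlap_cost)          # 10400 8160
print(f"{premium:.0%}")                        # 27%
```

In a real bid the overlap CPI would come from the provider’s quote for the blended incidence rather than from a lookup like `CPI["B"]`.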
So, when mapping out your sampling plan, pay attention to the subtleties of incidence. The cost implications can be significant, and they affect both the odds of your proposal remaining in the pool of competitive bids and your standing in the good graces of your clients.