Tracking, or not…

Sort of like an Eskimo tracker
“Tracking” studies have been a staple of marketing research departments for decades, yet they still mean different things to different researchers and companies. Conventional thinking holds that tracking studies are surveys conducted on an ongoing or wave basis, intended to gather marketing intelligence on brands, advertising, messaging, users, or usage. It is the repeated nature of the work (the same questions asked over time) that distinguishes tracking from other types of research. Tracking can certainly include diagnostic measures, but such measures tend not to be extensive.

The primary emphasis of tracking is, or should be, on being an efficient barometer of brand performance. The focus is on key performance indicators (KPIs), which typically include unaided and aided brand and advertising awareness, brand purchasing, usage occasions, performance measures (attributes), and basic classification questions. If we were to extend much further beyond these basic measures, we would move into the realm of strategic research, which tracking most certainly is not, although tracking can point to areas that need to be explored in greater depth. Knowing where that line is crossed is a judgment call, hence each client must make his or her own decision regarding the “sweet spot” of breadth and depth.

With a steady stream of data from multiple sources, from customer service to social media, tracking research has fallen out of favor somewhat. While many marketing research departments continue to conduct large-scale trackers, their utility and actionability are increasingly called into question given the multitude of other seemingly substitutable sources. Some clients appear quite comfortable relying solely on sales data and the output of search engines that scrape social sites and web commentary.

If tracking hasn’t proven to be useful, you’re doing it wrong
No one will argue that (syndicated) sales data or site transaction data is anything but critical, but these sources are not very useful forward-looking predictors. Social listening, while improving, produces data that is event-specific and volatile; relying on such spiky data forces marketing into knee-jerk reactions and diverts attention and resources away from more serious strategic management decisions. Hence the question: what types of feedback should be institutionalized in the decision-making process? This leaves a fairly wide-open playing field for more thoughtfully designed and executed feedback tools, which tracking research most certainly can be. The recent (June 2017) decision by P&G to cut $125MM+ in digital ad spending was most certainly informed by smart tracking. Tracking research can and does impact significant business decisions.

The most tired, predictable criticisms of tracking research are that it is expensive, slow and, as a result, not particularly actionable. Let’s knock these straw men down one at a time.

If you are conducting customer satisfaction research, sample acquisition costs are near zero (more on that later). Of course, costs are incurred if your tracking program requires external sample, and they obviously increase if you work in narrowly defined categories or limited geographies, or where incidence is low and screening costs are high. However, in most widely penetrated categories incidence is not a huge factor, and the added benefit of sample quality is a strong counterargument against erratic sources with little quality control. Reliable data is rarely free, so perhaps the better case is a value story rather than a rebuttal on cost alone. Spending $100K on a smart tracking program to assess the impact of a $20MM ad budget seems compelling to me.

Management has to listen
The slowness critique is also a red herring: most tracking programs conducted online run continuously, and the data can be aggregated in any number of increments to reflect category dynamism or rate of change. If you are dealing with faster-moving categories, or need more extensive subgroup analysis, more frequent reporting is implied. Aggregations at the weekly, monthly, or quarterly level are typical; some experimentation is needed to choose the right reporting frequency for your business.

Last, the actionability of tracking research is always a function of both design and execution. With (a) the correct sample frame, (b) questions clearly linked to business processes, and (c) smart reporting (including exception reports and alerts), tracking becomes your primary radar screen and course-correction tool for the business. As you likely know, customer satisfaction research gives us the added advantage of a key field, such as a customer or company ID, which lets us link survey results to transaction history. Once connected, we can unpack the relationship between perceptions and behavior at the respondent level, develop predictive models, and inform and monitor specific marketing actions. We’ve had much success in this area.
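To make that linkage concrete, here is a minimal sketch in Python (pandas plus statsmodels); the file names and column names are hypothetical placeholders, not a reference to any particular client system.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical extracts: one row per survey respondent, and a summary of
# each customer's subsequent transactions (all names are illustrative).
survey = pd.read_csv("tracking_wave.csv")      # customer_id, satisfaction, recommend_0_10, ...
behavior = pd.read_csv("transactions.csv")     # customer_id, orders_next_90d

# The key field (customer_id) links perceptions to behavior at the
# respondent level.
linked = survey.merge(behavior, on="customer_id", how="inner")

# A deliberately simple predictive model: do stated perceptions help
# explain follow-on purchasing?
X = sm.add_constant(linked[["satisfaction", "recommend_0_10"]])
model = sm.OLS(linked["orders_next_90d"], X).fit()
print(model.summary())
```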

To many of you, especially more senior researchers, most of this is not particularly newsworthy. But it does raise the question of why tracking is not more widely used, mined more fully, or better embedded within organizations. There are probably many reasons, although the most obvious include:

  • Not re-examining your tracking program on a frequent enough basis (e.g., quarterly) to make sure that category, brand, attribute, or diagnostic measures are up to date. Failing to keep tracking programs current lets them lapse into obscurity.
  • Not seeking input from multiple people or departments to identify the consensus areas that matter most to workflow, and to flag when changes are needed.
  • Not framing tracking research internally as a continuous improvement tool (Deming), rather than just another form of sales data, which it is not.
  • Not developing control bands on key measures to help identify data points that fall outside normative ranges and require corrective action; such bands spare marketing the uncertainty over what is, or is not, meaningful (see the sketch after this list).
  • Missed opportunities to push reporting deeper into the organization using dashboards, visualizations, or exceptions/alerts to specific individuals or teams.
  • Not adding value by including modules on hot topics that can be swapped in or out depending on current market conditions (while keeping survey length brief).
  • Failing to connect the sample frame or key questions to specific business decisions.
  • Failing to leverage the linkage between survey responses and transactional data.
  • Taking tracking research too far. Most buying decisions are emotional, hence tracking research can only get us part of the way there. Other forms of research (qual only, hybrid quant-qual, strategic) are still needed to continue learning about consumer behavior.
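On the control-band point above, a trailing mean plus or minus two standard deviations is often enough to separate noise from signal. The sketch below shows one way to do this in Python with pandas; the 12-period window, the two-sigma rule, and the awareness figures are illustrative assumptions, not a prescription.

```python
import pandas as pd

def control_band_flags(kpi: pd.Series, window: int = 12, k: float = 2.0) -> pd.DataFrame:
    """Flag tracking readings that fall outside +/- k standard deviations
    of a trailing window (a simple control-band rule, not a full SPC scheme)."""
    center = kpi.rolling(window, min_periods=window).mean().shift(1)
    spread = kpi.rolling(window, min_periods=window).std().shift(1)
    bands = pd.DataFrame({"kpi": kpi,
                          "lower": center - k * spread,
                          "upper": center + k * spread})
    bands["alert"] = (bands["kpi"] < bands["lower"]) | (bands["kpi"] > bands["upper"])
    return bands

# Example: monthly unaided awareness (%) from a continuous tracker; the
# final reading (28) falls below the band and would trigger an alert.
awareness = pd.Series([34, 35, 33, 36, 34, 35, 33, 34, 36, 35, 34, 33, 28],
                      index=pd.period_range("2016-06", periods=13, freq="M"))
print(control_band_flags(awareness).tail(3))
```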

Let’s improve the measures
One last (loosely related) thought… given all of the customer satisfaction work being done today, one metric we need to develop more fully is a prospect-to-conversion score. Tracking is a perfect vehicle for this, yet customer satisfaction studies rarely look outside the world of the brand itself. Much like NPS, a BPS (brand potential score) may prove to be a far more useful metric than knowing how satisfied current customers are. Over time, all customers defect or die; they must be replaced with new customers predisposed to the brand. Understanding this potential will be much more useful to the long-term health of the business. Tracking can help get you there.

Twists & turns of incidence

An old trap
From time to time I find myself getting into trouble when designing research and thinking about incidence, sometimes referred to as prevalence or rate of occurrence. The trouble relates to the net percent of respondents who will ultimately qualify for a study based on the screening criteria set by my client. This is true for both off-line and online studies (though the cost implications for off-line studies are often greater). For simple projects, in which I have general sample specifications, the process is straightforward and uncomplicated. Sample providers generally understand a request for a “general population sample”, typically resulting in a proportionately representative sample based on the four census regions, ages 21 to 64, and balanced proportionately on race, ethnicity, income, and education. The problem I experience, however, typically results from special groups or “boosts” that can confound an overall incidence level – and significantly affect costs.

An example
Let’s say I have asked my sample provider for a “gen pop” sample, and they quote me $5 per complete assuming 90% incidence. That means that for every 1,000 people I screen, all things being equal, 900 will qualify. But say my client has three small brands A, B, and C, each at a lower incidence level, and they would like a readable sample of 200 per brand. And let’s say these three brands have past-year purchase incidence rates of 15%, 10%, and 5%, respectively. Out of a sample of 1,000, I would end up with 150, 100, and 50 respondents, respectively. It will cost me significantly more for these respondents than for my “gen pop” sample. At 15% incidence, my cost per completed interview (CPI) might be $20; at 10% it might be $30; and at 5% it might be $50. Note that I am using dramatic changes in the CPI for example purposes.
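As a quick sanity check, that arithmetic can be scripted. The short Python sketch below simply restates the illustrative figures above (the incidence rates and CPIs are hypothetical quotes, not real ones).

```python
# Illustrative figures only: expected completes from 1,000 gen pop screens.
screens = 1_000
incidence = {"A": 0.15, "B": 0.10, "C": 0.05}   # past-year purchase incidence
cpi = {"A": 20, "B": 30, "C": 50}               # hypothetical cost per complete

for brand, rate in incidence.items():
    completes = screens * rate
    print(f"Brand {brand}: ~{completes:.0f} completes per 1,000 screens, "
          f"boost CPI ${cpi[brand]} at {rate:.0%} incidence")
```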
So, what should I do in terms of giving sample specs to my providers? I don’t want to receive costs that are too low and base my proposal on them, only to win the project and then face much higher out-of-pocket expenses than foreseen. By the same token, I don’t want to overestimate my sample costs and risk losing the project because I have miscalculated and priced myself out of contention.

We also have the added complexity of brand interaction, meaning that some respondents who use the 15% brand (A) may also use the 5% brand (C), and so on. So here are the probable outcomes under this scenario:

  • General population (“gen pop”) sample of 1,000 respondents
  • 150 A’s who will fall out of the gen pop sample
  • 100 B’s who will fall out of the gen pop sample
  • 50 C’s who will fall out of the gen pop sample

Assuming that brand usage is independent across the three brands, we also need to consider interactions between the brand quota groups, for example:

  • 15 B’s will fall from the A sample (10% x 150)
  • 7-8 C’s will fall from the A sample (5% x 150)
  • 5 C’s will fall from the B sample (5% x 100)

Cost implications of a miscalculation
So, out of our total quota of 600 (3 groups x 200 per group), approximately 328 will fall from the rep sample, comprised of 150 A’s, 115 B’s, and 63 C’s. This means that we will be short approximately 272 respondents at a mix of incidences (50 A’s, 85 B’s, and 137 C’s). To fill these remaining quotas, and assuming they were filled independently, we might think that we need (333 + 850 + 2,740) = 3,923 screens to find our 272, or roughly 7% incidence. But in reality we need only the 2,740 screens for brand C: at 15% and 10% incidence, those same screens will also yield roughly 411 A’s and 274 B’s, more than covering the other shortfalls. Hence the effective incidence is 272/2,740, or almost exactly 10%. That is essentially the $30 CPI figure, for a total cost of $8,160. Had we costed each group independently, costs would have been (50 x $20) + (85 x $30) + (137 x $50) = $10,400, about 27% higher than the overlap method – pretty meaningful, especially when compounded over many studies during the year.
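Here is the same comparison scripted as a short Python sketch; the shortfalls, incidence rates, and CPIs are the illustrative numbers from this example, and brand usage is assumed to be independent.

```python
# Illustrative shortfalls, incidence rates, and CPIs from the example above;
# brand usage is assumed to be independent across brands.
shortfall = {"A": 50, "B": 85, "C": 137}      # completes still needed per brand
incidence = {"A": 0.15, "B": 0.10, "C": 0.05}
cpi = {"A": 20, "B": 30, "C": 50}             # quoted cost per complete

# Costing each boost independently ignores the overlap entirely.
independent_cost = sum(shortfall[b] * cpi[b] for b in shortfall)

# Screens needed per brand if each quota were filled on its own.
screens_needed = {b: shortfall[b] / incidence[b] for b in shortfall}

# The lowest-incidence brand (C) drives the real screening requirement:
# its ~2,740 screens also yield ~411 A's and ~274 B's along the way.
total_screens = max(screens_needed.values())
blended_incidence = sum(shortfall.values()) / total_screens   # ~10%

# Price the ~272 extra completes at the CPI that matches ~10% incidence.
overlap_cost = sum(shortfall.values()) * cpi["B"]

print(f"Screens if costed independently: {sum(screens_needed.values()):,.0f}")
print(f"Screens actually required:       {total_screens:,.0f}")
print(f"Blended incidence:               {blended_incidence:.1%}")
print(f"Cost: independent ${independent_cost:,.0f} vs. overlap-aware ${overlap_cost:,.0f}")
```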

So, when mapping out your sampling plan, pay attention to the subtleties of incidence. The cost implications can be significant, as can the odds of your proposal remaining in the pool of competitive bids, and of you staying in the good graces of your clients.

Rethinking NPS

We’ve seen this before
A recent review of tracking data with one of my e-commerce clients involved a discussion around NPS, or the “Net Promoter Score”, a measure that has been widely adopted in survey and customer satisfaction research. In research and marketing circles, familiarity with NPS is high, as is the 0-10 (thus 11-point) scale from which it is derived. The scale defines three groups: Promoters, Passives, and Detractors. The proportion who are Promoters minus the proportion who are Detractors (multiplied by 100) is the NPS. I won’t go into the mechanics of how the groups are defined because that, too, is widely known. The issue for my client was that their NPS was low when compared to other e-commerce sites and businesses they considered to be competitors. The question was: Why?
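For readers who want the mechanics anyway, here is a minimal Python sketch using the standard published cutoffs (Promoters 9-10, Passives 7-8, Detractors 0-6); the sample ratings are made up for illustration.

```python
from typing import Iterable

def net_promoter_score(ratings: Iterable[int]) -> float:
    """NPS from 0-10 likelihood-to-recommend ratings, using the standard
    cutoffs: Promoters 9-10, Passives 7-8, Detractors 0-6."""
    ratings = list(ratings)
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)

# Ten illustrative ratings: 5 Promoters, 3 Passives, 2 Detractors -> NPS of 30.
print(net_promoter_score([10, 9, 9, 9, 10, 8, 7, 8, 6, 5]))   # 30.0
```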

My client runs a dealer network, in which dealers sell their wares through the client’s e-commerce platform. You will recall that NPS is based on the concept of a recommendation to friends or colleagues. In a dealer network, however, dealers are effectively competing with one another. As a result, a dealer’s willingness to recommend my client’s products amounts to helping a competitor, which is not exactly what the dealer has in mind! We might find this same dynamic in a number of business-to-business or otherwise competitive marketplaces in which a recommendation is, effectively, self-defeating. This makes NPS wholly inappropriate for businesses in which a “recommendation” is unnatural. I can recall personal situations in which people have asked me for contractor recommendations; I’ll give them names, but I hope they won’t steal my network!

This seemingly small hiccup in the conceptual approach to NPS raised a larger issue. Many senior marketing executives believe that they need but one consumer measure (gathered in customer research) to determine customer health, and in doing so can overlook other measures of satisfaction or happiness that have also proven to be valuable indicators. Specifically, overall satisfaction, various measures of quality, perceived value, appropriateness for use, and so on all have their place and, collectively, provide a multi-dimensional view of customer health. There are also other self-reported behavioral measures (frequency of purchase, sizes, flavors, and so on) as well as attitudinal measures that can aid in interpreting the broader consumer mindset.

Never supported by data
In his wonderful book “How Brands Grow”, Byron Sharp thoroughly debunks the notion of NPS, as well as the “thought experiment” that assumed zero cost for customer retention. You may recall that NPS was presumed to measure the level of retention a company could expect to achieve, yet there was little empirical evidence to support the relationship between high NPS and high customer retention. Sharp goes on to demonstrate that (a) very high levels of customer retention are impossible to achieve, even in businesses presumed to have high levels of loyalty, due to natural switching behavior, and (b) even high levels of retention cannot generate a significant positive impact on revenue because the number of truly loyal customers is so small.

Like a bad penny
Yet NPS persists. Why? Initially, adoption was driven by its Harvard Business Review imprimatur, but more importantly it is now simple inertia: NPS has already become institutionalized at many companies. For those who do not have time to dig below the surface, NPS serves a purpose in that it is simple. And published benchmarks are available, which lends some comfort to marketing management; there is always some professional pressure to rely on what is au courant, and the SVP of marketing will look snappy in the C-suite.

I suppose that in organizations with limited resources, which can only afford to ask one summary question, NPS will do. But an overall happiness, satisfaction, or liking measure is always superior. In my statistical work linking evaluative measures such as intent to purchase, satisfaction, and willingness to recommend to hard dependent measures of consumption (i.e., transactions or sales), I have consistently found that satisfaction correlates more strongly with those hard measures than willingness to recommend (NPS) does. I suspect this is because “willingness to recommend” is one step further removed from actual product experience or satisfaction. For example, think of a deodorant or toothpaste. You may be very happy with your deodorant, in which case you will report high satisfaction, but will you go out of your way to recommend it to someone else? That doesn’t pass the smell test (sorry, too tempting a pun).
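If you want to run this comparison on your own linked data, it is a short exercise once survey and transaction records share a key. The file and column names below are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical linked file: per-respondent survey scores already joined to a
# hard behavioral measure (e.g., transactions in the following quarter).
linked = pd.read_csv("linked_survey_transactions.csv")
# assumed columns: satisfaction, recommend_0_10, transactions_next_qtr

# Correlate each evaluative measure with the behavioral outcome.
corrs = linked[["satisfaction", "recommend_0_10"]].corrwith(linked["transactions_next_qtr"])
print(corrs.sort_values(ascending=False))
```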

If you are considering a customer satisfaction or tracking program, and are exploring what measures to use, experiment a bit. Choose from other measures that are more directly linked to customer happiness and brand health, and build a normative database that is best for your business around them. I highly recommend it (oops)!

Surveys & Forecasts, LLC