Copy Testing

Copy testing (or advertising research) is a general class of research tests that evaluate and diagnose the communication power of an advertisement, whether broadcast (television, radio), print (magazine), or streamed on the internet.

When Used
Copy tests are an integral part of the creative development and assessment process, and (of necessity) always follow the development of one or more basic advertising directions. These alternatives attempt to embody an advertising strategy that has been identified through previous phases of research (e.g., strategic research or concept testing). Copy tests are usually conducted (1) after a strategic or positioning study has indicated an opportunity for the brand that, in turn, fed copy development; (2) after qualitative research has been used in the creative development process; or (3) after tracking research has indicated that the current campaign is no longer building awareness or image. Practically speaking, copy tests can be conducted whenever there is new advertising that needs to be evaluated.

Stimuli
The single biggest category of copy testing remains TV (though this is changing rapidly), where the stimuli are usually 30- or 60-second spots (this, too, is changing rapidly: as of this writing, new six-second formats are being introduced). In diagnostic tests, the state-of-finish can be “animatic” (hand-drawn, still animation with voiceover), “photomatic” (with still, stock photos), “steal-o-matic” (live sound and motion, but stock footage), or “rough cuts” (agency footage before editing or cleanup). For final on-air testing, only finished (ready-to-air) executions are used. Similar but fewer gradations are used in radio or print.

Copy Test Design
There are two basic copy testing approaches for TV: off-air and on-air. Off-air tests focus on whether the copy effectively communicates its intended strategy, and provide more diagnostic information on specific copy elements than on-air tests. Off-air approaches are “forced exposure” tests (usually in a mall or theater environment), in which respondents view a clutter reel of competitive ads, with the test ad in the middle. Because a lower state-of-finish is acceptable, off-air stimuli are less costly, and these tests are more often used at an earlier stage of the copy development process.

On-air tests are executed on an unused cable TV channel among people who have been recruited to view a fictitious ½-hour pilot TV show. Respondents see ads for other categories but see only one test ad. On-air tests excel at evaluating copy performance in a real-world setting, including whether the advertisement “broke through” (i.e., was recalled).

Similar off-air approaches are used for radio testing (no “on-air” versions exist). Print testing usually involves placing the test ad in a mocked-up version of a national magazine, and can also involve eye-tracking to determine which elements were seen while reading the ad.

Several companies have specialized systems for copy testing. The advantage of using specialized companies is their normative databases, spanning many years of tests in many categories. Measures typically include:

Recall of ad
Main point communication
Proven recall (correct playback of copy elements)
Total copy and situational/visual playback
Purchase intent, or a pre-post persuasion score
Brand likes, dislikes
Imagery/personality ratings
Attribute/brand performance ratings
Classification and demographics
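
As an illustration of the pre-post persuasion score listed above, the following is a minimal sketch in Python. It assumes the score is defined as the percentage-point change in respondents choosing the test brand after exposure versus before; suppliers define and weight this measure differently, so the function name, the data layout, and the definition itself are assumptions for illustration rather than any company's actual method.

```python
def persuasion_score(pre_choices, post_choices, brand):
    """Percentage-point shift in brand choice from pre- to post-exposure.

    Assumed definition (hypothetical): share choosing `brand` after
    seeing the ad minus share choosing it before, times 100.
    """
    if not pre_choices or not post_choices:
        raise ValueError("both choice lists must be non-empty")
    pre_share = sum(choice == brand for choice in pre_choices) / len(pre_choices)
    post_share = sum(choice == brand for choice in post_choices) / len(post_choices)
    return 100 * (post_share - pre_share)


# Made-up responses from four respondents, before and after exposure.
pre = ["A", "B", "B", "C"]
post = ["A", "A", "B", "C"]
print(persuasion_score(pre, post, "A"))  # 25.0
```

In practice a score like this would be read against the supplier's category norm rather than in isolation.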

Sample Frame
Copy tests generally use targeted samples of 150-200 respondents per cell, with boosts to read diagnostic subgroups of interest as needed. Study costs depend on the number of executions and screening requirements. As with concept screening, other factors to keep in mind are consistency in format, screening, geography, and question sequence.
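
To give a rough sense of the precision that cells of this size offer, the sketch below computes the margin of error for a percentage measure (such as recall) using the standard normal approximation for a proportion. It assumes simple random sampling and a 95% confidence level (z = 1.96); the function name is hypothetical, and real copy-test samples are targeted rather than random, so the figures are indicative only.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p observed in a sample of n respondents."""
    if n <= 0 or not 0 <= p <= 1:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return z * math.sqrt(p * (1 - p) / n)


# Worst case (p = 0.5) for the cell sizes mentioned above.
for n in (150, 200):
    print(n, round(100 * margin_of_error(0.5, n), 1))
# 150 8.0
# 200 6.9
```

Under these assumptions, a score from a single cell carries an uncertainty of roughly 7 to 8 percentage points, and the subgroups within a cell are less precise still.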

Pros & Cons
Pros: Critical consumer feedback on whether copy is “in sync” with intended overall marketing strategy; availability of normative database; expertise of copy testing companies.

Cons: Use of norms requires a more rigid survey instrument; norms can overshadow other factors (e.g., a below-norm ad that may nonetheless increase use-up); in some categories, rough testing may not capture tonality or mood; a single test cannot assess the impact of an entire campaign, and some campaigns will build brand equity more slowly.

Timing
Cycle time (excluding stimuli preparation) from field start to an initial presentation is typically a few weeks, but this will vary based on screening requirements.

Subsequent Steps
Copy tests are typically followed by tracking studies that monitor in-market performance of the brand on both product performance and brand image, and/or by qualitative research that assesses problems in the marketplace after a campaign has launched (e.g., among trier-rejecters).