Email pre-testing: Determining required group sizes and margins of error

When testing, it’s a good idea to have some formulas to hand. For instance, in split A/B/n test scenarios, you may want to inspect the relationship between sample size, level of significance, and power. Likewise, when renting lists, no one likes to buy a pig in a poke. Instead, the campaign should be tested on a small segment first. Only if the test delivers a good return on investment will the full run be booked.

The question, however, is how many recipients to book for the test. Including too many recipients only adds cost if the list proves to be unprofitable. Renting too few subscribers, on the other hand, carries the risk that the test results are due to chance. Here’s a hands-on solution.

The RKsample.prop function

Again, I assume you have the statistical software package R (info1, info2, …) installed. If you haven’t done so yet, please follow the instructions in the posting on how to determine statistical significance of tests. (Don’t panic: the installation takes only a few clicks.) After starting the R console, paste in the following function code and hit Enter:

RKsample.prop <- function (p = NULL, n = NULL, sig.level = NULL, error = NULL, N = NULL) {
# Computes the required sample size for email marketing pre-tests, list seedings, ...
# Alternatively: calculates margin of error or confidence levels for given sample sizes
# Args:
#   p: proportion to examine (e.g. '0.1' for a 10 percent click rate or inbox placement rate).
#   n: sample size (e.g. '100' subscribers).
#   N: population (e.g. '1000' subscribers or '500' @gmail subscribers).
#   sig.level: level of significance (= 1-confidence level, e.g. '0.05' for 95% conf.).
#   error: margin of error around p (e.g. '0.01').
# Usage:
#   Leave exactly one of n, sig.level, or error NULL to calculate this value from the others
# Returns:
#   object of class 'power.htest'.
  if (sum(sapply(list(n, error, sig.level), is.null)) != 1)
    stop("\nEither sample size 'n',\nerror margin 'error',\nor significance level 'sig.level' must be set to NULL")  
  if (is.null(p) || is.null(N))
    stop("\nProportion 'p' and population size 'N' have to be specified")
  if (!is.null(sig.level)) {
    z = qnorm(1 - sig.level/2)
    if (is.null(n))
      n = ceiling(z^2 * N * p * (1 - p) / (error^2 * (N - 1) + (z^2 * p * (1 - p))))
    else if (is.null(error))
      error = round(z * (sqrt((p * (1 - p)) / n) * sqrt((N - n)/(N - 1))), 4)
  } else    
      sig.level = round((1-pnorm (error / (sqrt(p * (1 - p) / n) * sqrt((N - n)/(N - 1)))))*2, 2)
  method <- "Sample size for proportions (urn model without placing back)"
  note <- paste("At a sample size of ", n, ", be ",
                (1 - sig.level) * 100, "% confident that\np lies between ",
                max((p - error) * 100, 0), "% - ", min((p + error) * 100, 100),
                "% in the population of ", N, sep = "")
  structure(list("N" = N, "n" = n, "p" = p, "error" = error,
                 "sig.level" = sig.level, note = note, method = method), class = "power.htest")
}

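Written out, the function code above evaluates the following expressions. This is only a transcription of the code; the symbols are my own shorthand and do not appear in the original: \alpha stands for sig.level, e for error, and \Phi for the standard normal distribution function (pnorm, with qnorm as its inverse). The code additionally rounds e to four decimals and \alpha to two.

```latex
\begin{aligned}
z &= \Phi^{-1}\!\left(1 - \tfrac{\alpha}{2}\right) \\
n &= \left\lceil \frac{z^{2}\, N\, p\,(1-p)}{e^{2}\,(N-1) + z^{2}\, p\,(1-p)} \right\rceil \\
e &= z \,\sqrt{\frac{p\,(1-p)}{n}}\;\sqrt{\frac{N-n}{N-1}} \\
\alpha &= 2\left(1 - \Phi\!\left(\frac{e}{\sqrt{\frac{p\,(1-p)}{n}}\;\sqrt{\frac{N-n}{N-1}}}\right)\right)
\end{aligned}
```
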
The formulas are based on classical inferential statistics. Direct marketers have been using them for decades. After you have pasted the code into the R console, the result should look like this:

[Screenshot: R console after pasting the function code]

Now the command "RKsample.prop()" will be available to you. (Alternatively, you can add the code to your .Rprofile file and save it. R would then load the function automatically each time the console starts.)
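
If you prefer to keep the function in a file of its own, a minimal sketch of the .Rprofile approach could look like the lines below. The path "~/R/RKsample.prop.R" is purely hypothetical; it assumes you saved the function code there yourself.

```r
# Lines for ~/.Rprofile, which R reads when the console starts.
# Hypothetical location of the saved function code; adjust to your setup.
rk.file <- "~/R/RKsample.prop.R"

# Only load the file if it exists, so a missing file does not break startup.
if (file.exists(rk.file)) source(rk.file)
```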

The function’s parameters

The function RKsample.prop() offers us five parameters to play with:

  1. ‘p’ (proportion):
    This is the key performance indicator on which we will decide whether the test was a success. Take the email open rate, for instance, perhaps because you pay a performance fee that is based on it. If you expect a 20% open rate in the test, you would set p to 0.20.
  2. ‘N’ (population):
    The population is the size of the relevant list segment from which we draw the test sample. If the list owner can offer you 200,000 addresses that fulfill your target group criteria, then that’s the population. You would tell the function to use N = 200000.
  3. ‘sig.level’ (level of significance):
    You want to be 95% confident that the test result is not due to chance? In that case, set sig.level to 0.05 (= 5% = 100% minus the confidence level).
  4. ‘error’ (margin of error):
    With the margin of error, you control the width of the confidence interval in which the true value for the population is expected to fall, that is, the interval between p - error and p + error. Setting error to 0.01 (= 1%) and p to 0.20 (= 20%) would mean that the open rate in the population could lie between 19% and 21%.
  5. ‘n’ (sample size):
    The number of recipients in the test sample, e.g. n = 1500. Leave it out (or set n=NULL) if the function should calculate it.
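
To make the link between confidence level, sig.level, and the interval around p concrete, here is a small sketch using the values from the list above. The variable names conf.level, z, and interval are my own and are not part of RKsample.prop().

```r
# Confidence level of 95% expressed as a level of significance
conf.level <- 0.95
sig.level  <- 1 - conf.level

# Two-sided critical value of the standard normal distribution,
# computed the same way as inside the function: qnorm(1 - sig.level/2)
z <- qnorm(1 - sig.level / 2)

# Interval implied by p = 20% and a margin of error of 1%
p     <- 0.20
error <- 0.01
interval <- c(lower = p - error, upper = p + error)

print(round(z, 2))      # roughly 1.96
print(interval * 100)   # bounds in percent
```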

How can it help us?

  • Assume you want to be 95% sure that the main deployment to all 200,000 recipients achieves an open rate of 20% +/- 1%. That’s what the list owner predicts, and he knows his subscribers.
    What test size ‘n’ do we have to rent? Entering…

    RKsample.prop (sig.level=0.05, N=200000, p=0.20, error=0.01)

    … will result in the following output:

    [Screenshot: output of the RKsample.prop call]

    Because we left ‘n’ open (or set n=NULL), the function knows that we want to calculate the sample size from the given values. It reports that we need nearly 6,000 recipients, so the test has to go out to about 3% of the relevant list segment. (Note: 2e+05 is scientific notation for 2 * 10^5 = 200000 total recipients.)

  • What if we accepted a larger error margin of, say, 2% instead of 1%?
    RKsample.prop (sig.level=0.05, N=200000, p=0.20, error=0.02)
         Sample size for proportions 
    (urn model without placing back) 
                  N = 2e+05
                  n = 1525
                  p = 0.2
              error = 0.02
          sig.level = 0.05
     NOTE: At a sample size of 1525, be 95% confident that
    p lies between 18% - 22% in the population of 2e+05

    Now, we would only need to send the pre-test to about 1500 addresses.

  • What can we conclude with 80% confidence from a test to 1500 recipients that showed a 26.65% open rate?
    RKsample.prop (n=1500,sig.level=0.2, N=200000, p=0.2665)
         Sample size for proportions 
    (urn model without placing back) 
                  N = 2e+05
                  n = 1500
                  p = 0.2665
              error = 0.0146
          sig.level = 0.2
     NOTE: At a sample size of 1500, be 80% confident that
    p lies between 25.19% - 28.11% in the population of 2e+05
  • How confident could we be at a margin of error of 0.5% instead of 1.46%?
    RKsample.prop (n=1500, N=200000, p=0.2665, error=0.005)
         Sample size for proportions 
    (urn model without placing back) 
                  N = 2e+05
                  n = 1500
                  p = 0.2665
              error = 0.005
          sig.level = 0.66
     NOTE: At a sample size of 1500, be 34% confident that
    p lies between 26.15% - 27.15% in the population of 2e+05
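
The last two results can also be retraced by hand, without calling the function. The sketch below repeats the arithmetic from the function body for the test to 1500 recipients; the names se, error.80, and sig.half are my own shorthand.

```r
# Values from the two examples above
N <- 200000   # population
n <- 1500     # test sample
p <- 0.2665   # observed open rate

# Standard error of the proportion, including the correction for
# sampling without replacement from a finite population
se <- sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))

# Margin of error at 80% confidence (sig.level = 0.2)
error.80 <- qnorm(1 - 0.2 / 2) * se

# Level of significance when the margin of error is fixed at 0.5%
sig.half <- 2 * (1 - pnorm(0.005 / se))

print(round(error.80, 4))   # compare with 'error' in the first output
print(round(sig.half, 2))   # compare with 'sig.level' in the second output
```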

Now it’s your turn. Experiment with the parameters to see how each one affects the required sample size, margin of error, and level of statistical significance.
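
As a starting point for such experiments, here is a minimal sketch that tabulates the required sample size for several margins of error at once. The helper required.n() is hypothetical; it simply repeats the sample size formula from the function body so that the snippet runs on its own.

```r
# Hypothetical helper: the sample size formula from RKsample.prop()
required.n <- function(p, N, error, sig.level = 0.05) {
  z <- qnorm(1 - sig.level / 2)
  ceiling(z^2 * N * p * (1 - p) / (error^2 * (N - 1) + z^2 * p * (1 - p)))
}

# Margins of error to compare, for p = 20% and a list segment of 200000
errors <- c(0.005, 0.01, 0.02, 0.05)
sizes  <- sapply(errors, function(e) required.n(p = 0.20, N = 200000, error = e))

# One row per margin of error; smaller margins demand larger tests
print(data.frame(error = errors, n = sizes))
```

The same pattern works for the other parameters: vary p, N, or sig.level in the call to sapply() and keep the rest fixed.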
