Hashing email lists for AdWords Customer Match

Google AdWords Customer Match lets you target email lists for remarketing purposes. Marketers upload their subscriber lists, and Google matches the addresses against signed-in users on Google Search, YouTube, or Gmail to show them targeted ads. This makes it easy to, for example, reach inactive email subscribers on other channels.

To create such a list, you have to meet certain requirements. For privacy reasons, it’s always a good idea to hash the addresses as a means of pseudonymization. Below you’ll find a step-by-step guide to doing that with RapidMiner, an open-source predictive analytics platform, which may be an alternative to using Excel.

(I also put the process in my Dropbox, so that you can reproduce it by using “import process” in RapidMiner’s file menu.)

  • First, download and install RapidMiner Studio, run it and start building a new process (CTRL-N).
  • Select the “Read CSV” operator (or another operator that corresponds to your file format) in the Operators panel on the left, and drag & drop it onto your main process worksheet:

[Screenshot: the “Read CSV” operator]

  • Make sure the “Read CSV” operator is selected within your main process (left click). Then use the “Import Configuration Wizard” in the Parameters panel on the right to tell RapidMiner the format of your data file:

[Screenshot: the Import Configuration Wizard]

  • Now, drag and drop the “Generate Empty Attribute” operator onto your worksheet, connect its input port to the “Read CSV” operator’s output port, and configure it to add a new column named “sha256” of type “polynominal”. You may need to switch to “Expert View” (press F4) in order to select the type:

[Screenshot: the “Generate Empty Attribute” operator]

  • Add the “Select Attributes” operator, connect it to the “exa” output port of the “Generate Empty Attribute” operator, and configure it to return only a single attribute, namely “sha256”:

[Screenshot: the “Select Attributes” operator]

  • Add the “Write CSV” operator, connect it with “Select Attributes” and with the “res” port of the process on the right, and set up its parameters. You have to deselect the options “write attribute names” and “quote nominal values”. I chose “hashes.csv” as a filename:

[Screenshot: the “Write CSV” operator]

  • Look for the “Execute Script” operator and drop it on (!) the connecting wire between “Generate Empty Attribute” and “Select Attributes”. Edit its script and paste in the following code snippet, which fills our newly generated “sha256” column with the SHA-256 hashes of the normalized values in a column named “Email”. (If the column that contains the addresses has a different name, rename it to “Email” beforehand using the “Rename” operator.)

[Screenshots: the “Execute Script” operator and its script editor]

This is the relevant code snippet in plain text:

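import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

ExampleSet exampleSet = operator.getInput(ExampleSet.class);
MessageDigest m = MessageDigest.getInstance("SHA-256");

for (Example example : exampleSet) {
  // Normalize: strip surrounding whitespace and lowercase the address.
  String normEmail = example["Email"].trim().toLowerCase();
  // Hash the UTF-8 bytes explicitly instead of relying on the platform charset.
  // digest() also resets the MessageDigest, so it can be reused for the next row.
  byte[] digest = m.digest(normEmail.getBytes(StandardCharsets.UTF_8));
  // Convert the digest to a hexadecimal string.
  String hashtext = new BigInteger(1, digest).toString(16);
  // Zero-pad it to the full 64 hex characters of a SHA-256 hash.
  while (hashtext.length() < 64) {
    hashtext = "0" + hashtext;
  }
  example["sha256"] = hashtext;
}

return exampleSet;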
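
If you want to cross-check the script's output outside RapidMiner, the same normalization and hashing can be sketched in a few lines of Python. This is only a minimal sketch and not part of the RapidMiner process: the function name `normalize_and_hash` is hypothetical, and it assumes the addresses are encoded as UTF-8 before hashing.

```python
import hashlib


def normalize_and_hash(email: str) -> str:
    """Trim, lowercase, and SHA-256 hash an address (UTF-8 encoding assumed)."""
    normalized = email.strip().lower()
    # hexdigest() always returns 64 lowercase hex characters,
    # so no manual zero padding is needed here.
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Differently cased and padded spellings should yield the same hash.
    print(normalize_and_hash("  examPLe@GMAIL.com "))
    print(normalize_and_hash("example@gmail.com"))
```

Both lines printed by the demo should be identical, and they should match the value RapidMiner writes for the same address.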

 

  • Now, run the process by pressing F11. If you set it up correctly, it will lead you to the Results Perspective (F9):

[Screenshots: running the process and the Results Perspective]

  • Et voilà, the hash for examPLe@GMAIL.com is 264e53d93759bde067fd01ef2698f98d1253c730d12f021116f02eebcfa9ace6, just like in the Google help. You are ready to upload the hashes in “hashes.csv” to AdWords:

[Screenshots: uploading the list to AdWords and the success message]
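
For comparison, the whole process above (read a CSV, hash the “Email” column, write one hash per line without a header row or quotes) can be sketched as a standalone Python script, together with a quick format check before uploading. Treat it as a rough sketch under assumptions: the helper names `hash_email_column` and `looks_like_sha256` and the file name `subscribers.csv` are hypothetical, UTF-8 encoding is assumed, and skipping rows with an empty address is my own addition rather than something the RapidMiner process does.

```python
import csv
import hashlib
import io
import re

HEX64 = re.compile(r"[0-9a-f]{64}")


def hash_email_column(src, dst, column="Email"):
    """Read CSV rows from src and write one SHA-256 hash per line to dst.

    Writes no header row and no quotes, mirroring the "Write CSV" settings
    described above. Returns the number of hashes written.
    """
    count = 0
    for row in csv.DictReader(src):
        normalized = (row.get(column) or "").strip().lower()
        if not normalized:
            continue  # assumption: rows without an address are skipped
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        dst.write(digest + "\n")
        count += 1
    return count


def looks_like_sha256(line: str) -> bool:
    """Check that a line is exactly 64 lowercase hex characters."""
    return HEX64.fullmatch(line) is not None


if __name__ == "__main__":
    # In-memory demo. For real files, pass open("subscribers.csv", newline="")
    # and open("hashes.csv", "w") instead (the input file name is a placeholder).
    source = io.StringIO("Email,Name\n examPLe@GMAIL.com ,Alice\n,Bob\n")
    target = io.StringIO()
    written = hash_email_column(source, target)
    lines = target.getvalue().splitlines()
    print(written, all(looks_like_sha256(line) for line in lines))
```

Running `looks_like_sha256` over every line of “hashes.csv” is a cheap way to catch a stray header row or quoted values before the upload.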
