It might be due to the new Gmail – the emojis seem to have been deprecated in the new version. Unfortunately… 🙁

Great question, thanks. Discretizing could make things easier and more efficient. However, can it be expected to be as effective? E.g., where do you draw the bin boundaries, and when do you send? Three hours is a rather large time span. What if the 30 clicks occurred between 19:30~20:00 and you send at 17:00? And how do you account for the case where 20 of these clicks happened a year ago around 19:30, and 10 occurred during the last month, after being reactivated, around 19:55?

In the end, the binning approach might work well. I did not try it, so please report back if you gain some insights. 🙂

Best,

René

Thanks for sharing this.

I have a question: if we use customers’ historical email open/click data, why don’t you just use simple statistics to find each customer’s favorite day and time to click?

For instance, user A received 100 emails in total and opened and clicked 80 of them. Of those 80 clicks, 30 happened on Saturday 17:00~20:00, 20 on Sunday 11:00~14:00, 20 on Wednesday 21:00~24:00, and the remaining 10 at other times. From this we know that the user is most likely to open emails on Saturday 17:00~20:00.

Regards

Jie
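The tally described in this question can be sketched in a few lines of base R. This is only an illustration of the binning idea, not something from the original post: the timestamps are made up, and the helper name `bin_clicks` and the fixed grid of 3-hour windows starting at midnight are assumptions.

```r
# Hypothetical click timestamps for one user (made-up data for illustration)
clicks <- as.POSIXct(c(
  "2018-01-06 17:30:00", "2018-01-06 19:40:00", "2018-01-13 18:05:00",
  "2018-01-07 11:20:00", "2018-01-10 21:45:00"
), tz = "UTC")

# Count clicks per weekday and fixed time-of-day window (hypothetical helper)
bin_clicks <- function(x, width = 3) {
  lt <- as.POSIXlt(x)
  start <- (lt$hour %/% width) * width  # start hour of the window
  # lt$wday runs from 0 (Sunday) to 6 (Saturday), independent of locale
  days <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
  bins <- data.frame(
    weekday = days[lt$wday + 1],
    window  = sprintf("%02d:00~%02d:00", start, start + width),
    n       = 1
  )
  counts <- aggregate(n ~ weekday + window, data = bins, FUN = sum)
  counts[order(-counts$n), ]  # most frequent weekday/window first
}

top <- bin_clicks(clicks)
print(top)
```

In this toy data the three Saturday clicks fall between 17:30 and 19:40, yet the fixed grid splits them across the 15:00~18:00 and 18:00~21:00 windows. That is one concrete form of the boundary problem raised in the reply above.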

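## Identify click sprees ("sessions") based on how far apart clicks are in time
identify_clicksprees <- . %>%
  arrange(userid, campaignid, datumzeit) %>%
  group_by(userid, campaignid) %>%
  mutate(
    # Time since the previous click of this user in this campaign; the epoch
    # default makes the first click of each group start a new spree
    difftime = datumzeit - lag(datumzeit, default = as.POSIXct(0, origin = "1970-01-01", tz = "UTC")),
    difftime = as.numeric(`units<-`(difftime, "secs"))) %>%
  mutate(clickspree_id = cumsum(difftime > 60*20)) %>% # 60 seconds x 20 minutes
  group_by(clickspree_id, .add = TRUE) %>% # `.add` replaces `add` in dplyr >= 1.0.0
  mutate(
    clickspree_nclicks = n(),
    # Duration of the spree in minutes (last click minus first click)
    clickspree_nmins = abs(Reduce(`-`, range(datumzeit))),
    clickspree_nmins = as.numeric(`units<-`(clickspree_nmins, "mins")),
    # Age of the spree in days, measured from its first click
    clickspree_ageDays = difftime(Sys.time(), min(datumzeit)),
    clickspree_ageDays = as.numeric(`units<-`(clickspree_ageDays, "days"))) %>%
  ungroup()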
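To make the gap rule easier to follow, here is a minimal base R sketch of the same idea on four made-up timestamps for a single user and campaign. It leaves out the grouping and the extra columns, and the variable name `gap_secs` is an assumption used only for this illustration.

```r
# Hypothetical click timestamps of one user in one campaign (made-up data)
datumzeit <- as.POSIXct(c(
  "2018-01-06 19:30:00", "2018-01-06 19:35:00",
  "2018-01-06 20:10:00", "2018-01-06 20:12:00"
), tz = "UTC")

# Seconds since the previous click; Inf makes the first click open a spree
gap_secs <- c(Inf, diff(as.numeric(datumzeit)))

# Every gap longer than 20 minutes starts a new spree
clickspree_id <- cumsum(gap_secs > 60 * 20)

print(data.frame(datumzeit, gap_secs, clickspree_id))
```

The 35-minute pause between 19:35 and 20:10 exceeds the 20-minute threshold, so the first two clicks form one spree and the last two form another.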

For the clustering, I used:

#' Cluster the timestamps of a user
#'
#' @param datumzeit POSIXct, response timestamp
#' @param eps DBSCAN epsilon, maximum distance for merging points into clusters
#' @param othervars properly scaled matrix of other variables to include in the clustering, besides the time of day of the response (e.g. age of click, weekday, ...)
#' @return cluster id for each timestamp
getClusters <- function(datumzeit, eps = 0.5, othervars = NULL) {
  # datumzeit <- seq(as.POSIXct("2017-12-18 00:00:00"), as.POSIXct("2017-12-18 23:00:00"), "1 hour")
  h <- as_hms(datumzeit) # as.hms() is deprecated in newer hms versions
  h <- hour(h)+minute(h)/60 # time of day in decimal hours
  ha <- 2*pi*h/24 # angle on a 24-hour clock face
  m <- cbind(x = sin(ha), y = cos(ha)) # circular coordinates, so 23:59 and 00:00 end up close together
  # as_data_frame(m) %>% ggplot(aes(x,y)) + geom_point() + theme_minimal()
  # data.frame(x = 0:23) %>% ggplot(aes(x)) + geom_rug() + theme_minimal() + theme(panel.grid = element_blank())
  if (!is.null(othervars)) m <- cbind(m, othervars)
  res <- dbscan(m, c("-E", eps, "-M", 1)) # Weka options: -E epsilon, -M minimum points
  return(res$class_ids)
}
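The sine/cosine step is what lets the clustering treat time of day as circular. A small base R sketch on made-up hours shows the effect. The helper `to_circle` is hypothetical and simply repeats the transformation from getClusters, and the epsilon conversion only holds when no othervars are added.

```r
# Map time of day (decimal hours) onto the unit circle, as in getClusters
to_circle <- function(h) {
  ha <- 2 * pi * h / 24
  cbind(x = sin(ha), y = cos(ha))
}

# 23:30, 00:30 and 12:00 as decimal hours (made-up example values)
m <- to_circle(c(23.5, 0.5, 12))
d <- as.matrix(dist(m))  # Euclidean distances between the points

print(round(d, 3))

# Two points dt hours apart are 2 * sin(pi * dt / 24) apart on the circle,
# so an epsilon of 0.5 corresponds to a gap of about 1.9 hours
eps_hours <- 24 / pi * asin(0.5 / 2)
print(eps_hours)
```

On the raw hour scale, 23:30 and 00:30 would be 23 hours apart. On the circle they are as close as any other pair of clicks one hour apart, while 00:30 and 12:00 sit almost on opposite sides.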

I also used the following libraries:

Sys.setenv("WEKA_HOME"="C:\\Users\\Rene\\Weka") # machine-specific Weka directory
library("rJava")
library("RWekajars")
library("RWeka")
WPM("load-package", "optics_dbScan") # Weka package that provides DBSCAN
dbscan <- make_Weka_clusterer('weka/clusterers/DBSCAN') # R interface to Weka's DBSCAN
library(tidyverse)
library(lubridate)
library(hms)

Let me know if you have questions. The code is a bit messy… I also left in some commented-out debugging lines.

I, too, am looking for an animated flashing red dot (or red cop car flasher) emoji for email subject lines.
