I have criticized Turkish election polls a lot in the past, for multiple reasons, so it is no big surprise that my first blog post is about the performance of pollsters. (On a side note, Turkey is a country where all major news outlets once shared an election poll with a sample size of ~100, based on a single village, because that village had historically voted close to the national result…) Election polls in Turkey tend to be very opaque: they are generally commissioned by private parties, who then share only the final results without any details on the methodologies.

Since I do not have access to the methodologies behind the polls, I will try to demonstrate the sampling issues pollsters face through the data points that are available, such as sample size and polling end date.

A note on the methodology I used: I measure each poll's accuracy as the squared distance between its predicted vote shares \(\hat{p}_i\) and the actual results \(p_i\), so lower values mean a more accurate poll:

\[Q = \sum_i \left( p_i - \hat{p}_i \right)^2\]
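In code, this metric is just the sum of squared differences between a poll's predicted vote shares and the actual result. Here is a minimal sketch with a hypothetical helper (the vote shares below are made up for illustration):

poll_accuracy <- function(poll, actual) {
    # Squared distance between predicted and actual vote shares (in percent);
    # lower values mean a more accurate poll
    sum((actual - poll)^2)
}

poll_accuracy(poll = c(44, 25, 12, 11), actual = c(42.6, 25.0, 11.9, 10.8))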

# Libraries used throughout this post
library(plyr)
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
library(knitr)

# Clean June 2015 data

june2015 <- as.data.frame(t(read.csv("~/Documents/Personal/Projects/Blog/data/2015 June election.csv",
                                     stringsAsFactors = F)), stringsAsFactors = F)
june2015[1, 1] <- "Date"
colnames(june2015) <- june2015[1, ]
june2015 <- june2015[-1, ]

june2015$Date <- as.Date(june2015$Date, format = "%m/%d/%y")
june2015[, 2:6] <- lapply(june2015[, 2:6], as.numeric)

# Accuracy = squared distance between each poll and the actual result,
# which sits in the last row of the data frame
june2015$accuracy <- c(aaply(.data = as.matrix(june2015[rep(nrow(june2015), nrow(june2015) - 1), -1] -
                                                   june2015[-nrow(june2015), -1])^2,
                             .margins = 1, .fun = sum), NA)
# Clean November 2015 data

November2015 <- as.data.frame(t(read.csv("~/Documents/Personal/Projects/Blog/data/2015 November Elections.csv",
                                         stringsAsFactors = F)),
                              stringsAsFactors = F)
November2015[1, 1] <- "Date"
colnames(November2015) <- November2015[1, ]

November2015 <- November2015[-1, ]
rownames(November2015)[1] <- "A&G"
rownames(November2015)[5] <- "Kurd Tek"
November2015$Date <- as.Date(November2015$Date, format = "%m/%d/%y")
November2015[, 2:6] <- lapply(November2015[, 2:6], as.numeric)

# Same accuracy metric as above; the actual result is again the last row
November2015$accuracy <- c(aaply(.data = as.matrix(November2015[rep(nrow(November2015), nrow(November2015) - 1), -1] -
                                                       November2015[-nrow(November2015), -1])^2,
                                 .margins = 1, .fun = sum), NA)
# Presidential polls analysis

presidential2018 <- read.csv("~/Documents/Personal/Projects/Blog/data/2018 elections polling.csv",
                             stringsAsFactors = F)

# Normalize a few pollster names
presidential2018[3, 2] <- "OPTIMAR"
presidential2018[12, 2] <- "Foresight"
presidential2018[13, 2] <- "IYIP"

# Strip thousands separators from sample sizes and treat missing values as 0
presidential2018[, 3] <- replace_na(as.numeric(str_replace_all(presidential2018[, 3], ",", "")), 0)
presidential2018$Date <- as.Date(presidential2018$Date, format = "%m/%d/%y")

presidential2018[is.na(presidential2018)] <- 0

# Here the actual result sits in the first row, so accuracy is computed
# against row 1 instead of the last row
presidential2018$accuracy <- c(NA, aaply(.data = as.matrix(presidential2018[rep(1, nrow(presidential2018) - 1),
                                                                            4:(ncol(presidential2018) - 2)] -
                                                               presidential2018[2:nrow(presidential2018),
                                                                                4:(ncol(presidential2018) - 2)])^2,
                                         .margins = 1, .fun = sum))

# Keep Date, Pollster, Sample.size, and accuracy for the analysis below
accuracyPresidential <- presidential2018[-1, c(1, 2, 3, 12)]

Unusual Relationship Between Sample Size and Accuracy of Polls

One interesting trend that emerges after crunching the numbers and plotting the graph below is that polls with smaller sample sizes are slightly more accurate than polls with bigger sample sizes. The more disturbing finding, however, is that polls whose sample sizes are neither big nor small perform considerably worse than polls at either extreme.

This is quite unusual for two reasons:

  1. For a voting population of the size of Turkey’s, a poll that aims to predict a result to within a 2% margin of error at 95% confidence needs ~2,400 participants (see the quick calculation after this list). Yet polls with a sample size below ~2,250 perform better on average than polls with bigger samples.

  2. Bigger samples can actually be more biased, since it is often hard to recruit a large sample that stays representative of the voting population (e.g. a pollster might find it much easier to collect responses in a liberal neighborhood than in a conservative one of the same size, where voters are more reluctant to share their preferences).
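For reference, the standard sample-size formula for estimating a proportion is \(n = z^2 \, p(1-p) / e^2\). A quick sanity check in R, using the most conservative assumption \(p = 0.5\) (the function name is mine, not part of the analysis above):

required_n <- function(margin, conf = 0.95, p = 0.5) {
    # Sample size needed to estimate a proportion within +/- margin
    # at the given confidence level, assuming simple random sampling
    z <- qnorm(1 - (1 - conf) / 2)  # ~1.96 for 95% confidence
    ceiling(z^2 * p * (1 - p) / margin^2)
}

required_n(0.02)  # 2401 participants for a +/-2% margin of error
required_n(0.01)  # 9604 participants for +/-1%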

These observations lead me to conclude that the questions asked by pollsters might not be neutral (i.e. they might subconsciously steer respondents toward a specific answer) and that their sampling methodologies are simply not “random” (or balanced). This is why we need pollsters to share more information than the sample size of a poll before we can judge its accuracy.

Please note that sample size data was only available for the 2018 presidential election.

ggplot(data = accuracyPresidential[accuracyPresidential$Sample.size > 0, ]) +
    geom_jitter(aes(x = Sample.size, y = accuracy)) +
    geom_smooth(aes(x = Sample.size, y = accuracy, color = "LOESS fit"), fill = NA,
                method = "loess") +
    ggtitle("Presidential Elections 2018: Accuracy as a function of sample size") +
    guides(color = guide_legend(title = "Fit")) +
    xlab("Sample Size") +
    ylab("Accuracy") +
    theme_bw()

Poll Accuracy Over Time

An interesting observation from the historical polling data is that early polls are more successful at predicting the results than polls conducted in the middle of the election season; early polls even come close to the final results. However, it is hard to judge whether this pattern is real given the small number of elections behind it (and it does not hold for the 2018 presidential election).

Either way, this observation seems to say more about Turkish voters than about pollsters: voters switch positions throughout the electoral season but eventually settle back on their traditional party. It will be interesting to see if this trend holds in the future, since the 2018 presidential election goes against the earlier data, with some voters eventually settling on different candidates (hence the roughly linear improvement in accuracy over time).

ggplot(data = accuracyPresidential) +
    geom_point(aes(x = Date, y = accuracy)) +
    geom_smooth(aes(x = Date, y = accuracy, color = "Linear fit"), fill = NA, method = "lm") +
    ggtitle("Presidential Elections: Accuracy over time") +
    guides(color = guide_legend(title = "Fit")) +
    xlab("Timeline (in 2018)") +
    ylab("Accuracy") +
    theme_light() +
    geom_vline(xintercept = as.Date("2018-06-24"), color = "blue", linetype = "dashed") +
    geom_label(x = as.Date("2018-06-24") - 0.85, y = 500, label = "Election Day:\n 24 Jun", size = 2.5) +
    xlim(c(min(presidential2018$Date), max(presidential2018$Date))) +
    geom_hline(yintercept = 0, color = "black", linetype = "dashed")

ggplot(data = November2015[-nrow(November2015), ]) +
    geom_point(aes(x = Date, y = accuracy)) +
    geom_smooth(aes(x = Date, y = accuracy, color = "LOESS fit"), fill = NA, method = "loess") +
    ggtitle("November 2015 Parliamentary Elections: Accuracy over time") +
    guides(color = guide_legend(title = "Fit")) +
    xlab("Timeline (in 2015)") +
    ylab("Accuracy") +
    geom_vline(xintercept = November2015["Election", "Date"], color = "blue", linetype = "dashed") +
    geom_label(x = November2015["Election", "Date"] - 0.35, y = 150, label = "Election Day:\n 1 Nov",
               size = 2.5) +
    xlim(c(min(November2015$Date), max(November2015$Date))) +
    theme_light() +
    geom_hline(yintercept = 0, color = "black", linetype = "dashed")

ggplot(data = june2015[-nrow(june2015),], aes(x = Date)) + 
    geom_point(aes(x = Date, y = accuracy)) + 
    geom_smooth(aes(x = Date, y = accuracy, color = "Linear fit"), fill = NA, method = "lm") +
    ggtitle("June 2015 Parliamentary Elections: Accuracy over time") + 
    guides(color=guide_legend(title="Fit")) + 
    xlab("Timeline (in 2015)") + ylab("Accuracy") +  
    theme_light() + 
    geom_vline(xintercept = june2015["Election", "Date"], color = "blue", linetype = "dashed") +
    xlim(c(min(june2015$Date),max(june2015$Date))) +  
    geom_label(x= june2015["Election", "Date"]-0.35, y = 150, label = "Election Day:\n 7 Jun", 
               size = 2.2) +
    ylim(c(-20,150))+
    geom_hline(yintercept = 0, color = "black", linetype = "dashed")

Accuracy Per Pollster

Finally, I hope to build a predictive model for the upcoming local elections using the results published by different pollsters, so the historical performance of each pollster should be a good reference point. For this ranking, I simply took the average of each firm's accuracy across the races it made predictions for.

# Combine the accuracy scores from the three elections into one table
june2015join <- june2015
june2015join$Pollster <- rownames(june2015join)
june2015join <- june2015join[, -(1:6)]

join1 <- full_join(accuracyPresidential[, -c(1, 3)], june2015join, by = "Pollster")

November2015join <- November2015
November2015join$Pollster <- rownames(November2015)
November2015join <- November2015join[, -(1:6)]

pollster.accuracy <- full_join(join1, November2015join, by = "Pollster")

colnames(pollster.accuracy) <- c("Pollster", "Presidential2018", "June2015", "November2015")

# Average each pollster's accuracy across the elections it polled,
# then drop the row holding the actual election results
pollster.accuracy1 <- as.matrix(pollster.accuracy[2:4])
pollster.accuracy$Average <- rowMeans(pollster.accuracy1, na.rm = T)
pollster.accuracy <- pollster.accuracy[-which(pollster.accuracy$Pollster == "Election"), ]

# Collapse firms that appear more than once into a single average
pollster.avg.acc <- ldply(split(pollster.accuracy[c(1, 5)], pollster.accuracy$Pollster),
                          function(x) mean(x[, 2]))

colnames(pollster.avg.acc) <- c("Pollster", "Average")

ggplot(data = pollster.avg.acc) +
    geom_col(aes(x = reorder(Pollster, -Average, sum), y = Average, fill = Average)) +
    coord_flip() +
    ylab("Distance From Actual Result") +
    xlab("Pollster") +
    theme_light() +
    scale_fill_gradient2(low = "green", high = "red", mid = "orange", midpoint = 210) +
    labs(title = "Historic Average Accuracy Per Firm") +
    guides(fill = guide_legend(title = "Average Accuracy"))

kable(pollster.avg.acc, caption = "Average accuracy per firm for the last 3 elections")
Table 1: Average accuracy per firm for the last 3 elections

| Pollster   |   Average |
|:-----------|----------:|
| A&G        |  10.44240 |
| AKAM       |  80.83240 |
| ANAR       |   6.77080 |
| AndyAr     |   8.41280 |
| CHP        |  58.09000 |
| Denge      |  36.45160 |
| Foresight  |   8.23000 |
| Gezici     |  49.44707 |
| IYIP       | 783.84000 |
| KamuAR     | 116.99280 |
| Konda      |  36.67573 |
| Konsensus  |  74.32620 |
| Kurd Tek   | 136.79440 |
| MAK        |  45.97573 |
| Mediar     | 170.93000 |
| MetroPoll  |  39.43460 |
| OPTIMAR    |  13.67000 |
| ORC        |  43.46307 |
| PIAR       | 380.90000 |
| Plus Mayak |  82.77000 |
| Politic’s  |  52.23000 |
| REMRES     | 220.14500 |
| SONAR      | 105.77573 |

Next steps

If I get time and more data in the future, I hope to improve the current ranking of pollsters. Simply averaging each pollster's accuracy scores, without accounting for the time left until the election, the number of polls each firm conducted, or the sample sizes they used, is, I admit, a bit misleading.
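As a rough illustration of what a better ranking could look like, one could weight each poll by its recency and sample size before averaging per firm. Below is a minimal sketch using dplyr (loaded above), assuming a hypothetical polls data frame with columns Pollster, accuracy, days_to_election, and Sample.size; the weighting constants are arbitrary placeholders, not fitted values:

# Hypothetical weighting: discount polls taken long before the election
# and polls with small samples; the constants are placeholders
weighted_ranking <- polls %>%
    mutate(weight = exp(-days_to_election / 30) * sqrt(Sample.size)) %>%
    group_by(Pollster) %>%
    summarise(weighted_avg = weighted.mean(accuracy, weight),
              n_polls = n()) %>%
    arrange(weighted_avg)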

I will probably get this partially done when I start building predictive models, since I would like to use Bayesian methods, and such prior information can be quite relevant.