Welcome to the 110th edition of The Week in Polls (TWIP), which takes a look at the underlying issue behind much of the current discussion about why different pollsters are giving Labour different-sized leads. It’s that pollsters can’t simply take at face value the answers we give them.
Then it’s a look at the latest voting intention polls followed by, for paid-for subscribers, 10 insights from the last week’s polling and analysis. (If you’re a free subscriber, sign up for a free trial here to see what you’re missing.)
Before that, a quick word about the headline-grabbing new MRP out from Find Out Now and Electoral Calculus. If you’re a newer reader and wondering what an MRP is and whether to trust one, my MRP explainer is here.
It’s a poll that gives the Conservatives a 27% deficit to Labour. As the pollster’s write-up of this one says, the seat numbers that come out of the MRP are not that different from what you can get by playing around with standard swings.1 MRPs may grab our attention with their details, but what this one is really doing is bringing vividly to life what a 27% deficit means. (One in the eye for the cynics though: this Conservative gloom comes from a poll produced by … GB News and the Daily Mail.)
Finally before we get to the main business, a quick pictorial summary of the state of the polls. Everything (yes, even taking into account that Opinium poll)2 is as flat as a Norfolk landscape in the national voting intention polls:
Been forwarded this email by someone else? Sign up to get your own copy here.
Want to know more about political polling? Get my book Polling UnPacked: the history, uses and abuses of political opinion polling.
Pollsters can’t take what they are told at face value
With it all being very Norfolk in the polls, a lot of the polling insider chat3 is about the variations in how pollsters deal with the don’t knows. It’s becoming a bit of a thing to talk about distinguishing between those pollsters who are ‘nowcasters’ and those who are ‘predictors’. Although I get where this distinction is coming from, I think there’s a better way of understanding the variation between different pollsters.
To understand that, let’s start with a basic question: why don’t pollsters simply take the answers that people give them at face value? Sure, they may have to weight their sample to get the right proportion of men under 30 or women over 75, but otherwise, why not treat what the public says as sacrosanct, simply tallying it all up?
Turnout
There are three reasons why pollsters don’t take answers at face value.4 One is turnout. If you ask people if they are going to vote, you get more people saying they will than actually do.
This isn’t a nowcast versus prediction issue; it’s a social desirability and self-prediction issue. Even if you ask people the day before polling day you get too high a claimed turnout, just as you do if you ask people months in advance. People like to say the ‘right’ thing, whether that’s to impress the pollster or themselves, and they can also under-estimate the chance that life (a sick child, a job emergency or a broken train) will get in the way of voting.
One option therefore is to take the answer at face value, as the least worst option. Another is to get people to give some nuance (e.g. to rate their likelihood to vote on a 1-10 scale and use that scale to filter or weight the answers). A third, which can be done in combination with the nuance, is to add in some modelling, such as using someone’s demographic characteristics and the known relationship between demography and turnout at the previous election.
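To make the filter-versus-weight distinction concrete, here is a minimal sketch in Python. The respondents, the scale values and the cut-off of 8 are all invented for illustration; no pollster’s actual thresholds or weighting scheme are implied.

```python
# A minimal sketch (not any pollster's actual method) of three broad ways of
# handling stated likelihood to vote: take it at face value, filter on it, or
# use it as a weight. Respondents and the 1-10 scale values are illustrative.

respondents = [
    # (party choice, self-rated likelihood to vote on a 1-10 scale)
    ("Labour", 10), ("Conservative", 10), ("Labour", 7),
    ("Conservative", 5), ("Labour", 3), ("Conservative", 9),
]

def shares(pairs):
    """Turn (party, weight) pairs into percentage vote shares."""
    total = sum(weight for _, weight in pairs)
    tallies = {}
    for party, weight in pairs:
        tallies[party] = tallies.get(party, 0) + weight
    return {party: round(100 * t / total, 1) for party, t in tallies.items()}

# 1. Face value: everyone who gave a party counts equally.
face_value = shares([(party, 1) for party, _ in respondents])

# 2. Filter: only count people above a likelihood threshold (here 8 or more).
filtered = shares([(party, 1) for party, ltv in respondents if ltv >= 8])

# 3. Weight: count everyone, scaled by their stated likelihood.
weighted = shares([(party, ltv / 10) for party, ltv in respondents])

print(face_value, filtered, weighted)
```

Filtering throws the less certain out entirely, while weighting keeps everyone but counts them for less; either way, the published figures have already moved away from a simple face-value tally.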
As I put it in Polling UnPacked:
Taking answers on likelihood to vote at face value is the approach taken most notably by Ann Selzer, often viewed as the gold standard among gold standards for pollsters for her Iowa polling. Although she specializes in polling what is but one small state within the United States, being good at polling in Iowa is a stage for international fame, given the high-profile role of the Iowa caucuses in the U.S. presidential selection contests. As she puts it, her approach is one of polling forward, not polling backward. She simply includes the people if they say they will ‘definitely’ or ‘probably’ vote, and discards those who answer otherwise. This means her results can reflect dramatic changes in turnout patterns, [such] as … when this methodology delivered the goods in 2008 by correctly picking up the large number of first-timers in the Democrat caucus who propelled Barack Obama to victory…
Yet this is also a risky approach because there is plenty of evidence about the limitations of the accuracy of self-reporting likelihood to vote. This is why some pollsters prefer instead to pull on other information to help improve the methodology.
And so on, round in circles, the debate goes, propelled by the generation of both high- and low-quality polling results from either approach.
Past vote
A second reason why pollsters don’t just take answers at face value is faulty past vote recall.
Asking people how they voted last time can be useful for ensuring you weight your sample accurately. For example, the common experience in the heyday of telephone polling was that if you called numbers at random, you got a disproportionately Labour-leaning set of people taking your polls. So it became the norm to ask people who they had voted for last time in order to help weight the figures appropriately.
But people don’t recall their past votes accurately. Their recall tends to get more inaccurate over time: more people claim to have voted for the winner than actually did, and people who changed their minds, particularly for a last-minute tactical vote, progressively forget that they did so.
Nowadays, some pollsters with big panels have the actual record of what people said at the time of the last election to go by. In addition, some pollsters avoid past vote weighting altogether because of the problems with false recall. But those who don’t have the exact historic records, yet find past vote recall useful to factor in, can modify the answers to try to overcome the false recall issue.
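For those who like to see the mechanics, here is a deliberately simplified sketch of past vote weighting: scale respondents so that the sample’s recalled 2019 vote matches the actual 2019 result. The shares below are invented, and real pollsters weight on many variables at once (and, as above, may first adjust for false recall).

```python
# An illustrative sketch of past vote weighting: each respondent is scaled so
# that recalled 2019 vote in the sample matches the actual 2019 result.
# All shares are invented for the example.

actual_2019 = {"Conservative": 0.44, "Labour": 0.32, "Other": 0.24}    # target shares
sample_recall = {"Conservative": 0.38, "Labour": 0.40, "Other": 0.22}  # what the sample recalls

# Weight = target share / sample share for the party a respondent recalls
# voting for. A pollster worried about false recall might first adjust the
# targets (or the recalled answers) before computing these weights.
weights = {party: actual_2019[party] / sample_recall[party] for party in actual_2019}

print(weights)  # Conservative recallers counted up, Labour recallers counted down
```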
These are all, again, variations in how far people’s answers are taken at face value, and again not about nowcasting versus forecasting: in this case it’s about how pollsters choose to deal with what someone says about the past.
Don’t knows
Then there is the third reason and, in the context of this Parliament, the big one. What do you do if someone says ‘don’t know’ when you ask them how they are going to vote?
There is no ‘don’t know’ on the ballot paper, so there’s no way out here. You have to make some decision about not taking this answer at face value.
As with turnout, pollsters have three choices. The first is to exclude the don’t knows, i.e. to assume either that they won’t vote or that their votes will break in the same way as those who gave a view. For example, a poll showing 25% Free Brussels Sprouts, 25% Milk Chocolate For All and 50% don’t know would, after excluding the don’t knows, become 50% Free Brussels Sprouts - 50% Milk Chocolate For All.
A second option is to ‘squeeze’ the don’t knows. Pester them a bit and ask them to make up their minds. Come on, make a choice between chocolate and Brussels sprouts. Those who are still resolutely don’t know may then be discarded, but you’ve reduced their number first.
The third option is to use other information to project/guess how the don’t knows will vote. You could, for example, look at how otherwise identical people (age, education, past voting, etc.) say they are going to vote and use that to impute how the don’t knows will vote.5
And you can use some mix of two or all three - squeeze, model, discard.
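Here is a toy sketch of those three treatments, applied to the sprouts-versus-chocolate example above. The squeeze conversion rate and the imputation split are invented numbers, purely to show how the same raw answers can produce different headline figures.

```python
# A toy sketch of the three don't-know treatments, using the sprouts/chocolate
# example above. The squeeze conversion rate and the imputation split are
# invented purely for illustration.

poll = {"Sprouts": 25, "Chocolate": 25, "Don't know": 50}

def rescale(shares):
    """Rescale a dict of shares so they sum to 100."""
    total = sum(shares.values())
    return {k: round(100 * v / total, 1) for k, v in shares.items()}

# 1. Discard: drop the don't knows and rescale. 25/25 becomes 50/50.
discarded = rescale({k: v for k, v in poll.items() if k != "Don't know"})

# 2. Squeeze: suppose pressing the don't knows converts 40% of them,
#    splitting 60/40 towards Sprouts; the rest are then discarded.
converted = poll["Don't know"] * 0.4
squeezed = rescale({
    "Sprouts": poll["Sprouts"] + converted * 0.6,
    "Chocolate": poll["Chocolate"] + converted * 0.4,
})

# 3. Model/impute: allocate all don't knows using outside information,
#    e.g. how otherwise similar decided respondents split (here 70/30).
modelled = rescale({
    "Sprouts": poll["Sprouts"] + poll["Don't know"] * 0.7,
    "Chocolate": poll["Chocolate"] + poll["Don't know"] * 0.3,
})

print(discarded, squeezed, modelled)
```

The point is not the particular numbers, but that each treatment is a choice the pollster makes rather than something the respondents told them.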
Lots of options for pollsters
The different choices pollsters can make across all three of these challenges - each a way of not taking what people say at face value - are a large part of why the results different pollsters come up with often vary systematically. Different methods, different results.
Changing circumstances, however, can mean that what worked best one time doesn’t produce the best results next time. Which is why there’s continuing variety in pollster methodology rather than a convergence over time on the one best way.
The classification I don’t use
Myself, I therefore think the best way to view the variations and options above is as a spectrum of how closely a pollster sticks to taking what people say at face value, and how much adjusting and modelling they layer on top.
A current trend, however, is to classify things differently as this very handy table from Focaldata does:
To simplify a little, the nowcasters discard the don’t knows, the squeezers use squeeze questions to make them pick a view and the re-weighters model the don’t knows either explicitly or implicitly.6
As the treatment of don’t knows is a notable cause of differences in the Labour lead found by pollsters, classifying pollsters by how they treat don’t knows is useful.
But even the ‘nowcasters’ are doing a form of forecasting, such as on turnout. And even the ‘forecasters’ are not really going all-in on forecasting. Proper forecasting would involve things like estimating how much smaller parties will be squeezed during a campaign and how much second-placed parties in seats will rise due to benefiting from tactical voting. Or what the impact of known key moments in the campaign, such as TV debates, will be. The ‘forecasters’ aren’t really forecasting all the way to polling day.
Moreover, the don’t know split isn’t a timeless key divide. In previous elections (and no doubt future ones, including - once we see the results - perhaps even this one) the treatments of turnout and of past vote recall have been the big talking points. It just happens to be don’t knows this time.
Which is why I prefer thinking of them all as taking different mixes of approaches to those three challenges, all of which mean pollsters can’t just take your, or my, answers at face value.
But what does this mean for right now?
Patrick Flynn, who did the table above, crunched the numbers as of about a week ago on the difference that the various approaches to don’t knows are making this time around:
No adjustment: 24-point Labour lead
Squeeze question: 19-point Labour lead
Re-weighting model: 17-point Labour lead
That 7-point spread isn’t trivial (and is a good example of why I’m cautious about simply averaging together all the different polls). Even though it’s unlikely to all be down to the treatment of don’t knows, the data from pollsters who provide enough information to let us experiment with the impact of their treatment on the overall numbers7 does back up the idea that it’s a big part of the variation.
The two re-weighting models are as yet untested in a general election (as JL Partners has not yet done one, and while Opinium has a good general election record, its current methodology is new in this Parliament).
But before Patrick Sturgis gets cross with me again, I should point out that as he wrote:
The 2015 inquiry [into the polling miss at that general election] recommended model-based reallocation of DKs [don’t knows].
To quote that report itself, it recommended that pollsters:
review current allocation methods for respondents who say they don’t know, or refuse to disclose which party they intend to vote for. Existing procedures are ad hoc and lack a coherent theoretical rationale. Model-based imputation procedures merit consideration as an alternative to current approaches.
So I wouldn’t dismiss the re-weighting models out of hand as being less likely to be right than the others.
Yet even among those re-weighters, who show the smallest average lead, it is still a lead which, at 17 points, is bigger than the losing party’s deficit in the 1945, 1983 or 1997 landslide defeats.
Moreover, there just aren’t enough 2019 Conservative to current don’t know switchers to make that much of an impact on the big picture, even if they all go back to the Conservatives. And that’s without factoring in the pattern that a party seeing high numbers of its previous supporters switch to don’t know has been a harbinger of doom, as it was for the Lib Dems ahead of 2015 and Labour ahead of 2019.
It’s why the hopes expressed by Conservative peer and election pundit Rob Hayward that the polls are underestimating the Conservatives because of the don’t knows look forlorn. Even the polls that try hardest to ensure they’re not under-estimating the Conservatives are still giving Labour a lead that is nearly double the size of the 1992 polling miss.8
Whatever way you look at it, the Conservatives are a long way behind.
National voting intention polls
Looking at the table below, it’s hard to keep a straight face reading the headline on an opinion piece over at the Daily Telegraph, “Whisper it, but Rishi Sunak is making an extraordinary comeback”. Let’s see what future weeks bring.
For more details and updates through the week, see my daily updated table here and for all the historic figures, including Parliamentary by-election polls, see PollBase.
Last week’s edition
Will the polls get the general election wrong?
(I’m rather chuffed that Ian Betteridge himself, he of Betteridge’s Law, posted a simple response: ‘no’.)
My privacy policy and related legal information is available here. Links to purchase books online are usually affiliate links which pay a commission for each sale. Please note that if you are subscribed to other email lists of mine, unsubscribing from this list will not automatically remove you from the other lists. If you wish to be removed from all lists, simply hit reply and let me know.
Sunak polling worse than Corbyn, and other polling news
The following 10 findings from the most recent polls and analysis are for paying subscribers only, but you can sign up for a free trial to read them straight away.