What Election Polling’s 2020 Miss Means For Business

Paul Phelps
Nov 17, 2020

Polling has relied on tradition to inform its predictions. But even aggregation of dozens of polls has shown the limits of its effectiveness in the modern age.

Caveat: I am speaking specifically to the Senate and House prediction misses. Presidential polling was fairly accurate once the predicted “red mirage” passed and more complete results were reported.

Customer sentiment is more nuanced and robust than a “1 to 5” score.

I have a thought experiment I have been using with my teams for a number of years now. “If we were born yesterday, is this how we would do it?” The question is designed to figure out how much the organization does things because that is how it has always been done versus how it perhaps could be done more effectively with more recent technologies or processes.

Polling could be a great example of an industry that would not operate the way it does if it were “born yesterday”.

Polling is surveying, and surveying is still used by many industries and organizations for their “Voice of the Customer” metrics. For some industries, these VOC metrics have regulatory requirements so have little chance of changing soon. But VOC surveying, like polling, is a poor metric for Customer Experience. If we were “born yesterday”, we would be utilizing sentiment analysis of all customer communications to identify what the customer experience actually has been like.

Sentiment analysis is measuring text based on words used and their context to identify customer opinion. These opinions can be categorized simply as “positive”, “negative”, or “neutral”, or greater detail can be gleaned from categorizing these opinions with time stamps (began the call positive, ended the call negative) or with more robust categorization.
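To make the idea of time-stamped categorization concrete, here is a minimal sketch in Python. It is a toy, not a production approach: the word lists, the `sentiment_timeline` helper, and the sample call are all hypothetical, and a simple word count like this ignores the context that real sentiment models take into account.

```python
import re

# Toy lexicon: hypothetical word lists, for illustration only.
POSITIVE = {"great", "thanks", "helpful", "resolved", "happy", "love"}
NEGATIVE = {"frustrated", "broken", "cancel", "waiting", "unacceptable", "angry"}


def score(text):
    """Count positive words minus negative words in one utterance."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(value):
    """Map a numeric score to one of the three simple categories."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def sentiment_timeline(utterances):
    """Label each (timestamp, text) pair so a shift during a call is visible."""
    return [(ts, label(score(text))) for ts, text in utterances]


# Hypothetical transcript of a single customer call.
call = [
    ("00:05", "Hi, thanks for picking up, I love the product."),
    ("02:40", "I have been waiting on a refund for weeks."),
    ("06:15", "This is unacceptable, I want to cancel."),
]

timeline = sentiment_timeline(call)
for ts, category in timeline:
    print(ts, category)
```

Even this crude version shows a call that began positive and ended negative, which a single post-call score would flatten into one number.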

Sentiment analysis uses each customer communication, and the entirety of that communication, to measure the Customer Experience. This provides a more accurate and nuanced representation than a VOC survey does. VOC surveys occur after the Customer Experience has happened and represent the customer’s reflection on their experience, often in response to brief, static, predetermined questions that request a single digit in place of what could have been an emotionally rich experience.

And emotionally rich experiences ARE measurable. In addition, customer experiences of all sorts, when measured and collected, can be used for employee coaching. These benefits are impossible using traditional VOC surveying.

VOC surveys were a good tool given the limits of the technology available in the era they were created. And they are still useful for verifying that semantic and sentiment analyses are accurate. But they are far less accurate than the modern tools an organization would use instead if it were “born yesterday”.

The cost to the Democratic Party in 2020 was losing House seats it thought it would win, and spending millions of dollars on Senate races its candidates had little chance of winning at the expense of spending in markets where they were much closer to winning. The costs to an organization of mis-measured CX can be even greater: investment in products customers do not like but VOC scores suggest they do, for example.

Polling and surveying are forms of data collection. And this thought experiment should be applied to data collection beyond just CX in the form of text or transcript sentiment analysis. This can range from how non-profit organizations analyze their proposals for potential funding success to how HR departments analyze job postings for attracting the best candidates. In both instances, organizations may submit versions of the same thing they always have without having considered analyses with the potential to increase their effectiveness.

Artificial Intelligence provides organizations the ability to analyze larger amounts of data than ever, in ways previously thought impossible, and there are new startups nearly every day leveraging Artificial Intelligence for more and more niche opportunities. Additionally, as DeepMind demonstrated in their AlphaGo program, AI has the potential to be creative.

With enough data and computing power, AI and modern data analytics can provide previously impossible insights. But these insights require collecting and organizing data that represents as much of each transaction as possible. The possibilities end only where they encroach on personal privacy.

Humans are expert pattern recognizers. But we need the data to find the patterns in. In fact, humans are so good at seeing patterns that we see patterns where there are none. So more data for human analysis can just produce more, and more confusing and contradictory, analyses. Utilizing algorithms reduces the role of human judgement in the analyses, and the use of recurrent neural networks, or simply monitoring the success of the analyses, can help confirm their accuracy.

What polling, surveys, and sparse or lean data collection do is force the data analyst or business analyst to extrapolate greater trends from limited information. This is as opposed to modern methods of semantic analysis from complete transcripts or analyses of robust data, where the data analyst or AI algorithm can infer insights instead. The latter is much more accurate.

2016 and 2020 polling demonstrated how extrapolation fails to provide accurate predictions of reality. After the 2016 election, pollsters believed they had not corrected for education level. This led many pollsters to include questions on education level in their 2020 polls. But polls still suffered from an inability to reach certain demographic groups: Hispanic males, for example. So pollsters would correct for the small number of surveys received from certain demographics relative to how those demographics were likely to be represented among likely voters. But some demographics were defined too broadly. For example, Hispanics in Florida represent several different voting tendencies, but many pollsters just represented them with “Hispanic”.
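The correction described above is commonly done by weighting each response by how under- or over-represented its group is in the sample. A minimal sketch in Python, assuming made-up groups and numbers (the `weighted_share` helper, the group names, and the population shares are hypothetical and not taken from any real poll):

```python
def weighted_share(responses, population_share):
    """Estimate overall support after reweighting groups.

    responses: list of (group, supports) pairs, where supports is True/False.
    population_share: assumed share of each group among likely voters.
    """
    n = len(responses)

    # How many responses were actually collected from each group.
    counts = {}
    for group, _ in responses:
        counts[group] = counts.get(group, 0) + 1

    # Weight = share the group should have / share it has in the sample.
    weights = {g: population_share[g] / (counts[g] / n) for g in counts}

    total = sum(weights[g] for g, _ in responses)
    support = sum(weights[g] for g, supports in responses if supports)
    return support / total, weights


# Hypothetical sample: group A is easy to reach, group B is hard to reach.
sample = [("A", True)] * 6 + [("A", False)] * 2 + [("B", False)] * 2
assumed_population = {"A": 0.5, "B": 0.5}

raw = sum(supports for _, supports in sample) / len(sample)
adjusted, weights = weighted_share(sample, assumed_population)

print("raw share:", raw)
print("adjusted share:", round(adjusted, 3))
print("weights:", {g: round(w, 3) for g, w in weights.items()})
```

In this made-up sample, two respondents from group B end up standing in for half of the electorate, so each of their answers counts four times as much as an answer from group A. If group B actually contains several sub-groups with different tendencies, the weighting amplifies whichever ones happened to answer.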

The same can happen with an organization’s data analyses. With polling or lean data collection, organizations are extrapolating that the experience of a single person represents that of a large number of un-surveyed people. Sampling statistics are useful for quantifying the uncertainty of polling or surveying, producing a Margin of Error that organizations should use as a caution before making decisions based on the analyses.
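For a surveyed proportion, a Margin of Error is commonly approximated from the sample size alone. A minimal sketch in Python, assuming a simple random sample and a 95% confidence level (the `margin_of_error` helper and the sample sizes are illustrative, and real polls with weighting or non-response typically carry more uncertainty than this formula reports):

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a proportion.

    p: observed proportion (for example 0.5 for a 50% favorable score)
    n: number of completed surveys
    z: z-score for the confidence level (1.96 is the usual 95% value)
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical survey sizes for the same observed 50% result.
moe_large = margin_of_error(0.5, 1000)
moe_small = margin_of_error(0.5, 100)

print(round(moe_large, 3))
print(round(moe_small, 3))
```

With 1,000 responses the result is good to roughly plus or minus 3 percentage points; with 100 responses it is closer to plus or minus 10, which is easily enough to flip a decision based on a small difference in scores.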

But many don’t use Margin of Error. And when an organization can utilize analyses not requiring Margin of Error, it should.

When an organization uses demographics to extrapolate customer experience, for example, it is exposed to all sorts of legacy biases that can be expensive and morally fraught. When analyzing data using demographics, “cross-tabs” can be challenging. Race, gender, education level, location, age… any of these and more can be informative to look at, but can also be deceptive. Again, humans are expert pattern recognizers, even when there are no patterns to find.

The advantage the data analyst or AI has when using modern technology is the ability to infer from more complete data what the actual experience is rather than the problematic process of extrapolating from incomplete data.

Business analysts often have to caution their organization that certain metrics “do not mean what you think they mean”. While modern data analysis technologies do not erase this tendency, they lessen it, and migrating an organization from legacy data analysis to modern tools provides a great opportunity to re-evaluate those metrics, both for understanding and for usefulness.

Election polling is a method of its time, and provides a foundation for modern techniques more likely to provide accurate predictions. Arijit Sengupta, founder and CEO of Aible, provided a great summary of the opportunities for AI, in particular, to revolutionize this industry.

Beyond election prediction, organizations should also consider sentiment analysis and AI, modern data collection, and inferring analyses as opposed to extrapolation in order to better measure customer experience and better predict business outcomes. If we were “born yesterday”, few organizations would approach data analysis the way they do now.