For about six months I looked at that NYT graphic: a smiling Hillary Clinton with a roughly 80-to-90 percent “probability” of winning the election, and a scowling Donald Trump with the corresponding 20-to-10 percent.
I didn’t drag you in here to grouse about the difference between probabilities and percentages (the former is, strictly speaking, between 0 and 1, but I’ll hew to convention and mix terminology). However, I do think there is a valid question about whether a probability/percentage is appropriate in this context. Readers apparently found this confusing, mentally projecting the probability out to 0 or 1. I agree with those who found the representation puzzling – I’d even argue it isn’t quite right in the first place.
In my mind, there are two problems. First, the precision of the reported probabilities exceeded the model’s precision – those probabilities very definitely had uncertainty, and from what I’ve read, when all things are considered the errors in probability were 20 or even 25 percent.
Second, probabilities normally refer to situations in which a measurement is repeated many times on a system, or there are many equivalent systems and a measurement is taken on each. When we think of the probability that an unbiased coin will turn up heads, or that we’ll draw a straight flush, we know we’re thinking about how many times we’d see heads, or a flush, after many attempts. The electoral models do have an internal construct where this makes sense, but real elections are one-chance-only, and you win or you don’t. The outcome is binary, and ideally the prediction should be too. With no second chances, a chance of winning is cognitively deceptive – I believe many people projected this probability onto the binary outcome we all understand. I certainly found myself doing that – and I’m supposed to know better.
Of course the model has uncertainties and they should be reported – and it’s laudable that people were making this effort. I just don’t think a probability is quite the way to do it. One thing is almost a matter of language: trade in that internal “probability” for a reported “confidence.” And, make the confidence qualitative to reflect the fact it isn’t certain – five levels would be fine. So for a quick read, something like: Hillary Clinton is predicted to defeat Donald Trump (confidence: high = 4/5).
In addition, people do love a chart and easy-to-see details. So, why not show the projected electoral counts for each candidate with a color-coded error bar, with the most intense colors corresponding to the highest internal probabilities?
5/5-level confidence means the bars don’t overlap (and internally might be 95% confidence). As the darker areas start to overlap, the reported confidence drops, corresponding to a lower internal probability. All that nitty-gritty can be reported in some tasty fine print or a link.
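To make the idea concrete, here is a minimal sketch of how an internal probability might be translated into the five reported levels. Only the 95% figure for level 5 comes from the suggestion above; the other cut-offs, the level labels, and the function names `reported_confidence` and `headline` are assumptions chosen purely for illustration, not anything a real forecasting model uses.

```python
# Hypothetical cut-offs: only the 0.95 threshold for level 5 is suggested in
# the text; the remaining thresholds and labels are illustrative assumptions.
LEVELS = [
    (0.95, 5, "very high"),
    (0.85, 4, "high"),
    (0.70, 3, "moderate"),
    (0.60, 2, "low"),
    (0.50, 1, "very low"),
]


def reported_confidence(p_favored):
    """Map the favored candidate's internal probability to (level, label)."""
    if not 0.5 <= p_favored <= 1.0:
        raise ValueError("the favored candidate's probability must be in [0.5, 1]")
    for cutoff, level, label in LEVELS:
        if p_favored >= cutoff:
            return level, label


def headline(favored, other, p_favored):
    """Build the quick-read line, hiding the internal probability."""
    level, label = reported_confidence(p_favored)
    return f"{favored} is predicted to defeat {other} (confidence: {label} = {level}/5)."


print(headline("Hillary Clinton", "Donald Trump", 0.87))
```

The internal probability never reaches the reader in this scheme; it only selects the level, and the exact cut-offs would go in the fine print.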
In the end, this is not too dissimilar from what we saw every day this year, but now with real measurables that align with the single-election “experiment,” rather than an abstract internal probability that’s easily misinterpreted, both in concept and in precision.