Well, it was another holiday weekend in the U.S., so here’s another Haskell project I picked up. This one wasn’t planned; it just started with some random conversations, which led to writing some code, and refining it… and I now have a reasonably realistic statistical model of voting behavior, as well as tools for visualizing and analyzing election systems and methodologies and their practical effects.
The basic idea is to model voters and candidates in an election as existing in a fairly high-dimensional vector space (100 dimensions) of political opinions and group affinities, and then define a probability distribution that describes which opinions and affiliations voters have in each simulation. The model chosen is a Mixture of Zipf Gaussians (MoZG), which models voters as belonging to a collection of overlapping subpopulations, each of which has its own size, mean political positions, different degrees of variation along different axes, etc., though with variance decaying globally in each additional dimension. The result looks relatively realistic based on observed experience.
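To make the shape of such a model concrete, here is a minimal sketch of one way to represent it. This is not the project's actual code: every name here (`Cluster`, `zipfScale`, `pickCluster`, `voterPosition`) is hypothetical, and the decay rule is an assumption, namely that the standard deviation on axis k is scaled by 1/k^s for some exponent s. To stay within `base`, the functions take pre-drawn random numbers as arguments rather than generating them.

```haskell
import Data.List (zipWith4)

-- All names below are hypothetical illustrations, not the project's API.

-- | One subpopulation (mixture component) of voters.
data Cluster = Cluster
  { clusterWeight :: Double    -- relative size of the subpopulation
  , clusterMean   :: [Double]  -- mean position on each axis
  , clusterSpread :: [Double]  -- per-axis standard deviation before global decay
  }

-- | Assumed Zipf-style decay: axis k (1-based) is scaled by 1 / k^s.
zipfScale :: Double -> Int -> Double
zipfScale s k = 1 / fromIntegral k ** s

-- | Choose a cluster in proportion to its weight, given a uniform draw
-- u in [0, 1) supplied by the caller.
pickCluster :: [Cluster] -> Double -> Cluster
pickCluster clusters u = go clusters (u * total)
  where
    total = sum (map clusterWeight clusters)
    go [c] _ = c
    go (c : cs) x
      | x < clusterWeight c = c
      | otherwise           = go cs (x - clusterWeight c)
    go [] _ = error "pickCluster: no clusters"

-- | Position of one voter, given a decay exponent, the voter's cluster,
-- and one pre-drawn standard normal value per axis.
voterPosition :: Double -> Cluster -> [Double] -> [Double]
voterPosition s c zs =
  zipWith4 (\k m sd z -> m + sd * zipfScale s k * z)
           [1 ..] (clusterMean c) (clusterSpread c) zs
```

Candidates could be drawn from the same distribution. A full simulation would supply the uniform and normal draws from a random number generator and use 100 axes rather than the two or three that are convenient for experimenting in GHCi.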
Using this model, we can answer some key questions about election systems through Monte Carlo simulation. Questions like: How often do different possible election methodologies (plurality, instant runoff, Smith/Condorcet, range voting, Borda count, STAR, etc.) agree with each other about who wins an election? When they disagree, how can we characterize the kinds of decisions that each makes? How often can we find a Condorcet winner, and is the lack of Condorcet winners merely a theoretical problem or does it come up in practice? What is the impact of tactical voting: how effective is it in various methodologies, and how does it affect the character of the results?
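As an illustration of the kind of comparison involved, the following sketch finds a Condorcet winner (a candidate who beats every other candidate head to head) and a plurality winner from a list of ranked ballots. It is a small standalone example rather than the project's implementation; the names (`Ballot`, `prefers`, `beats`, `condorcetWinner`, `pluralityWinner`, `exampleBallots`, `cycleBallots`) are made up for this post, and plurality ties are broken arbitrarily.

```haskell
import Data.List (elemIndex, group, maximumBy, sort)
import Data.Ord (comparing)

-- Hypothetical types for illustration only.
type Candidate = Char
type Ballot = [Candidate]  -- most preferred candidate first

-- | True if the ballot ranks x above y. A ranked candidate is treated
-- as preferred to an unranked one.
prefers :: Ballot -> Candidate -> Candidate -> Bool
prefers ballot x y = case (elemIndex x ballot, elemIndex y ballot) of
  (Just i, Just j)  -> i < j
  (Just _, Nothing) -> True
  _                 -> False

-- | x beats y if strictly more ballots rank x above y than y above x.
beats :: [Ballot] -> Candidate -> Candidate -> Bool
beats ballots x y = count (\b -> prefers b x y) > count (\b -> prefers b y x)
  where count p = length (filter p ballots)

-- | The candidate who beats every other candidate pairwise, if any.
condorcetWinner :: [Candidate] -> [Ballot] -> Maybe Candidate
condorcetWinner cands ballots =
  case [c | c <- cands, all (beats ballots c) (filter (/= c) cands)] of
    [w] -> Just w
    _   -> Nothing

-- | The candidate with the most first-place votes (ties broken arbitrarily).
pluralityWinner :: [Ballot] -> Maybe Candidate
pluralityWinner ballots
  | null firsts = Nothing
  | otherwise =
      case maximumBy (comparing length) (group (sort firsts)) of
        (c : _) -> Just c
        []      -> Nothing
  where firsts = [c | (c : _) <- ballots]

-- | Nine voters: A leads on first choices, but B wins every pairwise contest.
exampleBallots :: [Ballot]
exampleBallots = replicate 4 "ABC" ++ replicate 3 "BCA" ++ replicate 2 "CBA"

-- | A three-voter cycle with no Condorcet winner.
cycleBallots :: [Ballot]
cycleBallots = ["ABC", "BCA", "CAB"]
```

In GHCi, `pluralityWinner exampleBallots` gives `Just 'A'` while `condorcetWinner "ABC" exampleBallots` gives `Just 'B'`, so the two methods disagree on this electorate. `condorcetWinner "ABC" cycleBallots` gives `Nothing`, the situation whose practical frequency the simulation is meant to estimate.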
Interestingly, a Monte Carlo simulation of elections using a spatial MoZG model tuned to reflect realistic voting populations yields quite different results from well-known attempts to answer the same questions using less realistic and less nuanced models, such as the impartial culture (IC) and impartial anonymous culture (IAC) models. For example, while the common but unrealistic IC and IAC models predict the likelihood of a Condorcet winner to be approximately 0 or 1/e, respectively, the simulations here suggest that one almost always exists, failing to emerge only about 1% of the time.
I will likely write up the actual results of the investigation at some point, but for now I’m just sharing the code in case anyone finds it interesting.