Research – Fabricio Vasselai

Working Papers

Learning meaningful measures of democratization.
[ presented at APSA 2022, Polmeth 2023 ]

Instead of yet-another-measure of democracy, we put forward a wisdom-of-crowds solution that uses Supervised Machine Learning to flexibly generate interpretable measures of democratization, using existing typologies as labeled data. First, separate learners with monotonic constraints learn how each typology classified regimes, conditioning on chosen regime-year characteristics. Then, their ensemble outputs, for each regime-year of interest, an interpretable meta-measure: the average probability that said regime-year would be classified as democratic by scholarly work. Such measures are bounded, true ratio scales, but can be meaningfully dichotomized (e.g. at 0.5). They have powerful bias mitigation properties, being (under mild conditions) always more advantageous than picking any existing typology at random, which we prove formally and study via Monte Carlo simulations. Learned instead of aggregated, they can be generated out-of-sample, even for regimes never gauged by scholars before, which we illustrate by generating measures for German Empire entities, Swiss cantons since 1919, and US counties and House districts 1900-1934.
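The averaging step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the example probabilities are hypothetical, and the per-typology learners (which the abstract says are trained with monotonic constraints) are taken as already fitted, so only their predicted probabilities appear here.

```python
from statistics import mean

def meta_measure(probabilities):
    """Average the per-typology probabilities that a regime-year is democratic.

    `probabilities` holds one predicted probability per typology learner
    (hypothetical inputs; the learners themselves are not modeled here).
    """
    if not probabilities:
        raise ValueError("need at least one learner output")
    if any(not 0.0 <= p <= 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return mean(probabilities)

def dichotomize(score, threshold=0.5):
    """Turn the bounded meta-measure into a democratic / non-democratic label."""
    return score >= threshold

# Made-up outputs of three learners for a single regime-year.
learner_outputs = [0.875, 0.75, 0.25]
score = meta_measure(learner_outputs)
is_democratic = dichotomize(score)
```

Because the score is an average of probabilities, it stays in [0, 1] whatever the number of typologies, which is what makes a cut at 0.5 readable as "a majority-weighted scholarly verdict".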

(with Samuel Baltz). Political misinformation as a displacement in multidimensional preference spaces.

This paper uses a multi-agent computer simulation to investigate how misinformation might affect the formation of partisan preferences in a given electorate. Working with multidimensional issue spaces, we understand misinformation as a displacement vector that changes, for each elector, the perceived spatial location of the misrepresented party. Electors can receive misinformation about the misrepresented party either directly or from other electors through a network. Each elector has a different likelihood of being exposed to misinformation and a different level of resilience to it. Resilience modulates the extent to which an elector’s impression of the party is altered by misinformation, and it is incremented each time such an alteration happens. We find that around half the time, misinformation does not alter the partisan attachment of the electorate. In the other half, it is much more likely to harm than to help the misrepresented party, and we develop a geometric explanation for this phenomenon. We also show that the effect of misinformation depends strongly on both the number of partisan choices available and the concentration of partisan preferences. Finally, we find that the results are the same for almost all levels of average exposure and resilience to misinformation, except when average resilience is too high or exposure is too low; in these cases, misinformation generally does not alter preference formation. We offer detailed analysis, as well as formal derivations, for most of those results.
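A minimal sketch of the displacement idea, under assumptions that are not taken from the paper: electors prefer the nearest party in Euclidean distance, resilience lies in [0, 1] and scales the displacement linearly, and resilience grows by a fixed increment after each alteration. All names and numbers are hypothetical.

```python
import math

def perceived_position(true_position, displacement, resilience):
    """Shift a party's position by the displacement, damped by resilience in [0, 1]."""
    weight = 1.0 - resilience  # assumption: linear damping
    return tuple(x + weight * d for x, d in zip(true_position, displacement))

def preferred_party(elector, party_positions):
    """Return the name of the party closest to the elector (assumed proximity rule)."""
    return min(party_positions, key=lambda name: math.dist(elector, party_positions[name]))

def expose(elector, parties, target, displacement, resilience, increment=0.1):
    """Apply one exposure to misinformation about `target`.

    Returns the elector's preferred party afterwards and the updated resilience,
    which is incremented only if the perceived position actually changed.
    """
    perceived = dict(parties)
    perceived[target] = perceived_position(parties[target], displacement, resilience)
    altered = perceived[target] != parties[target]
    new_resilience = min(1.0, resilience + increment) if altered else resilience
    return preferred_party(elector, perceived), new_resilience

# Two-dimensional toy example: the elector initially prefers party "A".
elector = (0.0, 0.0)
parties = {"A": (1.0, 0.0), "B": (-2.0, 0.0)}
choice, resilience = expose(elector, parties, "A", (3.0, 0.0), resilience=0.0)
```

In this toy case the displacement pushes party "A" away from the elector, so the elector switches to "B"; a fully resilient elector (resilience 1.0) would keep the original preference.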

(with Patrick Wu and Walter Mebane). Real-time vote counting and the public trust in modern elections.
[ presented at MPSA 2023 ]

The modern wide publication of partial results during vote counting tempts citizens to see it as a race in which one candidate may “pass” the others, as if the final result weren’t already contained in the ballots. This illusion can create false hopes for many voters, undermining trust in the electoral process when those hopes are shattered by the final result. This has been observed in a few recent presidential elections (e.g. Mexico 2006, Brazil 2014, Honduras 2017, Bolivia 2019, United States 2020), and in each of those cases the defeated side raised suspicions of fraud. We use Twitter data from around the time of vote counting and apply Fuzzy Regression Discontinuity to estimate the differences in users’ evaluations of the canvassing process caused by the moment the winner passes the runner-up. To enable this, we use a modified Recurrent Neural Network to classify which tweets talk about aspects of the electoral process and to identify, through Twitter user bios, the candidate alignment of tweet authors.
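For readers unfamiliar with Fuzzy Regression Discontinuity, the core quantity is a ratio of two jumps at the cutoff: the jump in the outcome divided by the jump in the share of treated units. The sketch below uses a simple local-means (Wald) version of that ratio; it is not the estimator used in the paper, and the variable names and data are hypothetical. Here the running variable could be read as time relative to the moment the winner passes the runner-up.

```python
from statistics import mean

def fuzzy_rd_wald(running, treated, outcome, cutoff=0.0, bandwidth=1.0):
    """Local Wald estimate: jump in mean outcome divided by jump in treatment share.

    Only observations within `bandwidth` of `cutoff` are used, and plain means
    are compared on each side (a local-constant simplification).
    """
    below = [i for i, r in enumerate(running) if cutoff - bandwidth <= r < cutoff]
    above = [i for i, r in enumerate(running) if cutoff <= r <= cutoff + bandwidth]
    if not below or not above:
        raise ValueError("need observations on both sides of the cutoff")
    outcome_jump = mean([outcome[i] for i in above]) - mean([outcome[i] for i in below])
    treatment_jump = mean([treated[i] for i in above]) - mean([treated[i] for i in below])
    if treatment_jump == 0:
        raise ValueError("no discontinuity in treatment at the cutoff")
    return outcome_jump / treatment_jump

# Made-up data: treatment becomes more likely after the cutoff but is not deterministic.
running = [-0.8, -0.6, -0.4, -0.2, 0.2, 0.4, 0.6, 0.8]
treated = [0, 0, 1, 0, 1, 1, 0, 1]
outcome = [1 + 2 * d for d in treated]  # built so that the treatment effect is 2
estimate = fuzzy_rd_wald(running, treated, outcome)
```

Applied work would typically replace the plain means with local linear fits and a data-driven bandwidth; the ratio structure stays the same.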

Computational Game Theory to Study Empirical Elections.
[ presented at Polmeth 2021 ]

Here we propose a novel way of simulating election results that accounts for strategic behavior, and then illustrate how such a tool can aid the study of real-life empirical elections in three example cases. Our technique starts by translating canonical Calculus of Voting (CV) game-theoretical analytical models into computational multi-agent algorithms. To make this possible, we derive an iterative version of CV (under Plurality, SNTV and Runoff), while also extending it to include both strategic voting and strategic abstention in multi-candidate elections. Numerical simulations based on CV require explicit calculation of pivotal probabilities. To overcome that challenge (and assuming that voters ignore electorate size), we derive novel Poisson pivotal probabilities, generalizing Myerson, and derive efficient algorithms to calculate them. Next, we also derive Skellam approximations and propose non-parametric heuristics. As a benchmark to evaluate those, we also generalize Palfrey’s Multinomial probabilities and provide a novel efficient algorithm to list pivotal scenarios. Finally, we showcase how computational models of elections can be used to study actual elections. First, we show that simulated elections can be used to assess the validity of state-of-the-art election forensic statistical tools, by calling attention to false negatives and positives in election fraud detection. Second, using available surveys, we use our computational models to simulate counterfactual elections held under different systems, showing how computer simulations of elections can also be used to estimate a range of credible outcomes under electoral reform. Third, we use Brazilian election polls to show how simulated strategic behavior informed by poll data can be used to predict strategic behavior during an actual election.
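To give a sense of what a Poisson pivotal probability looks like, the sketch below covers only the textbook two-candidate case: if the two vote totals are independent Poisson counts, their difference follows a Skellam distribution, and a voter is pivotal when the vote makes or breaks a tie. The multi-candidate generalizations and the efficient algorithms mentioned in the abstract are not reproduced; the function names and the truncation rule are assumptions made for this illustration.

```python
import math

def poisson_log_pmf(k, mean):
    """Log of the Poisson probability of observing k events with the given mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def difference_probability(mean_a, mean_b, margin=0):
    """P(votes_a - votes_b == margin) for independent Poisson vote counts.

    This is the Skellam pmf, computed by summing the joint Poisson probabilities
    along the diagonal votes_a = votes_b + margin, truncated far in the tail.
    """
    largest = max(mean_a, mean_b)
    upper = int(largest + 12 * math.sqrt(largest) + 50)  # assumed truncation point
    total = 0.0
    for votes_b in range(max(0, -margin), upper + 1):
        votes_a = votes_b + margin
        total += math.exp(poisson_log_pmf(votes_a, mean_a) + poisson_log_pmf(votes_b, mean_b))
    return total

def pivotal_probability(mean_a, mean_b):
    """Chance that one extra vote for A makes or breaks a tie (A tied or one behind)."""
    return difference_probability(mean_a, mean_b, 0) + difference_probability(mean_a, mean_b, -1)

p_pivotal = pivotal_probability(2.0, 3.0)
```

Working in log space keeps the individual terms stable for larger expected vote counts, although a direct sum like this one becomes slow well before realistic electorate sizes, which is where approximations become useful.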

Supervised Learning for election forensics with Multi-Agent simulated training data.
[ presented at Polmeth 2020 ]

Work In Progress

Geolocation and Interpolation of Polling Places Using GIS Census Data.
[ presented at APSA 2021, Polmeth 2022 ]

This paper proposes two novel techniques, one to geolocate electoral polling places and the other to impute them with demographic data, using as input GIS Census data and Facebook’s High Definition Population Maps. First, a new Iterative Substring Matching algorithm is proposed to match names or addresses of polling places to already georeferenced Census data. Second, to spatially interpolate demographic census data to the polling places, a Deep Learning model for compositional outputs and with spatial inputs is proposed. Both approaches are illustrated and validated using Brazilian data, resulting in a highly accurate geolocation of all polling places used in Brazil between 2006 and 2020, imputed with electorates’ age, sex and income.
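The abstract does not spell out the Iterative Substring Matching algorithm, so the following is only one plausible reading of the name, not the paper's method: try progressively shorter word sequences from the polling place name until exactly one georeferenced record contains the sequence. The function names, the stopping rule, and the example names are all hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore case and spacing."""
    return " ".join(text.lower().split())

def iterative_substring_match(query, candidates):
    """Return the single candidate sharing the longest word sequence with `query`.

    Word sequences are tried from longest to shortest. The search stops at the
    first length with any hit: one hit is returned, several hits count as
    ambiguous and yield None (an assumed rule).
    """
    words = normalize(query).split()
    padded = {c: f" {normalize(c)} " for c in candidates}  # pad to match whole words
    for size in range(len(words), 0, -1):
        hits = set()
        for start in range(len(words) - size + 1):
            gram = " " + " ".join(words[start:start + size]) + " "
            hits.update(c for c, text in padded.items() if gram in text)
        if len(hits) == 1:
            return hits.pop()
        if len(hits) > 1:
            return None
    return None

# Made-up polling place name and made-up georeferenced census entries.
census_names = ["E. M. Rui Barbosa", "Escola Estadual Castro Alves"]
match = iterative_substring_match("Escola Municipal Rui Barbosa", census_names)
```

Stopping at the longest matching length favors specific name fragments over generic words such as "escola", which would otherwise match many records.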

High-resolution mapping of world’s electoral district boundaries.

The paper presents a novel dataset, with high-resolution GIS maps of the boundaries of the electoral districts (past and present) used in more than 1200 elections held in over 150 countries. To make this dataset possible, we leverage three key aspects of electoral district boundaries that are seldom emphasized. First, many electoral districts around the world are coterminous with mixtures of different levels of administrative or statistical divisions of a country (provinces, cities, neighborhoods, census blocks, etc). Second, when they are not, they nearly always at least share large parts of their boundaries with administrative or statistical divisions. Third, even in countries where electoral districts are geographically local and change often, relevant parts of their boundaries are more stable over time. Those properties dramatically diminish the amount of manual vectorization required to reconstruct their GIS boundaries. The paper also discusses and exemplifies some of the array of techniques we employed with those properties in mind. Nearly all countries that ever had any contested election (i.e. with 2+ competing parties) have at least their most recent elections included, but in most cases earlier elections are also present. This includes, for example, most contested elections held in Belgium, Canada, France, Germany, Italy, Portugal, Russia, Spain, the US and the UK; most elections held in Latin America since 1945; and the majority held since 2000 in Eastern Europe, Africa and Asia.
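The first property above (districts coterminous with sets of administrative units) means a district outline can be assembled from existing unit polygons with no manual tracing. A minimal sketch of that idea, which is not claimed to be one of the paper's techniques: keep only the edges that belong to exactly one unit, so that shared internal borders cancel out. It assumes shared borders are digitized with identical vertices; the function names and the toy geometry are hypothetical.

```python
from collections import Counter

def ring_edges(ring):
    """Yield the undirected edges of a closed ring given as a list of (x, y) vertices."""
    for i, start in enumerate(ring):
        end = ring[(i + 1) % len(ring)]
        yield frozenset((start, end))

def district_boundary(unit_rings):
    """Return the outer edges of a district made of adjacent administrative units.

    An edge shared by two units is internal to the district and is dropped;
    an edge seen exactly once lies on the district's outer boundary.
    """
    counts = Counter(edge for ring in unit_rings for edge in ring_edges(ring))
    return {edge for edge, n in counts.items() if n == 1}

# Two made-up unit squares sharing the vertical border x = 1.
left_unit = [(0, 0), (1, 0), (1, 1), (0, 1)]
right_unit = [(1, 0), (2, 0), (2, 1), (1, 1)]
outline = district_boundary([left_unit, right_unit])
```

Real boundary files rarely share vertices this cleanly, so a production workflow would more likely rely on a GIS library's polygon union with snapping tolerances.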