What to do when survey data is not available or not trustworthy? An application of novel ecological inference techniques to the Carinthian Plebiscite of 1920

Ecological inference techniques, as advocated and codified by Gary King, provide a way forward when survey data is biased or unavailable and researchers need to infer individual behavior from contextual data. The historical 1920 Carinthian plebiscite provides an ideal-typical application: it was organized to decide whether an ethnically and linguistically heterogeneous part of South-East Carinthia was to be incorporated into Austria or into Yugoslavia. Eventually, a roughly 70:30 “Slovenian” electorate selected the Austrian option by about 60:40. An application of state-of-the-art ecological inference models demonstrates that “German” support for Austria has been overestimated and “Slovenian” support underestimated: “only” 75 percent of the German-speaking, but also more than 50 percent of the Slovenian-speaking electorate opted to join the new Austrian republic. Therefore, the significance of ethno-linguistic cleavages has been systematically overrated, and the historical electorate was much more unified than previously thought.

DOI: 10.34879/gesisblog.2020.28

The ecological fallacy has been a key problem in the social sciences and in applied statistics. It arises when observers rely naively on aggregate or group characteristics but attempt to draw causal or descriptive inferences at the individual level. Yet relying on ecological data may be the only option at hand when survey evidence is not available (as in historical cases), when field work cannot be done, or when survey responses are certain to be biased. In modern political methodology, “ecological inference” has become a common label for an array of techniques and perspectives that intend to provide a way forward and to enable researchers, in spite of aggregation bias and a number of related complications, to draw some meaningful inferences from aggregate data.

We demonstrate the utility of ecological inference techniques by an application to the Carinthian plebiscite of 1920: after the downfall of the Austro-Hungarian empire, this referendum was a legal and political instrument to solve complex territorial issues and to decide whether an ethnically and linguistically heterogeneous territory in South-East Carinthia was to become a part of the newly established German-Austrian Republic or to be incorporated into the newly founded Yugoslavian kingdom.

Figure 1: Tearing Up the Green or Ripping Up the White Ballot?
Notes: German and Slovenian slogans for “This is the way Carinthia wins”. Apparently, the Austrian side printed more than 600,000 copies to instruct fewer than 40,000 registered voters.

The tallies of the referendum are formally uncontested: a roughly 70:30 “Slovenian” electorate voted by about 60:40 for the German-Austrian option. Among 39,291 local citizens who were legally enfranchised, 22,025 selected Austria and 15,279 chose Yugoslavia (59.04 and 40.96 percent, respectively, of the votes cast for either option). If the last Austro-Hungarian census is taken as a yardstick for the presence of linguistic groups, the choice for Austria was only possible because of the support of a sizable share of linguistic “Slovenes”. The nationalist literature on both sides quickly settled on a consensus: assuming that each and every linguistic “German” in the referendum zone went for Austria, about 10,300 “Slovenian” defectors were required to reach the overall vote count.

Modern ecological inference approaches have been prominently advocated and unified by Gary King [1]. The historical Carinthian plebiscite is almost ideally suited to demonstrate their empirical merits and to provide a valid assessment of vote choice across and within both linguistic groups: 2×2 tables (covering two linguistic groups and two options in the referendum) may provide narrow logical bounds for the individual cells, and the many lopsided communities that are dominated by either the “German” or the “Slovenian” linguistic group ensure that the data carry meaningful statistical information:

  1. The “method of bounds”, as first suggested by Duncan and Davis [2], exploits deterministic information contained within each of the 51 electoral communities. In one community, for instance, the 1910 census reports that 89.8 percent of the local population allegedly spoke German and only 10.2 percent spoke Slovenian in day-to-day conversation. However, only 72.5 percent voted for Austria and the remaining 27.5 percent opted for Yugoslavia. A simple comparison of these figures reveals that, in the absence of major measurement error or bias, a sizable share of German-speaking citizens must have chosen Yugoslavia: even if every Slovenian speaker had voted for Yugoslavia, this would account for only 10.2 of the 27.5 percentage points. Formalizations of these ideas allow us to infer meaningful bounds for the German majority and to determine that “only” between 69.4 and 80.8 percent of this group actually voted for Austria. But we are able to say much less about the local Slovenian minority: logically, anywhere from zero to one hundred percent of this group may have chosen Austria. More generally, logical bounds become more meaningful and restrictive whenever vote choice and linguistic group size approach the extremes of their scales. Since we have many “lopsided” communities concerning both variables, the Carinthian plebiscite is ideally suited for these analyses.
  2. Ecological regression models contribute complementary perspectives by estimating statistical models across the 51 communities at hand. In the most prominent formalization, Leo Goodman [3] relied upon the re-parametrization of a simple linear regression model. Yet, these models assume that coefficients and predictors are uncorrelated, i.e., that the behavior of “German” and of “Slovenian” voters does not vary systematically with the linguistic composition of their district, and this is regularly not the case.
  3. To better address heterogeneity issues, we follow Imai, Lu, and Strauss (2007). The authors restate the ecological inference problem as a coarsened data issue and contribute non-parametric Bayesian models which are better suited to deal with omitted and/or unmeasured contextual variables. Given that we lack meaningful data and measurement alternatives for many likely influences on the historical Carinthian electorate, such as economic considerations or cueing by political parties, these perspectives are very appropriate and fruitful for understanding the Carinthian plebiscite.
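The logic of the method of bounds from the first item can be sketched in a few lines of Python. This is a minimal illustration, not the replication code of the original paper: the function name `duncan_davis_bounds` is hypothetical, and the sketch assumes that the census shares carry over one-to-one to the voters of the community (no differential turnout, no measurement error). With the rounded percentages quoted above, it yields bounds of roughly 69 to 81 percent for German speakers and the uninformative range from 0 to 100 percent for Slovenian speakers; small deviations from the figures in the text are to be expected because the inputs are rounded.

```python
def duncan_davis_bounds(group_share, vote_share):
    """Logical bounds on the fraction of one group that chose an option.

    group_share: share of the community belonging to the group (0 < share <= 1)
    vote_share:  share of the community that chose the option (0 <= share <= 1)
    Assumption: group shares among voters equal the census shares.
    """
    if not 0.0 < group_share <= 1.0:
        raise ValueError("group_share must lie in (0, 1]")
    if not 0.0 <= vote_share <= 1.0:
        raise ValueError("vote_share must lie in [0, 1]")
    other_share = 1.0 - group_share
    # Lower bound: the other group supports the option as much as it possibly can.
    lower = max(0.0, (vote_share - other_share) / group_share)
    # Upper bound: all support for the option comes from this group.
    upper = min(1.0, vote_share / group_share)
    return lower, upper


# Figures quoted in the text for one community (rounded percentages).
german_share = 0.898    # German speakers according to the 1910 census
austria_share = 0.725   # votes for Austria

german_lo, german_hi = duncan_davis_bounds(german_share, austria_share)
slovene_lo, slovene_hi = duncan_davis_bounds(1.0 - german_share, austria_share)

print(f"German speakers for Austria:    {german_lo:.1%} to {german_hi:.1%}")
print(f"Slovenian speakers for Austria: {slovene_lo:.1%} to {slovene_hi:.1%}")
```

The bounds are narrow for the group that dominates a community and wide for the small minority, which is why lopsided communities are so informative.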
Table 1: Ecological Inference – Deducing the Inner Cells
Notes: Ecological inference models use marginal information, here the number/share of voters that belong to the “German” or “Slovenian” group (in columns) and the number/share of voters that support the Austrian or Yugoslavian option (in rows), to infer the grey-shaded inner cells. Bayesian confidence regions have been omitted for better readability.
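Goodman's ecological regression, the second approach in the list above, can be illustrated in the same spirit. The sketch below regresses the Austrian vote share T_i of each community on its German-speaking share X_i, using T_i = b_S + (b_G - b_S) * X_i, so that the intercept estimates Slovenian support and intercept plus slope estimates German support. The function name `goodman_regression` and the five data points are purely hypothetical: the data are synthetic and noise-free, generated from arbitrary group rates of 80 and 40 percent, and are not the plebiscite data.

```python
def goodman_regression(german_share, austria_share):
    """Ordinary least squares of the Austria share on the German-speaking share.

    Returns (beta_german, beta_slovenian): the estimated fractions of each
    group voting for Austria, assumed to be constant across communities.
    """
    n = len(german_share)
    if n != len(austria_share) or n < 2:
        raise ValueError("need two equally long sequences with at least two communities")
    mean_x = sum(german_share) / n
    mean_t = sum(austria_share) / n
    var_x = sum((x - mean_x) ** 2 for x in german_share)
    if var_x == 0.0:
        raise ValueError("the German-speaking share must vary across communities")
    cov_xt = sum((x - mean_x) * (t - mean_t)
                 for x, t in zip(german_share, austria_share))
    slope = cov_xt / var_x
    intercept = mean_t - slope * mean_x
    # Intercept: support where X = 0 (only Slovenian speakers);
    # intercept + slope: support where X = 1 (only German speakers).
    return intercept + slope, intercept


# Synthetic illustration only (hypothetical values, NOT the plebiscite data).
x = [0.05, 0.10, 0.30, 0.60, 0.90]
t = [0.40 + 0.40 * xi for xi in x]

beta_german, beta_slovenian = goodman_regression(x, t)
print(f"Estimated German support:    {beta_german:.2f}")
print(f"Estimated Slovenian support: {beta_slovenian:.2f}")
```

With real data the constancy assumption rarely holds, and nothing in the regression keeps the estimates inside the interval from 0 to 1, which is one reason to combine it with the deterministic bounds.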

In sum, our findings also imply that the referendum was not the focal point of a bitter struggle among monolithic, nationalist electorates, but was instead defined by more complex, more careful, and more pragmatic considerations of civil liberties, political rights, and economic well-being. Given that roughly 75 percent of linguistic Germans and more than 50 percent of linguistic Slovenians selected Austria, electoral choice was affected by ethno-linguistic considerations to some extent. However, the ethnic cleavage turns out to have been much less divisive, and the Carinthian electorate much more united, than previously suggested. In fact, many other motives may have tilted vote choice towards the Austrian option: a search for perceived continuity, considerations of economic well-being and life chances, an appreciation of the historical unity of Carinthia, the cueing effects of the Austrian social democrats, and the mere contrast of an initially socialist republic with a strong social safety net vis-à-vis a militarized kingdom that was still waging war.

Original Paper:
Guido Tiemann. 2020. “Kärnten”=Austria, “Koroška”=Yugoslavia? A Novel Perspective on the 1920 Carinthian Plebiscite. Historical Social Research 45 (4): 309-346.
DOI: https://doi.org/10.12759/hsr.45.2020.4.309-346

Original Data:
Guido Tiemann. 2020. Replication Files: “Kärnten”=Austria, “Koroška”=Yugoslavia? A Novel Perspective on the 1920 Carinthian Plebiscite.
DOI: https://doi.org/10.7802/2109


  1. King, Gary. 1997. A Solution to the Ecological Inference Problem. Reconstructing Individual Behavior from Aggregate Data. Princeton: Princeton University Press.
  2. Duncan, Otis D. and Beverley Davis. 1953. An Alternative to Ecological Correlation. American Sociological Review 18(6): 665–666.
  3. Goodman, Leo. 1953. Ecological Regressions and the Behavior of Individuals. American Sociological Review 18(6): 663–664.

