On cumulative vote charts, and why they don't suggest fraud, esp. in primaries

realisticidealist
Atlas Icon

Posts: 14,764

Political Matrix
E: 0.39, S: 2.61

« on: August 01, 2016, 12:50:28 AM »

Groups like Election Justice USA have been in the news lately, alleging that Hillary Clinton benefited in the primaries from "election irregularities" which may have included vote fraud against Bernie Sanders. Their primary evidence takes the form of cumulative precinct vote charts ordered from low to high turnout, which they allege provide evidence of violations of the Central Limit Theorem.

CLTs (and, strictly speaking, the closely related law of large numbers) generally state that as the number of draws from a distribution increases, the observed mean should converge to the true mean. If the cumulative vote charts show "trends" as the vote total increases rather than convergence, EJUSA alleges, this shows that human intervention in the vote counting is occurring. Other groups have made similar allegations regarding previous elections.

However, this ignores the underlying assumptions of CLTs; namely, they assume draws are independent of each other, draws are with replacement, and that the distribution being drawn from is not a function of the draws. As it turns out, there is a very simple reason to think that precinct turnout size in a primary election is not an independent basis for draws: assuming precincts in a state have roughly equal populations to begin with, precincts which tend to vote more for a party in general elections will have larger electorates to draw from in primary elections. As such, these precincts may reflect the base or establishment of a state's party more so than precincts dominated by the opposing party. This is further amplified in the presence of racial polarization; a heavily black precinct in the South, for example, would likely have higher turnout for the Dems than a heavily white precinct and be more likely to vote for Hillary Clinton. Similarly, a heavily suburban precinct around a Midwestern city would likely have more GOP primary voters than one in the urban core and be more likely to support an establishment candidate.
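To make the argument concrete, here is a minimal simulation sketch, not anything EJUSA or I actually ran. Every number in it is an assumption chosen for illustration: equal-population precincts of 1,000 residents, a hypothetical `party_lean` per precinct, primary turnout proportional to that lean, and a candidate "A" whose support rises with the lean. No votes are altered anywhere, yet the cumulative share for A still drifts upward when precincts are ordered from low to high turnout.

```python
import random


def simulate_precincts(n_precincts=2000, seed=0):
    """Return a list of (votes_cast, votes_for_a) per precinct, with no fraud."""
    rng = random.Random(seed)
    precincts = []
    for _ in range(n_precincts):
        # Assumption: share of the precinct that leans toward this party.
        party_lean = rng.uniform(0.1, 0.9)
        # Assumption: 1,000 residents per precinct, 30% of leaners vote.
        votes_cast = int(1000 * 0.3 * party_lean)
        # Assumption: candidate A does better where the party is stronger.
        p_a = 0.35 + 0.30 * party_lean
        votes_a = sum(rng.random() < p_a for _ in range(votes_cast))
        precincts.append((votes_cast, votes_a))
    return precincts


def cumulative_share(precincts):
    """Cumulative share for A, precincts ordered from low to high votes cast."""
    total = total_a = 0
    shares = []
    for votes_cast, votes_a in sorted(precincts, key=lambda p: p[0]):
        total += votes_cast
        total_a += votes_a
        shares.append(total_a / total)
    return shares


shares = cumulative_share(simulate_precincts())
quarter = shares[len(shares) // 4]
print(f"share after smallest 25% of precincts: {quarter:.3f}")
print(f"final share: {shares[-1]:.3f}")
```

The curve keeps climbing instead of flattening simply because the ordering variable (turnout) is correlated with candidate support, which is exactly the violated independence assumption described above.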

Let's look at some examples. First, the 2016 Ohio presidential primary. Here's the Democratic cumulative vote share chart by precinct:

As you can see, as Dem precinct turnout increases, Clinton's share of the vote increases. Now, let's look at Dem precinct turnout vs. Dem turnout as a percentage of total primary turnout:

Indeed, it appears that precincts with a higher Dem primary turnout are more heavily Democratic (vs. Republican) precincts in terms of overall turnout.

On the GOP side:

We see the same trend, this time toward the establishment candidate Kasich and away from Trump, with Cruz relatively constant. As for turnout composition:

It would again seem that precincts with larger turnout in the GOP primary are fundamentally more Republican precincts.

Here's the same with Alabama (chosen at random), Dems:

GOP:

The effect is weaker here, but still exists; Trump declines in the higher turnout precincts while Rubio (and Kasich and Cruz to lesser extents) gains.

A few cases that are particularly illustrative:

1) Wisconsin strongly trends for de facto establishment candidate Ted Cruz

2) In particularly anti-Clinton and/or caucus states, this effect seems to "benefit" Bernie over Clinton, such as in Utah:

3) New York* trended for both of its "favorite children"

In summary, it seems quite plausible, at least in primary elections, that trends in cumulative vote charts can be explained by natural factors outside of fraud or tampering.

*Note: The NY data excludes Nassau, Dutchess, and Chemung counties, but I have little reason to suspect this will change the outcome or conclusion.

jimrtex
Atlas Icon

Posts: 11,828

Re: On cumulative vote charts, and why they don't suggest fraud, esp. in primaries

« Reply #1 on: August 02, 2016, 07:45:50 PM »

Quote from: realisticidealist on August 01, 2016, 12:50:28 AM

However, this ignores the underlying assumptions of CLTs; namely, they assume draws are independent of each other, draws are with replacement, and that the distribution being drawn from is not a function of the draws. As it turns out, there is a very simple reason to think that precinct turnout size in a primary election is not an independent basis for draws: assuming precincts in a state have roughly equal populations to begin with, precincts which tend to vote more for a party in general elections will have larger electorates to draw from in primary elections. As such, these precincts may reflect the base or establishment of a state's party more so than precincts dominated by the opposing party. This is further amplified in the presence of racial polarization; a heavily black precinct in the South, for example, would likely have higher turnout for the Dems than a heavily white precinct and be more likely to vote for Hillary Clinton. Similarly, a heavily suburban precinct around a Midwestern city would likely have more GOP primary voters than one in the urban core and be more likely to support an establishment candidate.

It may not be a good assumption that precincts in a state have roughly equal populations.

In Kansas it definitely is not true. Kansas does not have any size restrictions on precincts. It does have requirements for respecting political boundaries.

Kansas has townships, and in most counties these are the public land survey townships. The 36 sections of a township in Western Kansas may have only a couple dozen farmers, and they are probably 80% Republican.

If there is a city of even a couple thousand, it may dominate the county population. The precincts in the "city" will be larger than those in the rest of the county, and a bit less Republican (more waitresses, perhaps school teachers, etc.). So this suggests a trend towards larger precincts being more Democratic.

But in Kansas, elections are administered at the county level, and counties apparently have adopted different styles. Shawnee (Topeka) and Wyandotte (KCK) in particular have small precincts, and happen to be two of the three counties that would be identified as Democratic. The other is Douglas (Lawrence). Since Democrats are more likely to play the victim card, it might not be worth the pain to rationalize precinct sizes. And in cities that were laid out before almost universal private automobile ownership, there may be suitable facilities for polling places in a small area (say a 1/4 section).

Newer suburban areas were developed after everyone had automobiles. There might only be one suitable location within a section, and the voters will appreciate plentiful parking and may even approve of any cost savings.

If you look at Overland Park (which is the 2nd most populous city in the state), precinct size correlates with latitude. The cities in Johnson County are long and skinny, as they have annexed either south or west to avoid being blocked in by other cities. The precinct sizes also increase in area. And though an area with precincts a section in size might have lower density, it will not be 1/4 of the density of a precinct 1/4 section in size.

The other factor in Kansas is that Black precincts are extremely unusual, and extremely Democratic. This results in the median being more Republican than the mean. So on the cumulative curve you see the Republican percentage creeping upward, and then there will be a black precinct which will really yank the curve down, but this really isn't visible on the cumulative curves. You almost have to be looking at a spreadsheet stepping through the precincts one by one.

What happens in Kansas is that these extremely Democratic precincts become more rare after all of Shawnee and Wyandotte counties are in, and Douglas and the Democratic areas of Sedgwick County can't balance the remainder of the state, and this is true even if the most suburban precincts are slightly less Republican than the state as a whole.

So what you are seeing is more of a distribution problem than a trend. In Sedgwick County, Kansas, when you look at a scatter diagram of partisanship vs. precinct size, you will see two clouds with a gap between them. The precincts are polarized, so the mean is even less representative of the data than you might expect. The mean falls between the two maxima. It would be like measuring the height of adult humans, and not recognizing that adult males are on average several inches taller.
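A toy illustration of that point, with made-up numbers rather than actual Sedgwick County data: six hypothetical Republican-leaning precincts and three hypothetical heavily Democratic ones. The unweighted mean of the Republican shares lands in the gap where no precinct actually sits, and the median comes out more Republican than the mean.

```python
from statistics import mean, median

# Hypothetical Republican vote shares by precinct (illustrative only).
republican_cluster = [0.62, 0.65, 0.68, 0.70, 0.72, 0.75]
democratic_cluster = [0.08, 0.10, 0.12]

shares = republican_cluster + democratic_cluster
overall_mean = mean(shares)
overall_median = median(shares)

# True when the mean falls between the two clouds, matching no precinct.
in_gap = max(democratic_cluster) < overall_mean < min(republican_cluster)

print(f"mean:   {overall_mean:.3f}")
print(f"median: {overall_median:.3f}")
print(f"mean falls in the gap between clusters: {in_gap}")
```

This is why a few extreme precincts entering the cumulative total can move the curve without any single "typical" precinct looking unusual.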

Even if the number of registered voters is the same (some states, such as Ohio, have strict limits), this might not result in the same number of voters. People move, and they will still be registered at their old address. Registered voters cannot be purged without documentation. An "inactive voter" is not someone who doesn't vote. It is someone who had mail returned to the registrar. If they remain inactive for the next two general elections they can be removed. If they show up, they can vote.

People who don't own their own home, are younger, less married, and have greater job instability move more often. They also vote less often. Residential stability is one indicator of being established, or being one of the establishment.

The cumulative vote share curves are different depending on whether you use registered voters or votes cast for the ordering. In the Choquette (sp) analysis, which was based on the Republican primaries, he recommended using the number of registered voters for ranking precincts. Using votes cast was a backstop or shortcut, but has apparently become the norm.

As you note, in a primary the number of votes cast in a particular party's primary has a strong demographic relationship.

Quote from: realisticidealist on August 01, 2016, 12:50:28 AM

Let's look at some examples. First, the 2016 Ohio presidential primary. Here's the Democratic cumulative vote share chart by precinct:

As you can see, as Dem precinct turnout increases, Clinton's share of the vote increases. Now, let's look at Dem precinct turnout vs. Dem turnout as a percentage of total primary turnout:

Indeed, it appears that precincts with a higher Dem primary turnout are more heavily Democratic (vs. Republican) precincts in terms of overall turnout.

On the GOP side:

We see the same trend, this time toward the establishment candidate Kasich and away from Trump, with Cruz relatively constant. As for turnout composition:

It would again seem that precincts with larger turnout in the GOP primary are fundamentally more Republican precincts.

I would use cumulative votes cast, rather than precinct rank as the abscissa. Using rank squeezes the higher turnout precincts at the right end causing the cumulative curves to curve outward. They could in fact be converging asymptotically.
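A rough sketch of the difference between the two choices of abscissa, assuming each precinct is just a hypothetical `(votes_cast, votes_for_candidate)` pair. The function returns both x-coordinates for every point so the same curve can be plotted either way.

```python
def cumulative_curve(precincts):
    """Return (rank_fraction, votes_fraction, cumulative_share) per precinct.

    precincts: list of (votes_cast, votes_for_candidate) tuples,
    ordered here from smallest to largest votes cast.
    """
    ordered = sorted(precincts, key=lambda p: p[0])
    grand_total = sum(votes_cast for votes_cast, _ in ordered)
    running_total = running_cand = 0
    points = []
    for rank, (votes_cast, votes_cand) in enumerate(ordered, start=1):
        running_total += votes_cast
        running_cand += votes_cand
        points.append((
            rank / len(ordered),           # x if plotting by precinct rank
            running_total / grand_total,   # x if plotting by cumulative votes cast
            running_cand / running_total,  # y: cumulative vote share
        ))
    return points


# Made-up data: three small precincts and one large one.
example = [(100, 40), (200, 90), (300, 150), (1400, 840)]
for rank_x, votes_x, share in cumulative_curve(example):
    print(f"rank x={rank_x:.2f}  votes x={votes_x:.2f}  share={share:.3f}")
```

In this made-up example the three smallest precincts take up 75% of the rank axis but only 30% of the votes-cast axis, so the large precinct's effect is squeezed into the last quarter of a rank-based chart.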

You can resample the curve (say to percentile intervals) and then calculate the vote shares over 5 percent or 10 percent intervals. This gives you the smoothing benefit of averaging 100s of precincts, while avoiding the dampening effect of cumulative vote share.
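One simple way to do that, sketched here under the assumption that each precinct is a hypothetical `(votes_cast, votes_for_candidate)` pair: slice the cumulative-votes axis into equal intervals and compute the share within each slice. As a simplification, a whole precinct is assigned to the slice containing its midpoint rather than being split across slice boundaries.

```python
def interval_shares(precincts, n_bins=10):
    """Vote share within each slice of cumulative votes cast (not cumulative).

    Returns one share per slice, or None for a slice that received no precincts.
    """
    ordered = sorted(precincts, key=lambda p: p[0])
    grand_total = sum(votes_cast for votes_cast, _ in ordered)
    bin_votes = [0] * n_bins
    bin_cand = [0] * n_bins
    running_total = 0
    for votes_cast, votes_cand in ordered:
        # Position of this precinct's midpoint on the cumulative-votes axis.
        midpoint = running_total + votes_cast / 2
        index = min(int(n_bins * midpoint / grand_total), n_bins - 1)
        bin_votes[index] += votes_cast
        bin_cand[index] += votes_cand
        running_total += votes_cast
    return [c / v if v else None for c, v in zip(bin_cand, bin_votes)]


# Made-up data: four equal-sized precincts, two slices.
example = [(100, 40), (100, 50), (100, 60), (100, 70)]
print(interval_shares(example, n_bins=2))
```

With real data and 10 or 20 slices, each slice averages hundreds of precincts, and a late change in support shows up at full size instead of being diluted by everything counted before it.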

For the scatter plots, I would use either the total votes cast or Democratic votes cast for the abscissa. If you have the precinct identification, it would be useful to analyze where the precincts with a large number/percentage of Democratic votes are. If they are mostly from Cleveland, with others from Cincinnati, Columbus, Toledo, and Akron, you have your proof of a demographic relationship.

Note, the phenomenon of Cruz being relatively constant was also true in Oklahoma, where Rubio gained more support in establishment areas, and Trump was stronger in non-establishment areas.

I looked through the PDF from Elections Justice USA.

A couple of amusing tidbits.

They were in an argument with Nate Cohn over the effect of black voters in Louisiana. They clearly did not realize that Louisiana publishes registration for each precinct both by party and race. In Washington Parish, there is a huge difference in the size of precincts that are mostly white (small) and those that are mostly black (large). I suspect the smaller precincts are in more rural areas (exurban New Orleans), and the larger in Bogalusa and along the Pearl River.

In Louisiana, I suspect that many white registered Democrats did not vote in the presidential primary. Party registration does not mean anything for any other election, except for candidates, so there is no reason to bother to switch registration.

In Wisconsin they compared the hand-counted vote to the machine-counted vote. For Democrats the cumulative hand-counted total was about 20,000 and for Republicans about 40,000. For machine-counted it was around 1,000,000. "I'll take cheesy statistics for $1000." The question: "Doh, what are small rural counties in northern Wisconsin?"

For some reason they also did Columbia County, New York. The 2nd Ward of Hudson was 80% for Trump.

crl389
Newbie

Posts: 1

Re: On cumulative vote charts, and why they don't suggest fraud, esp. in primaries

« Reply #2 on: September 27, 2016, 03:29:54 AM »

Thanks so much for the insightful analysis. I was wondering if you had the data/could plot the Dem precinct turnout vs. Dem turnout as a percentage of total primary turnout for Utah, given that it trended toward Bernie. I figured that would be interesting and illustrative.

Also, I like playing the Devil's advocate since it helps one understand how things work by dissecting them bit by bit, and I'd like to hear your response to this: Say some subset of elections are stolen via vote manipulation (perhaps with the electronic voting machines as is alleged), and they increase the vote switch as a function of the size of the precinct, couldn't that explain a part of the trend you see regarding %Dem turnout? Also, wouldn't it make sense for them to take advantage of any naturally occurring trend instead of just a constant 10% across the board?

I'm guessing the only way to find out would be to look at the ballots themselves, and even then...

Also, what do you make of the exit polling being quite different from the results in the Democratic primary? I know that the margin of error is quite large, and people argue that they're doing the polls for discussion purposes and not as an election-accuracy control measure, but if they're doing them at all validly, then even with a large margin of error, the probability is still quite low that they'd 1) be as far off as they were in many cases, and 2) always be biased against Clinton, i.e. Clinton did better in official results than in exit polls (with, I believe, one exception). Where am I going wrong in my thinking?

I really appreciate this post as it explains a lot, and I am genuinely curious about your response to these questions since you seem to know what you're doing. No trolling whatsoever.

jimrtex
Atlas Icon

Posts: 11,828

Re: On cumulative vote charts, and why they don't suggest fraud, esp. in primaries

« Reply #3 on: September 28, 2016, 03:01:27 AM »
« Edited: September 29, 2016, 11:54:36 AM by jimrtex »

Quote from: crl389 on September 27, 2016, 03:29:54 AM

On the Utah Lt. Governor's website (Utah does not have a Secretary of State) there is a way to request precinct-level data. They want $35 for it.

This is based on the Ohio results. In this case precincts were ordered by number of registered voters. It is quite flat on the Democratic side.

But on the Republican side, there is a definite trend, with Kasich favored in more populous precincts.

Ohio has strict limits on precinct size (maximum size of 1400 registered voters). Going beyond that requires a waiver from the SOS. For the primary, only 2.4% of precincts exceeded 1400, with 1.3% between 1400 and 1500. There may be timing issues when a precinct may be split, or perhaps waivers were sought for a small excess. In a suburban area, it might be more difficult to obtain polling places, so that a division would simply mean two precincts sharing the same polling place.

In an area that is declining in population, it is not necessarily trivial to reconfigure precincts to increase their size. Unless pairs of precincts have declined substantially in population, they cannot be merged. An area with three precincts of 850 (near the statewide average) could be configured into two precincts of 1275 each, but would require a fairly even split of one of the precincts. Boards of elections in Ohio are deliberately bipartisan, so it may be difficult to get support for changes that will be seen as disenfranchising voters of a board member's party. Areas that are growing may be allowed to grow until they reach 1400. Since growth generally requires building of new housing, it may be easier to split precincts, since it may actually make voting more convenient.

The average precinct in Cuyahoga County has 870 voters, but the average Cleveland precinct has 772 voters compared to 929 within the county outside Cleveland. This is a 20% difference.

In some rural counties most precincts may consist of an entire township, and may be somewhat smaller. This is not entirely consistent. Among counties with an average precinct size over 1000 are rural counties Holmes, Putnam, Shelby, and Williams; suburban Fairfield (E Columbus), Lake (E Cleveland), Licking (E Columbus), Lorain (W Cleveland), Medina (S Cleveland), and Wood (S Toledo); small-city dominated Clark (Springfield), Richland (Mansfield), and Wayne (Wooster); and large-city plus suburbs Franklin (Columbus), Hamilton (Cincinnati), and Montgomery (Dayton). Large areas of Columbus can be considered suburban.

So registration totals generally conform to a rural-city-suburban pattern, with Kasich piling up votes in suburban areas. On the Democratic side, votes cast may include smaller inner city precincts that are extremely Democratic, black, and favorable to Clinton, and larger precincts that had more partisan balance.

Mahoning County was only 51% Democratic, and Trump had a majority in the Republican primary, indicating heavy cross-over voting among working class whites.

jimrtex
Atlas Icon

Posts: 11,828

Re: On cumulative vote charts, and why they don't suggest fraud, esp. in primaries

« Reply #4 on: September 30, 2016, 02:26:52 AM »

This shows the percentage of voters who chose a particular party ballot vs. registered voters.

It appears that both the smallest and largest precincts are more Republican than those in the middle.
