However, this ignores the underlying assumptions of CLTs; namely, they assume draws are independent of each other, draws are with replacement, and that the distribution being drawn from is not a function of the draws. As it turns out, there is a very simple reason to think that precinct turnout size in a primary election is not an independent basis for draws: assuming precincts in a state have roughly equal populations to begin with, precincts which tend to vote more for a party in general elections will have larger electorates to draw from in primary elections. As such, these precincts may reflect the base or establishment of a state's party more so than precincts dominated by the opposing party. This is further amplified in the presence of racial polarization; a heavily black precinct in the south, for example, would likely have higher turnout for the Dems than a heavily white precinct and be more likely to vote for Hillary Clinton. Similarly, a heavily suburban precinct around a Midwestern city would likely have more GOP primary voters than one in the urban core and be more likely to support an establishment candidate.
It may not be a good assumption that precincts in a state have roughly equal populations.
In Kansas it definitely is not true. Kansas does not have any size restrictions on precincts. It does have requirements for respecting political boundaries.
Kansas has townships, and in most counties these are the public land survey townships. The 36 sections of a township in Western Kansas may have only a couple dozen farmers, and they are probably 80% Republican.
If there is a city of even a couple thousand people, it may dominate the county population. The precincts in the "city" will be larger than those in the rest of the county, and a bit less Republican (more waitresses, perhaps school teachers, etc.). So this suggests a trend towards larger precincts being more Democratic.
But in Kansas, elections are administered at the county level, and counties apparently have adopted different styles. Shawnee (Topeka) and Wyandotte (KCK) in particular have small precincts, and happen to be two of the three counties that would be identified as Democratic. The other is Douglas (Lawrence). Since Democrats are more likely to play the victim card, it might not be worth the pain to rationalize precinct sizes. And in cities that were laid out before almost universal private automobile ownership, there may be suitable facilities for polling places in a small area (say a 1/4 section).
Newer suburban areas were developed after everyone had automobiles. There might only be one suitable location within a section, and the voters will appreciate plentiful parking and may even approve of any cost savings.
If you look at Overland Park (which is the 2nd most populous city in the state), precinct size correlates with latitude. The cities in Johnson County are long and skinny, as they have annexed either south or west to avoid being blocked in by other cities. The precincts also increase in area. And though an area with precincts a section in size might have lower density, it will not have 1/4 of the density of an area with precincts 1/4 section in size.
The other factor in Kansas is that Black precincts are extremely unusual, and extremely Democratic. This results in the median being more Republican than the mean. So on the cumulative curve you see the Republican percentage creeping upward, and then there will be a black precinct which will really yank the curve down, but this really isn't visible on the cumulative curves. You almost have to be looking at a spreadsheet stepping through the precincts one by one.
What happens in Kansas is that these extremely Democratic precincts become rarer after all of Shawnee and Wyandotte counties are in, and Douglas and the Democratic areas of Sedgwick County can't balance the remainder of the state, and this is true even if the most suburban precincts are slightly less Republican than the state as a whole.
So what you are seeing is more of a distribution problem than a trend. In Sedgwick County, Kansas, when you look at a scatter diagram of partisanship vs. precinct size, you will see two clouds with a gap between them. The precincts are polarized, so the mean is even less representative of the data than you might expect. The mean falls between the two maxima. It would be like measuring the height of adult humans, and not recognizing that adult males are on average several inches taller.
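To illustrate the two-cloud point, here is a minimal sketch using made-up Republican percentages for eleven hypothetical precincts (not Sedgwick County data). It only shows that the mean of a polarized set of precincts can land in the gap where no actual precinct sits.

```python
from statistics import mean, median

# Made-up Republican share (percent) for hypothetical precincts:
# one Democratic cloud and one Republican cloud, with a gap between.
rep_share = [20, 22, 25, 28, 30, 65, 68, 70, 72, 75, 78]

avg = mean(rep_share)
mid = median(rep_share)

# Distance from the mean to the nearest actual precinct.
nearest_gap = min(abs(s - avg) for s in rep_share)

print(f"mean={avg:.1f}  median={mid}  nearest precinct is {nearest_gap:.1f} points away")
```

With these invented numbers the mean falls between the two clouds, and the median is more Republican than the mean.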
Even if the number of registered voters is the same (some states, such as Ohio, have strict limits), this might not result in the same number of voters. People move, and they will still be registered at their old address. Registered voters cannot be purged without documentation. An "inactive voter" is not someone who doesn't vote. It is someone who had mail returned to the registrar. If they remain inactive for the next two general elections they can be removed. If they show up, they can vote.
People who don't own their own home, are younger, are less often married, and have greater job instability move more often. They also vote less often. Residential stability is one indicator of being established, or being one of the establishment.
The cumulative vote share curves are different depending on whether you use registered voters or votes cast for the ordering. In the Choquette (sp) analysis, which was based on the Republican primaries, he recommended using the number of registered voters for ranking precincts. Using votes cast was a backstop or shortcut, but has apparently become the norm.
As you note, the number of votes cast in a particular primary has a strong demographic relationship.
Let's look at some examples. First, the 2016 Ohio presidential primary. Here's the Democratic cumulative vote share chart by precinct:
As you can see, as Dem precinct turnout increases, Clinton's share of the vote increases. Now, let's look at Dem precinct turnout vs. Dem turnout as a percentage of total primary turnout:
Indeed, it appears that precincts with a higher Dem primary turnout are more heavily Democratic (vs. Republican) precincts in terms of overall turnout.
On the GOP side:
We see the same trend, this time toward the establishment candidate Kasich and away from Trump, with Cruz relatively constant. As for turnout composition:
It would again seem that precincts with larger turnout in the GOP primary are fundamentally more Republican precincts.
I would use cumulative votes cast, rather than precinct rank as the abscissa. Using rank squeezes the higher turnout precincts at the right end causing the cumulative curves to curve outward. They could in fact be converging asymptotically.
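A rough sketch of the two choices of abscissa, assuming each precinct is a hypothetical `(votes_cast, candidate_votes)` pair. The function name and the data are invented for illustration and are not taken from the Ohio files.

```python
from itertools import accumulate

def cumulative_curve(precincts):
    """Return (rank_fraction, votes_fraction, cumulative_share) per precinct.

    precincts: list of (votes_cast, candidate_votes) tuples (assumed layout).
    Precincts are ordered from smallest to largest by votes cast.
    """
    ordered = sorted((p for p in precincts if p[0] > 0), key=lambda p: p[0])
    cum_total = list(accumulate(p[0] for p in ordered))
    cum_cand = list(accumulate(p[1] for p in ordered))
    grand_total = cum_total[-1]
    n = len(ordered)
    curve = []
    for i in range(n):
        rank_x = (i + 1) / n                  # abscissa if you use precinct rank
        votes_x = cum_total[i] / grand_total  # abscissa if you use cumulative votes cast
        share = cum_cand[i] / cum_total[i]    # cumulative vote share so far
        curve.append((rank_x, votes_x, share))
    return curve

# Made-up data: four small precincts and one very large one.
precincts = [(10, 4), (20, 9), (30, 15), (40, 22), (400, 240)]
curve = cumulative_curve(precincts)
for rank_x, votes_x, share in curve:
    print(f"rank={rank_x:.2f}  votes={votes_x:.2f}  share={share:.3f}")
```

In this toy case the first four precincts take up 80% of the rank axis but only 20% of the votes-cast axis, so the large precinct is squeezed into the last sliver of a rank-based chart.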
You can resample the curve (say to percentile intervals) and then calculate the vote shares over 5 percent or 10 percent intervals. This gives you the smoothing benefit of averaging 100s of precincts, while avoiding the dampening effect of cumulative vote share.
For the scatter plots, I would use either the total votes cast or the Democratic votes cast for the abscissa. If you have the precinct identification, it would be useful to analyze where the precincts with a large number/percentage of Democratic votes are. If they are mostly from Cleveland, with others from Cincinnati, Columbus, Toledo, and Akron, you have your proof of a demographic relationship.
Note, the phenomenon of Cruz being relatively constant was also true in Oklahoma, where Rubio gained more support in establishment areas, and Trump was stronger in non-establishment areas.
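One way the interval idea could be sketched, assuming the same hypothetical `(votes_cast, candidate_votes)` layout and linear interpolation of the cumulative curve at the interval edges. This is an illustration under those assumptions, not the method used in any of the reports discussed.

```python
from bisect import bisect_right
from itertools import accumulate

def interval_shares(precincts, n_bins=10):
    """Vote share within each equal-sized slice of cumulative votes cast.

    precincts: list of (votes_cast, candidate_votes) tuples (assumed layout).
    A precinct straddling an edge is split proportionally, which is what
    linear interpolation of the cumulative curve amounts to.
    """
    ordered = sorted((p for p in precincts if p[0] > 0), key=lambda p: p[0])
    if not ordered:
        raise ValueError("no precincts with votes")
    cum_total = [0] + list(accumulate(p[0] for p in ordered))
    cum_cand = [0] + list(accumulate(p[1] for p in ordered))
    grand_total = cum_total[-1]

    def cand_at(x):
        # Candidate votes accumulated by the time x total votes are counted.
        i = min(bisect_right(cum_total, x), len(cum_total) - 1)
        lo, hi = cum_total[i - 1], cum_total[i]
        frac = (x - lo) / (hi - lo)
        return cum_cand[i - 1] + frac * (cum_cand[i] - cum_cand[i - 1])

    edges = [grand_total * k / n_bins for k in range(n_bins + 1)]
    return [(cand_at(b) - cand_at(a)) / (b - a) for a, b in zip(edges, edges[1:])]

# Made-up data: small precincts at 50% overall, one large precinct at 60%.
precincts = [(10, 4), (20, 9), (30, 15), (40, 22), (400, 240)]
shares = interval_shares(precincts, n_bins=5)
print([round(s, 3) for s in shares])
```

With these invented numbers the cumulative share only reaches 58% at the very end, while the interval shares show the jump to 60% as soon as the large precinct starts, which is the dampening effect being avoided.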
I looked through the PDF from Elections Justice USA.
A couple of amusing tidbits.
They were in an argument with Nate Cohn over the effect of black voters in Louisiana. They clearly did not realize that Louisiana publishes registration for each precinct both by party and race. In Washington Parish, there is a huge difference in the size of precincts that are mostly white (small) and those that are mostly black (large). I suspect the smaller precincts are in more rural areas (exurban New Orleans), and the larger in Bogalusa and along the Pearl River.
In Louisiana, I suspect that many white registered Democrats did not vote in the presidential primary. Party registration does not mean anything for any other election, except for candidates, so there is no reason to bother to switch registration.
In Wisconsin they compared the hand-counted vote to the machine-counted vote. For Democrats the cumulative hand-counted total was about 20,000 and for Republicans about 40,000. For machine-counted it was around 1,000,000. "I'll take cheesy statistics for $1000." The question: "Doh, what are small rural counties in northern Wisconsin?"
For some reason they also did Columbia County, New York. The 2nd Ward of Hudson was 80% for Trump.