Author Topic: Iowa-style Redistricting: Measuring Erosity  (Read 4064 times)
muon2
« Reply #50 on: March 02, 2013, 11:39:24 pm »

There's no question that for any particular state shape and set of allowed cuts there will be exactly that type of oscillation with respect to the large N ideal. However, when I started with some other shapes, such as hexagons or rectangles, I found that the oscillations peaked for different values of N.

If I consider the wide range of state shapes and types of county shapes as cuts, the right value to use is the weighted average of all possible choices. A more detailed study would involve modeling all 50 state shapes and their county shapes to get the best statistical average. Since I observed a smoothing with just three choices, I'm willing to speculate that the smoothing will persist when applied to a larger set.



Great American Eclipse seconds before totality showing Baily's Beads.
jimrtex
« Reply #51 on: March 05, 2013, 12:28:51 am »

I don't believe your erosity estimate is robust enough or accurate enough to be used as a metric.

First, Tennessee is not shaped like a circle, and 10 is not a large number.

And while the total perimeter of a large number of circles does approach a limit of sqrt(N) * 2*pi*R, where R is the radius of the containing circle, it might not do so monotonically.   7 small circles can be placed in a hexagon shape, with rounded vertices that would have a relatively small outer bounding circle.  Add in an 8th and the bounding circle will expand quite a bit, remove a 7th, and the shrinkage will be small.  Going from 6 to 7 had little cost, going from 7 to 8 a large amount.
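
The sqrt(N) scaling itself can be checked while ignoring packing altogether. This is a minimal sketch, assuming N equal circles whose combined area equals that of the containing circle of radius R. That is an idealization, since real circles leave gaps, which is why the actual figure need not approach the limit smoothly. The function name is hypothetical.

```python
import math

def total_circle_perimeter(n: int, big_radius: float = 1.0) -> float:
    """Total circumference of n equal circles whose combined area
    equals the area of one circle of radius big_radius (no packing gaps)."""
    small_radius = big_radius / math.sqrt(n)  # area scales with radius squared
    return n * 2 * math.pi * small_radius

for n in (1, 4, 7, 8, 9):
    idealized = total_circle_perimeter(n)
    limit = math.sqrt(n) * 2 * math.pi  # sqrt(N) * 2*pi*R with R = 1
    print(n, round(idealized, 3), round(limit, 3))
```

Under this assumption the two columns agree for every N. Any oscillation has to come from how the pieces actually fit inside the boundary.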

This can be illustrated with an idealized-Iowa-like state which has infinite granularity but where boundaries are restricted to North-South and East-West lines.   We draw equal sized districts (the population density is non-varying).



P is the total perimeter of the districts, so the shared (green) perimeters are counted twice, and the outer black perimeter is counted once.  This allows us to use sqrt(N) for our estimate, rather than sqrt(N)-1.   This has the advantage that sqrt(1) is non-zero, while sqrt(1)-1 is zero.

So the perimeter of our single district state is 4.00, while for a 2-district state it is 6.00.  But E, the estimate of the 2-district perimeter, is sqrt(2/1)*4.00 or 5.66.  We grossly underestimated our total perimeter.   This is not that surprising.   Our districts are non-compact by Iowa standards, with a length twice their width.

We continue to 3 districts, where P is 7.33.  Our estimate is based on the 2-district case.  That is E(N+1) = P(N)*sqrt((N+1)/N).   E(3) = 6.00 * sqrt(3/2) = 7.35.  In this case our estimate is good.   We have the very-elongated district L:W = 3:1, and two somewhat compact districts L:W = 4:3.  This really says that 3 districts are about as good or about as bad as 2 districts.  If we based our estimate for 3 districts on the 1 district case, our estimate would be 4.00*sqrt(3/1) or 6.93, which would have been too optimistic.

For the 4-district case P is 8.00, but our estimate of P(3)*sqrt(4/3) = 8.47 is much higher.  Our districts are all maximally compact.
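
The step-by-step estimate E(N+1) = P(N)*sqrt((N+1)/N) for the first four cases can be reproduced with a short script. This is a sketch only: the perimeters are the values quoted above for a unit square, with 22/3 standing in for the rounded 7.33, and the function name is hypothetical.

```python
import math

# Total perimeters P(N) of the equal-area plans on a unit square, as quoted
# in the post (22/3 is the unrounded 7.33 for the 3-district plan).
actual = {1: 4.0, 2: 6.0, 3: 22 / 3, 4: 8.0}

def estimate_next(p_n: float, n: int) -> float:
    """E(N+1) = P(N) * sqrt((N+1)/N)."""
    return p_n * math.sqrt((n + 1) / n)

for n in range(1, 4):
    e = estimate_next(actual[n], n)
    print(f"N={n + 1}: estimate {e:.2f}, actual {actual[n + 1]:.2f}")
```

The estimate undershoots for N=2, is close for N=3, and overshoots for N=4, which is the oscillation being described.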

As we continue on, the estimate for 5 districts is quite low, followed by slightly high, slightly low, slightly high, and then quite high when we once again reach a compact symmetric case for 9 districts.

The first example really wasn't how we were subdividing regions.  In this 2nd example, we nibble off smaller regions one at a time.



Somewhat surprising is the two-district case, where we do better than estimated, and much better than where we split into two equal areas.  But it is less surprising when we recognize that both regions are fairly compact, with only a corner missing from an almost square larger district.  And of course smaller areas have smaller perimeters (P1/P2 = sqrt(A1/A2) for two similar polygons).  The small region, with 1/9 of the total area, has a perimeter of 1/sqrt(9) or 1/3 of that of the total area.

Adding a 3rd district is not quite as compact as expected based on the two-district case.  But our larger region is beginning to become less compact.  Our estimate based on two districts was 6.53; we managed only 6.67.   But an estimate based on 1 district is 6.93, and when we create 3 equal-area districts the perimeter is 7.33.

Adding a 4th district is quite efficient as we only have to chop off the panhandle of our large region.   Our 4 districts are as compact as 3 equal area districts can be, and better than the 4-district symmetric equal-area case of our first set of examples.  That is, we can get a smaller perimeter by having greater variation in region size.

As we continue on, we are worse or better than the estimate based on the previous case E(N+1) = P(N)*sqrt((N+1)/N), depending on whether we are able to cut off a panhandle of the larger district, or the new district is cut off from the body. In all cases, other than for N=8, we are better than the equivalent equal-area version for the same number of districts.
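
The nibbling sequence can be checked by counting cell edges on a 3 x 3 grid. The original diagram is not reproduced here, so the nibbling order below (row by row from one corner) is an assumption, and the function names are hypothetical. It does reproduce the 6.67, 7.33 and 12.00 quoted in the post.

```python
from fractions import Fraction

SIDE = 3  # the unit square is cut into a 3 x 3 grid of cells

def total_perimeter(assignment):
    """Sum of all district perimeters: shared edges count twice, outer edges once."""
    edge = Fraction(1, SIDE)
    total = Fraction(0)
    for r in range(SIDE):
        for c in range(SIDE):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < SIDE and 0 <= nc < SIDE)
                if outside or assignment[nr][nc] != assignment[r][c]:
                    total += edge
    return total

def nibble(k):
    """First k cells (row-major, an assumed order) become one-cell districts;
    every remaining cell stays in the large district 0."""
    grid = [[0] * SIDE for _ in range(SIDE)]
    for i in range(k):
        grid[i // SIDE][i % SIDE] = i + 1
    return grid

for n in range(1, 10):
    p = total_perimeter(nibble(n - 1))  # n districts = n-1 nibbles + remainder
    print(f"N={n}: P = {float(p):.2f}")
```

Exact fractions avoid rounding noise when comparing against the two-decimal figures in the text.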

It is instructive to look at the 4-district case in more detail.  An estimate based on 1 district would be that we would have a perimeter of 8.00.  We managed to do better, with a 7.33.  And based on the 4-district case, we should be able to do 9-districts in 7.33*sqrt(9/4) with a perimeter of 11.00.   We managed to do only 12.00 even with maximally compact districts.

What went wrong?  We essentially gamed the system by creating districts of grossly different sizes, and then were unable to maintain that when subdividing the larger region.

If we use your proposed estimate method, which only measures interior perimeters, our estimate is even worse.  The interior perimeter of the 4-district case is 1.67.  The estimate for the 9 district case would be 1.67 * (sqrt(9) - 1)/(sqrt(4) - 1) = 3.33.  But in reality it is 4.00.

If we adjust these numbers so they are equivalent to my values for total perimeter, the estimate for the 9-district case would be 10.67, which is even further from the actual value of 12.00 than my estimate of 11.00.  Your estimate is based on how well you did in creating the first three small districts, with no knowledge of the larger district other than one side and an idea of its area (because we knew that it could be divided into 6 more areas).
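
The interior-only estimate and its conversion to total perimeter can be written out as follows. This is a sketch under the unit-square setup used above: the outer boundary is 4.00 and each interior edge belongs to two districts, so total = outer + 2 * interior. The function name is hypothetical.

```python
import math

def interior_estimate(interior: float, n_before: int, n_after: int) -> float:
    """Scale interior boundary length by (sqrt(N_after) - 1) / (sqrt(N_before) - 1)."""
    return interior * (math.sqrt(n_after) - 1) / (math.sqrt(n_before) - 1)

OUTER = 4.0            # outer boundary of the unit square
interior_4 = 5 / 3     # interior boundary of the 4-district plan (1.67)

est_interior_9 = interior_estimate(interior_4, 4, 9)
est_total_9 = OUTER + 2 * est_interior_9   # interior edges are counted twice

print(f"interior estimate: {est_interior_9:.2f}")   # compare with actual 4.00
print(f"as total perimeter: {est_total_9:.2f}")     # compare with actual 12.00
```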

We could do better, by making an estimate for dividing the larger area into 6 districts.  The total perimeter for the 4-district case is 7.33, 4.00 of which is the total perimeter of the 3 small districts, and 3.33 for the 4th district.  The estimate for the total perimeter after dividing the larger area into 6 districts is 3.33 * sqrt(6/1) = 8.16.  Add in the 4.00 for the original 3 districts, and the estimate is 12.16, which is just above the actual 12.00.   We went from a slightly non-compact area (length:width of 3:2) to 6 maximally compact districts.
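
The two ways of estimating the 9-district total can be compared side by side. This is a sketch using the figures above, with 10/3 standing in for the rounded 3.33; the function name is hypothetical.

```python
import math

def scaled_estimate(perimeter: float, n_before: int, n_after: int) -> float:
    """Scale a total perimeter by sqrt(N_after / N_before)."""
    return perimeter * math.sqrt(n_after / n_before)

small_districts = 4.0    # three finished 1/9-area squares, 4/3 each
large_region = 10 / 3    # the remaining 3:2 rectangle (3.33 in the post)

# Scale the whole 4-district plan up to 9 districts.
whole_plan = scaled_estimate(small_districts + large_region, 4, 9)

# Scale only the region that is actually being divided, into 6 districts.
region_only = small_districts + scaled_estimate(large_region, 1, 6)

print(f"whole-plan estimate:  {whole_plan:.2f}")   # 11.00
print(f"region-only estimate: {region_only:.2f}")  # 12.16, actual is 12.00
```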

In your Tennessee example, a better estimate of dividing the eastern edge of the state into 3 districts could be obtained from using that area alone, and not basing it in part on the rest of the state which had a mix of district sizes between smaller ones in the Memphis and Nashville areas, and the more rural areas between the mountains and the Mississippi.

If we use county-county links to measure erosity, this is problematic, since it would require determining links between counties in Virginia, North Carolina, and Georgia, and those in Eastern Tennessee.  There may be scaling problems because of county size styles.   And how well we divide eastern Tennessee is probably only marginally related to how easy it is to travel across the mountains between Tennessee and North Carolina.   A better approach would be to calculate the internal erosity among all counties in the area being divided, and estimate downward.

There are 25 counties in your 3-district Eastern region, 6 counties in your 2-district greater Nashville region, and 14 counties in your Western-Memphis region.   The inter-county link counts for these 3 regions are 50, 9, and 26 respectively.

We can estimate the erosity if each of these county areas were consolidated into N regions:

Erosity(N)/County_Links =  (sqrt(N) - 1) / (sqrt(ncounties) - 1)

For the eastern region:

Erosity(3) = 50 * (sqrt(3) - 1) / (sqrt(25) - 1) = 9.15

For the greater Nashville region:

Erosity(2) = 9 * (sqrt(2) - 1) / (sqrt(6) - 1) = 2.57

For the western-Memphis region:

Erosity(2) = 26 * (sqrt(2)-1) / (sqrt(14) - 1) = 3.93

These are reasonably close for the first two, if we ignore your proposed county splits (10 and 3).  As is to be expected, we missed badly in the 3rd, because there we simply chopped part of Shelby County from the region.

Add these to the 5-region erosity for a final estimate:

42 + 9.15 + 2.57 +3.93 = 57.65
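
The same arithmetic as a script, for anyone who wants to try other regions. This is a sketch of the formula above using the county and link counts given in this post; the function and variable names are hypothetical.

```python
import math

def estimated_erosity(county_links: int, n_counties: int, n_districts: int) -> float:
    """Erosity(N) = County_Links * (sqrt(N) - 1) / (sqrt(ncounties) - 1)."""
    return county_links * (math.sqrt(n_districts) - 1) / (math.sqrt(n_counties) - 1)

# region name: (inter-county links, counties, districts)
regions = {
    "Eastern": (50, 25, 3),
    "Greater Nashville": (9, 6, 2),
    "Western-Memphis": (26, 14, 2),
}
FIVE_REGION_EROSITY = 42  # erosity of the 5-region plan, as used in the sum above

total = FIVE_REGION_EROSITY
for name, (links, counties, districts) in regions.items():
    e = estimated_erosity(links, counties, districts)
    total += e
    print(f"{name}: {e:.2f}")
print(f"Final estimate: {total:.2f}")
```

When the number of districts equals the number of counties the formula returns the full link count, and for a single district it returns zero.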

BTW, why are links preferred over perimeter?
muon2
« Reply #52 on: March 05, 2013, 08:53:25 am »


BTW, why are links preferred over perimeter?


There are three reasons to consider links.

1) Perimeter-based compactness measures have a well-known bias against subdivisions based on irregular natural features such as rivers and mountains. County-based division runs into this problem when counties use those same natural divisions. The other large class of compactness measures is based on bounding shapes like circles or polygons, which are weak at penalizing peninsulas jutting into a larger district. Links don't penalize natural irregular boundaries, but do weigh against jutting peninsulas.

2) County or municipal integrity are proxies for communities of interest. Links also are proxies for communities of interest with respect to associations between counties. Perimeter and area measures don't add anything to identify potential communities of interest. Links can add more than just a compactness measure.
 
3) Perimeter calculations or bounding methods require GIS software with enough sophistication to calculate lengths and areas. That's fine for those who are experts in the field, but I'm looking to make redistricting more accessible to the public. I'd like users to be able to proceed with a spreadsheet and standard mapping software like Mapquest or Google Maps. Links can be simply counted by the mapper using nothing more sophisticated than a highway map.
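
To show how little machinery the link count needs, here is a minimal sketch. It assumes erosity is the number of county-to-county links whose two ends fall in different districts, which is how the estimates earlier in the thread treat it. The counties A to F, their links, and the plan are made up for illustration.

```python
# Hypothetical link list, as a mapper might type it into two spreadsheet columns.
links = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
    ("C", "E"), ("D", "E"), ("D", "F"), ("E", "F"),
]

# Hypothetical plan: county -> district number.
plan = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}

def cut_links(links, plan):
    """Count the links whose endpoints are assigned to different districts."""
    return sum(1 for a, b in links if plan[a] != plan[b])

print(cut_links(links, plan))  # B-D and C-E are cut, so this prints 2
```

The same count works in a spreadsheet with one comparison column and a sum.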
jimrtex
« Reply #53 on: March 05, 2013, 10:46:59 pm »


Any redistricting process that allows meaningful citizen input will have to provide the GIS and demographic data, and likely at least simple software.  The Census Bureau has road and street layers, and image data can be meshed in as well.  The software used in the Ohio redistricting commission supported image data, but the Ohio sponsors did not wish to pay the licensing fees to Google/Bing.

There are ways to simplify borders for calculating perimeter.  This has the additional advantage of resolving whether one can directly link Lincoln and Okanogan counties, since there is an extreme penalty for doing so.

muon2
« Reply #54 on: March 06, 2013, 12:32:15 am »



It's not only the cost and availability, but also the level of sophistication needed by the user. Most perimeter calculations are not simple to understand without a reasonable background in geometry. Counting links is not hard, just like counting county splits is an easy thing for a user to see and understand. The math of counting links gets only a little more complicated when comparing plans with different region counts, and is still less complicated than understanding a purely geometrical concept.

Even if the math for perimeters were easy to understand, it leaves the problems associated with perimeter-based formulas as I noted in point 1.
jimrtex
« Reply #55 on: March 06, 2013, 11:15:45 am »


Relative compactness measures are used in an attempt to restrict abuse by gerrymanders that use block assignment.  But the true abuse is block assignment.  Total perimeter length is just as effective (more effective in actuality) as county links in measuring compactness of county-based plans.

Perimeter is a grade-school math concept.   Given a simplified outline anyone could measure it on a map.  Not that they would have to, since any reasonable software would be calculating it as they clicked, particularly if it were a contest criterion.

The map of Washington was not necessarily what a contest entrant would see.  It was a presentation of the data that was being used for the contest.  Both the actual borders, and the simplified borders would be available, as well as information about whether the border was passable or not.

It is certainly easier to measure the length of the Okanogan-Lincoln border or the Columbia-Franklin border than to try to figure out why I think one is "1" and the other is "0", and you think the opposite.