This was originally posted on blogger here.
I had promised myself that I wouldn’t blog about politics, but this is really more about statistics so I think it’s ok.David Sparks has posted an interesting piece about using statistical clustering to determine US Congressional districts (h/t R-Bloggers). He uses k-means clustering, and then analyzes the “partisanship” of the resulting districts by assuming that districts with above-median population density are Democratic and those with below-median density are Republican (I’m not sure how good an assumption that is). The result is that you get much more reasonable looking districts than the crazy ones that politicians come up with, but the partisan balance doesn’t seem to change (again, under the assumption that density=party). Here is an example of the map for Texas:This is, of course, way too reasonable to actually be put into practice.