May 12, 2005

Cluster solutions  

In re this new study (via Political Wire), can anybody shed some light on the following passage?

A statistical cluster analysis was used to sort the remaining respondents into relatively homogeneous groups based on the nine value scales, party identification, and self reported ideology. Several different cluster solutions were evaluated for their effectiveness in producing cohesive groups that are distinct from one another, large enough in size to be analytically practical, and substantively meaningful. The final solution selected to produce the new political typology was judged to be strongest on a statistical basis and to be most persuasive from a substantive point of view.
In particular I'd like to know what exactly their statistical methodology is for creating these clusters. It sounds a lot like the confirmatory factor analysis we use in the educational research I do, in that the goal is to confirm a preexisting theory about the relationship between factors. But I guess cluster analysis is about creating taxonomies rather than illuminating causal relationships (even if the stated goal of the technique is to "organize observed data into meaningful structures" -- what exactly does meaningful mean in this context?).

Anyway, no surprise I was classified as a liberal.

Comments
Sudeep  {May 16, 2005}

Hrm -- if it's the same as it may or may not be in biology:

Imagine creating a two-dimensional table -- on the horizontal axis, place your respondents, and on the vertical axis, the categories/questions you've gotten responses from for said respondents. The clustering takes place when the respondents are grouped by their response profiles, and significance can be estimated by bootstrapping the response profiles and reclustering.

This is, as far as I can tell, how the statistic works in biology, and it doesn't seem unreasonable to assume it works similarly here, I think -- good luck anyhow.
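
A minimal sketch of the bootstrap-and-recluster idea described above, in Python. Everything here is an illustrative assumption rather than the study's actual method: the small table of 1-to-5 answers is made up, the clustering step is a bare two-group k-means, and the bootstrap resamples questions (columns) with replacement, which is one common reading of "bootstrapping the response profiles". The output is, for each pair of respondents, the fraction of resamples in which they land in the same cluster.

```python
import random
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two response profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_clusters(rows, iters=50):
    """Split rows into 2 groups with a small k-means; returns one label per row."""
    first = rows[0]
    second = max(rows, key=lambda r: sq_dist(r, first))  # farthest profile from the first
    centers = [first, second]
    labels = None
    for _ in range(iters):
        new_labels = [0 if sq_dist(r, centers[0]) <= sq_dist(r, centers[1]) else 1
                      for r in rows]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def bootstrap_support(table, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each pair of rows shares a cluster."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    together = {pair: 0 for pair in combinations(range(n_rows), 2)}
    for _ in range(n_boot):
        # Resample the questions (columns) with replacement, then recluster.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        resampled = [tuple(row[c] for c in cols) for row in table]
        labels = two_clusters(resampled)
        for i, j in together:
            if labels[i] == labels[j]:
                together[(i, j)] += 1
    return {pair: count / n_boot for pair, count in together.items()}

# Hypothetical data: rows are respondents, columns are questions scored 1 to 5.
table = [
    (1, 2, 1, 2, 1, 2, 1, 2),  # respondents 0-2 answer low
    (2, 1, 1, 2, 2, 1, 1, 2),
    (1, 1, 2, 2, 1, 1, 2, 2),
    (5, 4, 5, 4, 5, 4, 5, 4),  # respondents 3-5 answer high
    (4, 5, 5, 4, 4, 5, 5, 4),
    (5, 5, 4, 4, 5, 5, 4, 4),
    (3, 3, 3, 3, 3, 3, 3, 3),  # respondent 6 sits in the middle
]

support = bootstrap_support(table)
for (i, j), share in sorted(support.items()):
    print(f"respondents {i} and {j}: together in {share:.0%} of resamples")
```

Pairs that stay together in nearly every resample form a stable cluster; a respondent whose partners change from resample to resample (the middle row here) is one whose assignment should not be trusted much.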

paul  {May 16, 2005}

Thanks Sudeep. I'm just wondering how the clustering itself happens statistically -- from that page it sounds as though they have various pre-imagined cluster configurations and they just choose the best one, based on some very subjective notion of intelligibility. But who knows.

Sweth  {May 29, 2005}

Normally, for clustering where you aren't trying to identify hierarchical relationships, you basically end up doing a brute-force analysis to see which variables, when taken together, are the best predictors of membership in distinct groups--but those methods presuppose knowing how many groups there are in the first place. So you often have to take the brute force approach one level further, and try the same analysis presupposing 1 group, and then 2 groups, and so on; there's a test that you can do to get a good idea of how many clusters there PROBABLY are, but no way to know for sure, so some subjectivity is introduced in the decision of whether the results when presupposing, say, 8 groups are more meaningful than the results when presupposing 9 groups.
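
The "presuppose 1 group, then 2 groups, and so on" loop can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure used in the study: it swaps in plain k-means as the clustering method, uses made-up two-dimensional data with three obvious groups, and reports the within-cluster sum of squares as a stand-in for whatever test is meant above. The helper names (`farthest_first`, `kmeans_wss`) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic starting centers: each new one is the point farthest from the rest."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_wss(points, k, iters=100):
    """Run k-means for a presupposed k; return the within-cluster sum of squares."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        updated = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:  # converged
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def make_points(seed=1):
    """Made-up data: 30 points scattered around each of three well-separated spots."""
    rng = random.Random(seed)
    spots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [(x + rng.gauss(0, 0.5), y + rng.gauss(0, 0.5))
            for x, y in spots for _ in range(30)]

points = make_points()
wss_by_k = {k: kmeans_wss(points, k) for k in range(1, 7)}
for k, wss in wss_by_k.items():
    print(f"k={k}: within-cluster sum of squares = {wss:.1f}")
```

On data like this the sum of squares falls sharply up to the true number of groups and only slightly afterwards. On real survey data the drop-off is rarely that clean, which is where the subjectivity about 8 versus 9 groups comes in.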

