Answer: The variance-covariance matrix containing all the MaxDiff scores is not invertible. R tells you that, either with an error message or a warning. SPSS, at least earlier versions still in use, runs the factor analysis without comment.
I made two points in my last post. First, if you want to rank order your attributes, you do not need to spend $2,000 and buy Sawtooth's MaxDiff. A much simpler data collection procedure is available. You only need to list all the attributes and ask respondents to indicate which attribute is the best on the list. Then you repeat the same process after removing the attribute just selected as best. The list gets shorter by one each time, and in the end you have a rank ordering of all the attributes. But remember that the resulting ranks are linearly dependent because they sum to a constant, which was my second point in the prior post. This same dependency can be seen in the claimed "ratio" scores from a MaxDiff. The scores from a MaxDiff sum to a constant, so they are also dependent and cannot be analyzed as if they were not. Toward the end of that previous post, I referred you to the R package compositions for a somewhat technical discussion of the problem and a possible solution.
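The "pick the best, remove it, repeat" procedure is easy to simulate. What follows is only a minimal sketch, written in Python rather than R. The attribute labels, the random latent preferences, and the function name rank_by_repeated_best are all made up for illustration. It shows that every respondent's ranks add up to the same constant.

```python
import random


def rank_by_repeated_best(utilities):
    """Simulate the 'pick the best, remove it, repeat' task for one respondent.

    `utilities` is a hypothetical dict of attribute -> latent preference value.
    The respondent is assumed to always pick the highest remaining value.
    Returns a dict of attribute -> rank (1 = picked first).
    """
    remaining = dict(utilities)
    ranks = {}
    position = 1
    while remaining:
        best = max(remaining, key=remaining.get)  # the "best on the list"
        ranks[best] = position
        del remaining[best]                       # the list gets shorter by one
        position += 1
    return ranks


random.seed(1)
attributes = ["price", "quality", "service", "speed", "design"]  # made-up labels
respondents = [
    rank_by_repeated_best({a: random.random() for a in attributes})
    for _ in range(200)
]

# Each respondent's ranks are a permutation of 1..k, so they sum to k*(k+1)/2.
rank_sums = {sum(r.values()) for r in respondents}
print(rank_sums)  # {15}
```

Because the sum is fixed at 15 for five attributes, knowing any four ranks tells you the fifth. That is the linear dependency.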
So you can imagine my surprise when I received an email with the output from an SPSS factor analysis of MaxDiff scores. Although the note was polite, questions were raised regarding my assertion that MaxDiff scores are linearly dependent. Surely, if they were, the correlation matrix could not be inverted and factor analysis would not be possible.
I asked the sender to run a multiple regression using SPSS with the MaxDiff scores as the independent variables. They sent me back the output with all the regression coefficients, except for one excluded MaxDiff score. My reputation was saved. On its own, SPSS had removed one of the variables in order to invert (X'X). Since the sender had included the SPSS file as an attachment, I read it into R and tried to run a factor analysis using one of the built-in R functions. The R function factanal, which performs maximum-likelihood factor analysis, returned an error message, "System is computationally singular." I tried a second factor analysis using the psych package. The principal function from the psych package was more accommodating. However, it gave me a clear warning, "matrix was not positive definite, smoothing was done."
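The singularity can be checked by hand on a toy data set. The sketch below is in Python with exact fractions, not the SPSS or R code discussed above, and the six rows of scores are invented so that each row sums to 100. It computes the covariance matrix, shows that every row of that matrix sums to zero, and confirms that the determinant is exactly zero until one attribute is dropped.

```python
from fractions import Fraction

# Hypothetical MaxDiff-style scores: four attributes, each row sums to 100.
scores = [
    [40, 30, 20, 10],
    [10, 50, 25, 15],
    [25, 25, 40, 10],
    [35, 15, 10, 40],
    [20, 20, 30, 30],
    [5, 45, 15, 35],
]


def covariance_matrix(rows):
    """Sample covariance matrix in exact rational arithmetic."""
    n, k = len(rows), len(rows[0])
    means = [Fraction(sum(r[j] for r in rows), n) for j in range(k)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination; exact when entries are Fractions."""
    m = [row[:] for row in matrix]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:          # no nonzero pivot: the matrix is singular
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


cov = covariance_matrix(scores)
row_sums = [sum(row) for row in cov]                     # all exactly zero
det_full = determinant(cov)                              # exactly zero
det_reduced = determinant([row[:3] for row in cov[:3]])  # last attribute dropped

print(row_sums, det_full, det_reduced != 0)
```

Dropping one attribute, as SPSS did on its own in the regression, is what makes the remaining matrix invertible in this toy example.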
This is an example of why I am so fond of R. Revelle, the author of the psych package, has added a function called cor.smooth to deal with correlation matrices containing tetrachoric or polychoric coefficients or coefficients calculated from pairwise deletion. Such correlation matrices are not always positive definite. You can type ?cor.smooth after loading the psych package to get the details, or you can type cor.smooth without parentheses to see the underlying R code. His function principal gives me the warning, smooths the correlation matrix, and then produces output with results almost identical to those from SPSS. Had I been more careful initially, I would have noticed that the last eigenvalue from the SPSS results seemed a little small, 2.776E-17. Is this just a rounding error? Has it been fixed in later versions of SPSS?
Now what? How am I to analyze these MaxDiff scores? I cannot simply pretend that the data collection procedure did not create spurious negative correlations among the variables. As we have seen with the regression and the factor analysis, I cannot perform any multivariate analysis that requires the covariance matrix to be inverted. And what if I wanted to do cluster analysis? The constraint that all the scores sum to a constant creates problems for all statistical analyses that assume observations lie in Euclidean space. One could consider this to be an unintended consequence of forcing tradeoffs when selecting the best and the worst of a small set of attributes. But regardless of the cause, we end up with models that are misspecified and estimates that are biased.
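The source of the spurious negative correlations can be written down in a few lines. This is a sketch of the standard argument, with notation of my own choosing: x_1, ..., x_k are one respondent's scores and c is the constant they sum to.

```latex
\begin{align*}
\sum_{j=1}^{k} x_j = c
  \;&\Longrightarrow\;
  \operatorname{Cov}\Bigl(x_i,\ \sum_{j=1}^{k} x_j\Bigr) = 0
  \quad\text{for every } i, \\
\operatorname{Var}(x_i) + \sum_{j \neq i} \operatorname{Cov}(x_i, x_j) = 0
  \;&\Longrightarrow\;
  \sum_{j \neq i} \operatorname{Cov}(x_i, x_j) = -\operatorname{Var}(x_i) < 0 .
\end{align*}
```

So every attribute with any variance at all must be negatively correlated with at least one other attribute, whatever respondents actually think. In the special case where all k attributes have the same variance and the same pairwise correlation, that correlation is forced to equal -1/(k-1).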
As an alternative, we could incorporate the constant sum constraint and move from a Euclidean to a simplex geometry, as suggested by the R package compositions. Perhaps the best way to explain simplex geometry is to start with two attributes, A and B, whose MaxDiff scores sum to 100. Although there are two scores, they do not span a two-dimensional Euclidean space since A + B = 100. If one thinks of the two-dimensional Euclidean space as the floor of a room, MaxDiff restricts our movement to one dimension. We can only walk the line between always A (100, 0) and always B (0, 100). Similar restrictions apply when there are more than two MaxDiff attributes.
In a three-dimensional Euclidean space one can travel unrestricted in any direction. However, in a three-dimensional simplex space one can travel freely only within a two-dimensional triangle where each vertex represents the most extreme score on each attribute. For example, a score of 80 on the first attribute leaves only 20 for the other two attributes to share. The figure below shows what our simplex would look like with only three MaxDiff scores and the most extreme score for each attribute at the vertices.
[Figure: ternary plot]

You should note that the three axes are the sides of the triangle. They are not at right angles to each other because the dimensions are not independent. It takes a little while to get familiar with reading coordinates from the plot because we tend to think "Euclidean" about our spaces. If you start at the apex labeled with a 1, you will be at the MaxDiff pattern (100, 0, 0). Now, jump to the base of the triangle. Any observation along the base has a MaxDiff score of zero for the first attribute. The blue lines drawn parallel to the base are the coordinates for the first attribute.
Next, find the vertex labeled 2; this is the MaxDiff pattern (0, 100, 0). If you jump to the side of the triangle opposite this vertex, you will be at a second MaxDiff score of zero. All the lines parallel to the side opposite vertex #2 are the coordinates for the second MaxDiff score. You should remember that the vertex has a score of 100, so the markings you should be reading are those along the base of the triangle that gradually decrease from 100 (i.e., 90 is closest to the 2nd vertex). You interpret the third attribute in the same way.
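If reading the triangle still feels awkward, it may help to see how three scores are turned into a point on the page. The Python sketch below is one common way to do it and is not the code behind the figure. The placement of the vertices (attribute 1 at the apex, attribute 2 at the lower left, attribute 3 at the lower right) and the function name ternary_to_xy are my assumptions.

```python
import math


def ternary_to_xy(a, b, c):
    """Map three non-negative scores onto an equilateral triangle with side 1.

    Assumed layout (the original figure may differ):
      attribute 1 -> apex        (0.5, sqrt(3)/2)
      attribute 2 -> lower left  (0, 0)
      attribute 3 -> lower right (1, 0)
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("scores must sum to a positive constant")
    a, b, c = a / total, b / total, c / total   # rescale to proportions
    x = 0.5 * a + 0.0 * b + 1.0 * c             # weighted average of vertex x's
    y = (math.sqrt(3) / 2) * a                  # height depends only on score 1
    return x, y


apex = ternary_to_xy(100, 0, 0)       # the pattern (100, 0, 0)
on_base = ternary_to_xy(0, 50, 50)    # first score is zero, so y is zero
inside = ternary_to_xy(80, 10, 10)    # 80 on the first attribute, 20 shared
print(apex, on_base, inside)
```

The height of a point depends only on the first score. That is why the lines of constant first score run parallel to the base.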
We stop plotting with this figure. In four dimensions the simplex plot looks like a tetrahedron, and no one wants to read coordinates off a tetrahedron.
The R package compositions outlines an analysis consistent with the statistical properties of our measures. It would substitute log ratios for the MaxDiff scores. One even has some flexibility over what is the base for that log ratio. As I noted in the last post, Greenacre has created some interesting biplots from such data. You could think about cluster analysis within Greenacre's biplots. Or, we could avoid all these difficulties by not using MaxDiff or ranking tasks to collect our data. Personally, I find Greenacre's approach very intriguing, and I would pursue it if I believed that the MaxDiff task provided meaningful marketing data.
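To make "log ratios" and "the base for that log ratio" concrete, here is a minimal Python sketch of two standard transformations from compositional data analysis. In the centered log-ratio the base is the geometric mean of all the scores. In the additive log-ratio the base is one attribute of your choosing. This is my own illustration and not the package's code; the function names and the example scores are assumptions. Both require strictly positive scores, so zeros would have to be handled first.

```python
import math


def clr(scores):
    """Centered log-ratio: log of each score over the geometric mean."""
    if any(s <= 0 for s in scores):
        raise ValueError("log ratios require strictly positive scores")
    logs = [math.log(s) for s in scores]
    mean_log = sum(logs) / len(logs)      # log of the geometric mean
    return [v - mean_log for v in logs]


def alr(scores, base=-1):
    """Additive log-ratio: log of each score over one chosen base attribute.

    `base` is the index of the attribute used as the denominator. It is
    dropped from the output, leaving k - 1 unconstrained coordinates.
    """
    if any(s <= 0 for s in scores):
        raise ValueError("log ratios require strictly positive scores")
    base_index = base % len(scores)
    ref = math.log(scores[base_index])
    return [math.log(s) - ref for i, s in enumerate(scores) if i != base_index]


maxdiff = [50, 30, 20]            # hypothetical scores summing to 100
clr_scores = clr(maxdiff)         # three values that sum to zero
alr_scores = alr(maxdiff)         # two values: log(50/20) and log(30/20)
print(clr_scores, alr_scores)
```

Note that the centered version still sums to zero, so its covariance matrix remains singular. The additive version removes the dependency by giving up one attribute as the reference.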
Perhaps I should provide an example of the point that I am trying to make. The task in choice modeling corresponds to a real marketplace event. I stand in front of the retail shelf and decide which pasta to buy for dinner. There was a time in my research career when I only used rating-based conjoint for individual-level analysis with purchase likelihood as the dependent variable. But I was motivated to substitute choice-based conjoint analysis because I found convincing research demonstrating that the cognitive processing underlying choice among alternatives was different from the cognitive processing underlying the purchase likelihood rating. Choice modeling mimics the marketplace, so I learned how to run hierarchical Bayes choice models (see R package bayesm).
There are times and situations in the marketplace where tradeoffs must be made by consumers. Selecting only one brand from many different alternatives is one example. Deciding to fly rather than drive or take the train is another example. Of course, there are many more examples of tradeoffs in the marketplace that we want to model with our measurement techniques. But should we be forcing respondents to make tradeoffs in our measurements when those tradeoffs are not made in the marketplace? What if I wanted the latest product or service with all the features included, and I was willing to pay for it? MaxDiff is so focused on limiting the respondent's ability to rate everything as important that it has eliminated the premium product from the marketing mix. How can upper-end customers indicate that they want it all when MaxDiff requires that they select the one best and the one worst?
The MaxDiff task does not attempt to mimic the marketplace. Its feature descriptions tend to be vague and not actionable. It is an obstructive measurement procedure that creates rather than measures. It leads us to attend to distinctions that we would not make in the real world. It makes sense only if we assume that context has no impact on purchase. We must believe that each feature has an established preference value that is stored in memory and waiting to be accessed without alteration by any and all measurement tasks. Nothing in human affect, perception, or thought behaves in such a manner: not object perception, not memory retrieval, not language comprehension, not emotional response, and not purchase behavior.
Finally, please keep those emails and comments coming in. It is the only way that I learn anything about my audience. If you have used MaxDiff, please share your experiences. If something I have written has been helpful or not, let me know. Thanks.