Tuesday, May 28, 2013

Why doesn't R have a MaxDiff package?

About once a year someone asks whether R has a package for running the MaxDiff procedure sold by Sawtooth.  One such inquiry recently received a reply with a link showing in some detail the R code needed to generate a balanced incomplete block design, input the best-worst choice data, and use the mlogit R package to estimate the parameters.  Although Sawtooth uses hierarchical Bayes and its own peculiar program for showing individuals an unbalanced subset of the block design, the link provides a considerable amount of helpful R code that takes one some distance toward understanding how MaxDiff works.  For example, it makes clear that MaxDiff relies on a "trick" to combine the best and worst choice data into a single analysis.  Sawtooth assumes that one set of preference parameters underlies the choice of both the best and the worst alternatives.  Thus, we can estimate just one set of common parameters for the combined best and worst choices if we multiply all the independent variables by -1 when the dependent variable is the worst choice.
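To make the sign-flip "trick" concrete, here is a minimal sketch in Python (not the R code from the linked reply, and not Sawtooth's hierarchical Bayes estimator). It stacks one block of item indicators for the best choice and one block multiplied by -1 for the worst choice, then evaluates an ordinary multinomial logit log-likelihood with a single common parameter vector. The function names, the toy choice set, and the parameter values are all hypothetical, chosen only for illustration.

```python
import math

def stack_best_worst(choice_set, best, worst, n_items):
    """Build the stacked rows for one task: a block of item indicators for
    the best choice followed by a sign-flipped block for the worst choice."""
    rows, chosen = [], []
    for sign, pick in ((1.0, best), (-1.0, worst)):
        for item in choice_set:
            x = [0.0] * n_items
            x[item] = sign  # worst block: indicator multiplied by -1
            rows.append(x)
            chosen.append(item == pick)
    return rows, chosen

def log_likelihood(beta, rows, chosen, set_size):
    """Multinomial logit log-likelihood with one common parameter vector."""
    total = 0.0
    for start in range(0, len(rows), set_size):
        block = rows[start:start + set_size]
        utils = [sum(b * x for b, x in zip(beta, row)) for row in block]
        log_denom = math.log(sum(math.exp(u) for u in utils))
        for u, c in zip(utils, chosen[start:start + set_size]):
            if c:
                total += u - log_denom
    return total

# Toy task (hypothetical): items 0, 2, 3 shown; item 2 picked best, item 0 worst.
rows, chosen = stack_best_worst([0, 2, 3], best=2, worst=0, n_items=4)
beta = [-1.0, 0.0, 1.0, 0.5]  # illustrative parameter values, not estimates
ll = log_likelihood(beta, rows, chosen, set_size=3)
print(ll)
```

Because the worst block carries negated indicators, an item with a low parameter has a high utility in that block, which is how one parameter vector can serve both choices. An optimizer would maximize this log-likelihood over beta, with one item's parameter fixed for identification.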

This assumption is questioned in the upcoming Advanced Research Techniques Forum.  One of the papers, Models of Sequential Evaluation in Best-Worst Choice Tasks, raises serious concerns.  The authors present convincing evidence that respondents make sequential choices and that usually the worst alternative is selected first and then the best is selected from the remaining alternatives.  The conclusion is that we need different best and worst choice parameters, and as a result, Sawtooth's MaxDiff analysis can be misleading.  Moreover, the authors note that their findings support the idea of preference construction.  That is, pre-existing preferences are not revealed in the choice task, but instead, preferences are created in order to solve the choice task.  The paper ends with the following sentence: "We think that it is very important in the model development stage to stop and think about whether the tool that is used to collect data on preferences makes sense and is consistent with anticipated decision processes on actual purchasing decisions."
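The sequential account described above can be sketched as a two-step logit: the worst item is picked first from the full set, then the best item is picked from what remains, with separate parameter vectors for the two steps. This is only my illustration of the idea in Python, under the assumption of simple logit choice at each step; it is not the authors' actual model specification, and the names and parameter values are hypothetical.

```python
import math

def softmax(utils):
    """Logit choice probabilities for a list of utilities."""
    m = max(utils)
    exps = [math.exp(u - m) for u in utils]
    s = sum(exps)
    return [e / s for e in exps]

def p_worst_then_best(choice_set, best, worst, beta_best, beta_worst):
    """Probability of a (best, worst) pair when worst is chosen first.

    Step 1: worst is chosen from the full set using negated worst parameters.
    Step 2: best is chosen from the remaining items using best parameters.
    """
    p_w = softmax([-beta_worst[i] for i in choice_set])[choice_set.index(worst)]
    remaining = [i for i in choice_set if i != worst]
    p_b = softmax([beta_best[i] for i in remaining])[remaining.index(best)]
    return p_w * p_b

# Illustrative values only: separate parameters for the best and worst steps.
choice_set = [0, 1, 2]
beta_best = [0.0, 1.0, 2.0]
beta_worst = [0.0, 0.5, 3.0]

p = p_worst_then_best(choice_set, best=2, worst=0,
                      beta_best=beta_best, beta_worst=beta_worst)

# The probabilities over all ordered (best, worst) pairs sum to one.
total = sum(
    p_worst_then_best(choice_set, b, w, beta_best, beta_worst)
    for b in choice_set for w in choice_set if b != w
)
print(p, total)
```

Setting beta_worst equal to beta_best gives a worst-first sequential model with common parameters. Letting the two vectors differ is what the single sign-flipped analysis cannot represent.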

So we have another reason not to add a MaxDiff package to the R library.  However, I do not believe that the paper takes its argument to its logical conclusion.  The example from their analysis is a study of hair concerns.  They ask respondents to indicate their greatest and least concerns in choice sets of 5 alternatives drawn from a complete list of 15 concerns.  For example, which of the following items are you the most and least concerned about?

  • My hair is coming out more than it used to
  • My graying hair is unflattering
  • My hair is dry
  • I have unruly, unmanageable hair
  • My hair is stiff and resistant

Now I ask the reader, when does anyone need to make such a choice in the real world?  The authors admonish us to "stop and think about whether the tool used to collect data on preferences makes sense."  What if my hair is unmanageable because it is dry?  Trade-offs work when they are forced upon us by real constraints in the world.  When trade-offs are artificial, they persuade us to attend to features and give weight to information that we would not have considered during the actual purchase process.

Now, if preferences were well-formed and established, then perhaps such an unrealistic choice task would still reveal my "true" preferences.  But if preferences are constructed in order to solve the choice task at hand, then little that I learn in this study can be generalized to the marketplace.  Context and order effects are robust and common in choice modeling because we do not store fully-formed preferences for the thousands of features that we will encounter given all that we purchase over time.  Instead, we use the choice task to assist us in the construction of preference for the particular situation in which we find ourselves (situated cognition).  Only when the choice task mimics the real world can we learn anything of value.  Even then, we may have problems because experiments and data collection are always obtrusive.

A previous post, Incorporating Preference Construction into the Choice Modeling Process, provides a more detailed discussion of these issues. 
