Sunday, April 3, 2016

When Choice Modeling Paradigms Collide: Features Presented versus Features Perceived

What is the value of a product feature? Within a market-based paradigm, the answer is the difference between revenues with and without the feature. A product can be decomposed into its features, each feature can be assigned a monetary value by including price in the feature list, and the final worth of the product is a function of its feature bundle. The entire procedure is illustrated in an article using the function rhierMnlMixture from the R package bayesm (Economic Valuation of Product Features). Although much of the discussion concentrates on a somewhat technical distinction between willingness-to-pay (WTP) and willingness-to-buy (WTB), I wish to focus instead on the digital camera case study in Section 6 beginning on page 30. If you have questions about how you might run such an analysis in R, I have two posts that might help: Let's Do Some Hierarchical Bayes Choice Modeling in R and Let's Do Some More Hierarchical Bayes Choice Modeling in R.
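To make "revenues with and without the feature" concrete, here is a minimal sketch of the arithmetic under a plain multinomial logit. It is written in Python only to keep it self-contained. Every number in it (the part-worths, the prices, the market size) is invented for illustration and is not an estimate from the article, and holding all prices fixed is a simplifying assumption of mine. The names mnl_shares and focal_revenue are hypothetical helpers, not functions from bayesm.

```python
import math

def mnl_shares(utilities):
    """Multinomial logit choice probabilities for one set of alternatives."""
    top = max(utilities)  # subtract the max for numerical stability
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

# Hypothetical part-worths, invented for illustration (not the article's estimates).
BETA_SWIVEL = 0.6               # utility added by the swivel screen
BETA_PRICE = -0.01              # utility per dollar of price
BRAND = [0.5, 0.3, 0.1, 0.0]    # focal brand first, then three rivals
NO_PURCHASE = 0.0               # utility of buying nothing

def focal_revenue(has_swivel, prices=(200.0, 200.0, 200.0, 200.0), market_size=1000):
    """Expected revenue of the focal brand (index 0) with all prices held fixed."""
    utils = [b + BETA_PRICE * p for b, p in zip(BRAND, prices)]
    utils[0] += BETA_SWIVEL * has_swivel   # only the focal brand gets the feature
    utils.append(NO_PURCHASE)              # outside option: buy none of the brands
    share = mnl_shares(utils)[0]
    return share * prices[0] * market_size

# Market-based view: revenue with the feature minus revenue without it.
revenue_gain = focal_revenue(1) - focal_revenue(0)

# Conventional WTP view: ratio of the feature part-worth to the price coefficient.
wtp = -BETA_SWIVEL / BETA_PRICE

print(round(revenue_gain, 2), round(wtp, 2))
```

The two numbers answer different questions. The ratio of part-worths is a per-person dollar figure, while the revenue difference depends on the competing offers and on the option of buying nothing.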

The study varies seven factors, including price, but the goal is to estimate the economic return from including a swivel screen on the back of the digital camera. Following much the same procedure as that outlined in the two choice modeling posts mentioned in the last paragraph, each respondent saw 16 hypothetical choice sets created using a fractional factorial experimental design. There was a profile associated with each of the four brands, and respondents were asked first to select the one they most preferred and then whether they would buy their most preferred brand at a given price.
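The two-step question can be turned into a single outcome per choice set in more than one way. One common coding, assumed here rather than taken from the article, treats a "would not buy" answer as choosing an outside option. The sketch below is Python with made-up brand labels and answers. The helper combine_dual_response is hypothetical.

```python
NONE = "none"  # label for the outside (no-purchase) option

def combine_dual_response(preferred, would_buy):
    """Collapse the two answers into one outcome: the brand, or the outside option."""
    return preferred if would_buy else NONE

# Hypothetical answers from one respondent to three of the 16 choice sets:
# (most preferred brand, would buy it at the stated price?)
answers = [("brand_A", True), ("brand_C", False), ("brand_B", True)]

outcomes = [combine_dual_response(p, b) for p, b in answers]
print(outcomes)
```

Note that the forced first answer ("brand_C" in the second set) is still collected even though the respondent would not buy, which is exactly the information a real shopper would not bother to produce.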

The term "dual response" has become associated with this approach, and several choice modelers have adopted the technique. If the value of the swivel screen is well-defined, it ought not to matter how you ask these questions, and that seems to be confirmed by some in the choice modeling community. However, outside the laboratory and in the field, commitment or stated intention is the first step toward behavior change. Furthermore, the mere-measurement effect in survey research demonstrates that questioning by itself can alter preferences. Within the purchase context, consumers do not waste effort deciding which of the rejected alternatives is the least objectionable (i.e., the best product they would not buy) by attending to secondary features after an alternative has failed to achieve consideration on one or more deal breakers. Actually, dual response originated as a sales technique because encouraging commitment to one of the offerings increases the ultimate purchase likelihood.

We have our first collision. Order effects are everywhere; they are among the most robust findings in measurement. The political pollster wants to know how big a sales tax increase could be passed in the next election. You get a different answer when you ask about a one-quarter percent increase followed by one-half percent than when you reverse the order. Perceptual contrast is unavoidable, so one-half seems bigger after the one-quarter probe. I do not need to provide a reference because everyone is aware of order as one of the many context effects. The feature presented is not the feature perceived.

Our second collision comes from the introduction of price as just another feature, as if in the marketplace no one ever asks why one brand is more expensive than another. We ask because price is both a sacrifice with a negative impact and a signal of quality with a positive weight. In fact, as one can see from the pricing literature, there is nothing simple or direct about price perception. Careful framing may be needed (e.g., maintaining package size but reducing the amount without changing price). Otherwise, the reactions can be quite dramatic, because price increases can trigger attributions concerning the underlying motivation and can generate a strong emotional response (e.g., price fairness).


At times, the relationship between the feature presented and the feature perceived can be more nuanced. It would be reasonable to vary gasoline prices in terms of cost per unit of measurement (e.g., dollars per gallon or euros per liter). Yet, the SUV driver seems to react in an all-or-none fashion only when some threshold on the cost to fill up their tank has been exceeded. What is determinant is not the posted price but the total cost of the transaction. Thus, price sensitivity is a complex nonlinear function of cost per unit depending on how often one fills up with gasoline and the size of that tank. In addition, the pain at the pump depends on other factors that fail to make it into a choice set. How long will the increases last? Are higher prices seen as fair? What other alternatives are available? Sometimes we have no option but to live with added costs, reducing our dissonance by altering our preferences.
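The all-or-none reaction described above can be sketched as a threshold on the cost of a fill-up rather than on the posted price. The Python below is only an illustration. The tank sizes, the prices, and the 75-dollar "pain threshold" are assumptions I made up, and reacts is a hypothetical helper.

```python
def fill_up_cost(price_per_gallon, tank_gallons):
    """Total cost of the transaction: what the driver actually pays at the pump."""
    return price_per_gallon * tank_gallons

def reacts(price_per_gallon, tank_gallons, pain_threshold=75.0):
    """All-or-none response: True only once the fill-up cost exceeds the threshold."""
    return fill_up_cost(price_per_gallon, tank_gallons) > pain_threshold

# Same posted prices, two hypothetical vehicles: a 26-gallon SUV and a 12-gallon compact.
for price in (2.50, 3.00, 3.50):
    print(price, reacts(price, tank_gallons=26), reacts(price, tank_gallons=12))
```

With these assumed numbers the SUV driver crosses the threshold between 2.50 and 3.00 dollars per gallon while the compact driver never does, so a single linear price coefficient would describe neither of them well.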

We see none of this reasoning in choice modeling where the alternatives are described as feature bundles outside of any real context. The consumer "plays" the game as presented by the modeler. Repeating the choice exercise with multiple choice sets only serves to induce a "feature-as-presented" bias. Of course, there are occasions when actual purchases look like choice models. We can mimic repetitive purchases from the retail shelf with a choice exercise, and the same applies to online comparison shopping among alternatives described by short feature lists as long as we are careful about specifying the occasion and buyers do not search for user comments.

User comments bring us back to the usage occasion, which tends to be ignored in choice modeling. Reading the comments, we note that one customer reports the breakage of the hinge on the swivel screen after only a few months. Is the swivel screen still an advantage or a potential problem waiting to occur? We are not buying the feature, but the benefit that the feature promises. This is the scene of another paradigm collision. The choice modeler assumes that features have value that can be elicited by merely naming the feature. They simplify the purchase task by stripping out all contextual information. Consequently, the resulting estimates work within the confines of their preference elicitation procedures, but do not generalize to the marketplace.

We have other options in R, as I have suggested in my last two posts. Although the independent variables in a choice model are set by the researcher, we are free to transform them, for instance, by taking the logarithm of price or fitting low-order polynomials of the original features. We are free to go further. Perceived features can be much more complex and constructed as nonlinear latent variables from the original data. For example, neural networks enable us to handle a feature-rich description of the alternatives and fit adaptive basis functions with hidden layers.
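As a small illustration of the simplest of these transformations, the sketch below expands a presented price into a log term and low-order polynomial terms before it would enter a choice model. It is Python with hypothetical names (price_basis, reference_price), and centering price on a reference value is my assumption, made only to keep the polynomial columns on a comparable scale.

```python
import math

def price_basis(price, reference_price, degree=2):
    """Return transformed price columns for one alternative in a choice set."""
    # Relative deviation from a reference price, e.g. 0.25 means 25 percent above it.
    centered = (price - reference_price) / reference_price
    columns = {"log_price": math.log(price)}
    for d in range(1, degree + 1):
        columns[f"price_poly{d}"] = centered ** d
    return columns

# Hypothetical price levels from a design, expanded into model columns.
prices = [150.0, 200.0, 250.0]
rows = [price_basis(p, reference_price=200.0) for p in prices]
for p, row in zip(prices, rows):
    print(p, row)
```

These columns are still fixed functions chosen by the analyst. The adaptive basis functions of a neural network differ in that the shape of the transformation is itself learned from the data.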

On the other hand, I have had some success exploiting the natural variation within product categories with many offerings (e.g., recommender systems for music, movies, and online shopping like Amazon). By embedding measurement within the actual purchase occasion, we can learn the when, why, what, how and where of consumption. We might discover the limits of a swivel screen in bright sunlight or when only one hand is free. The feature that appeared so valuable when introduced in the choice model may become a liability after reading users' comments.

Features described in choice sets are not the same features that consumers consider when purchasing and imagining future usage. This more realistic product representation requires that we move from those R packages that restrict the input space (choice modeling) to those R packages that enable the analysis of high-dimensional sparse matrices with adaptive basis functions (neural networks and matrix factorization).
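To give a sense of what matrix factorization does with a sparse matrix, here is a toy sketch in plain Python that learns a few latent dimensions for users and items from only the observed cells. It is a bare-bones stochastic gradient version written for illustration, not the algorithm of any particular R package. The ratings, the rank, and the tuning values are all made up.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, steps=500, lr=0.02, reg=0.02, seed=0):
    """Fit user factors P and item factors Q so that P[u] . Q[i] approximates ratings[(u, i)]."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), r in ratings.items():      # only observed cells are visited
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted value for user u and item i: the dot product of their latent factors."""
    return sum(pk * qk for pk, qk in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error over the observed cells."""
    sq = [(r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items()]
    return (sum(sq) / len(sq)) ** 0.5

# Hypothetical sparse data: 4 users, 5 items, 12 of the 20 cells observed.
ratings = {(0, 0): 5, (0, 1): 4, (0, 3): 1, (1, 0): 4, (1, 2): 5, (1, 4): 1,
           (2, 1): 1, (2, 3): 5, (2, 4): 4, (3, 2): 1, (3, 3): 4, (3, 4): 5}

P0, Q0 = factorize(ratings, 4, 5, steps=0)   # untrained starting point
P, Q = factorize(ratings, 4, 5)              # trained factors
print(rmse(ratings, P0, Q0), rmse(ratings, P, Q))
print(predict(P, Q, 0, 2))                   # a cell that was never observed
```

The latent dimensions are not features named by the researcher. They are constructed from the pattern of what people actually chose or rated, which is the sense in which the input space is no longer restricted.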

Bottom Line: The data collection process employed to construct and display options when repeated choice sets are presented one after another tends to simplify the purchase task and induce a decision strategy consistent with regression models we find in several R packages (e.g., bayesm, mlogit, and RChoice). However, when the purchase process involves extensive search over many offerings (e.g., music, movies, wines, cheeses, vacations, restaurants, cosmetics, and many more) or multiple usage occasions (e.g., work, home, daily, special events, by oneself, with others, involving children, time of day, and other contextual factors), we need to look elsewhere within R for statistical models that allow for the construction of complex and nonlinear latent variables or hidden layers that serve as the derived input for decision making (e.g., R packages for deep learning or matrix factorization).