Monday, June 22, 2015

Looking for Preference in All the Wrong Places: Neuroscience Suggests Choice Model Misspecification

At its core, choice modeling is a utility estimating machine. Everything has a value reflected in the price that we are willing to pay in order to obtain it. Here is a collection of Smart Watches from a search of Google Shopping. You are free to click on any one, look for more, or opt out altogether and buy nothing.

Where is the utility? It is in the brand name, the price, the user ratings and any other feature that gets noticed. If you pick only the Apple Smartwatch at its lowest price, I conclude that brand and price have high utility. It is a somewhat circular definition: I know that you value the Apple brand because you choose it, and you pick the Apple watch because the brand has value. We seem to be willing to live with such circularity as long as utility measured in one setting can be generalized over occasions and conditions. However, context matters when modeling human judgment and choice, making generalization a difficult endeavor. Utility theory is upended when higher prices alter perceptions so that the same food tastes better when it costs more.

What does any of this have to do with neuroscience? Utility theory was never about brain functioning. Glimcher and Fehr make this point in their brief history of neuroeconomics. Choice modeling is an "as if" theory claiming only that decision makers behave as if they assigned values to features and selected the option with the optimal feature mix.
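A minimal sketch of what the "as if" account amounts to, assuming an additive utility and a multinomial logit choice rule. The brands other than Apple, the part-worth values and the function names are all hypothetical, chosen only to illustrate the mechanics; nothing here is estimated from data.

```python
import math

# Hypothetical part-worths (illustrative numbers, not estimates from any study).
PART_WORTHS = {
    "brand": {"Apple": 1.2, "BrandB": 0.4, "BrandC": 0.0},
    "price_per_dollar": -0.005,
    "rating_per_star": 0.3,
}

def utility(option):
    """Additive utility: the sum of the part-worths of an option's features."""
    return (PART_WORTHS["brand"][option["brand"]]
            + PART_WORTHS["price_per_dollar"] * option["price"]
            + PART_WORTHS["rating_per_star"] * option["rating"])

def choice_probabilities(options, none_utility=0.0):
    """Multinomial logit shares, including a 'buy nothing' alternative."""
    utils = {name: utility(opt) for name, opt in options.items()}
    utils["none"] = none_utility
    top = max(utils.values())  # subtract the max for numerical stability
    expu = {name: math.exp(u - top) for name, u in utils.items()}
    total = sum(expu.values())
    return {name: e / total for name, e in expu.items()}

# A hypothetical three-watch display plus the opt-out.
watches = {
    "apple_watch": {"brand": "Apple", "price": 349, "rating": 4.5},
    "brand_b_watch": {"brand": "BrandB", "price": 199, "rating": 4.0},
    "brand_c_watch": {"brand": "BrandC", "price": 59, "rating": 3.0},
}
shares = choice_probabilities(watches)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {share:.3f}")
```

Note that the sketch has no place for context: the same features always yield the same utilities, which is exactly the assumption questioned in the rest of this post.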

When the choice task has been reduced to a set of feature comparisons, as is common practice in choice modeling, the process seems to work, at least in the laboratory (i.e., care must be taken to mimic the purchase process, and adjustment may be needed when making predictions about real-world market shares). Yet, does this describe what one does when looking at the above product display from Google Shopping? I might try to compare the ingredients listed on the back of two packages while shopping in the supermarket. However, most of us find that this task quickly becomes too difficult as the number of features exceeds our short-term memory limits (paying attention is costly).

Incorporating Preference Construction

Neuroeconomics suggests how value is constructed on the fly in real-world choice tasks. Specifically, reinforcement learning is supported by multiple systems within the brain: "dual controllers" for both the association of features and rewards (model-free utilities) and the active evaluation of possible futures (model-based search). Daw, Niv and Dayan identified the two corresponding regions of the brain and summarized the supporting evidence back in 2005.
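The distinction between the two controllers can be made concrete with a toy sketch. This is only an illustration under simplifying assumptions, not the model from Daw, Niv and Dayan: the model-free controller is reduced to a cached value nudged by prediction error, the model-based controller to a depth-limited search over a small tree of possible futures, and every state name, reward and parameter below is hypothetical.

```python
def model_free_update(cached, feature, reward, alpha=0.1):
    """Model-free: move the cached value of a feature toward the reward received."""
    cached = dict(cached)  # return a new dict rather than mutating the input
    old = cached.get(feature, 0.0)
    cached[feature] = old + alpha * (reward - old)
    return cached

def model_based_value(state, transitions, rewards, depth):
    """Model-based: look ahead through possible futures and take the best path."""
    if depth == 0 or state not in transitions:
        return rewards.get(state, 0.0)
    return rewards.get(state, 0.0) + max(
        model_based_value(nxt, transitions, rewards, depth - 1)
        for nxt in transitions[state]
    )

# Model-free: ten rewarding encounters tie value directly to the feature.
cached = {}
for _ in range(10):
    cached = model_free_update(cached, "apple_logo", reward=1.0)

# Model-based: a hypothetical tree of futures for a smartwatch shopper.
transitions = {
    "browse": ["buy_watch", "leave"],
    "buy_watch": ["use_for_running", "leave_in_drawer"],
}
rewards = {"use_for_running": 1.0, "leave_in_drawer": -0.5, "leave": 0.0}
shallow = model_based_value("browse", transitions, rewards, depth=1)
deep = model_based_value("browse", transitions, rewards, depth=2)
```

In this toy setup the shallow search sees no reason to buy, while the deeper search finds the imagined use that makes the purchase worthwhile. The cached value requires no search at all, which is the sense in which the reward is read directly off the feature.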

Features can become directly tied to value so that the reward is inferred immediately from the presence of the feature. Moreover, if we think of choice modeling only as the final stages when we are deciding among a small set of alternatives in a competitive consideration set, we might mistakenly conclude that utility maximization describes decision making. As in the movies, we may wish to "flashback" to the beginning of the purchase process to discover the path that ended at the choice point where features seem to dominate the representation.

Perception, action and utility are all entangled in the wild, as shown by the work of Gershman and Daw. Attention focuses on the most or least desirable features in the context of the goals we wish to achieve. We simulate the essentials of the consumption experience and ignore the rest. Retrospection is remembering the past, and prospection is experiencing the future. The steak garners greater utility sizzling than raw because it is easier to imagine the joy of eating it.

While the cognitive scientist wants to model the details of this process, the marketer will be satisfied learning enough to make the sale and keep the customer happy. In particular, marketing tries to learn what attracts attention, engages interest and consideration, generates desire and perceived need, and drives purchase while retaining customers (i.e., the AIDA model of attention, interest, desire and action). These are the building blocks from which value is constructed.

Choice modeling, unfortunately, can identify the impact of features only within the confines of a single study, and it encounters difficulties when attempting to generalize any effects beyond the data collected. Many of us are troubled that even relatively minor changes can alter the framing of the choice task or direct attention toward a previously unnoticed aspect (Attention and Reference Dependence).

The issue is not one of data collection or statistical analysis. The R package support.BWS will assist with the experimental design, and other R packages such as bayesm, RSGHB and ChoiceModelR will estimate the parameters of a hierarchical Bayes model. No, the difficulty stems from needing to present each respondent with multiple choice scenarios. Even if we limit the number of choice sets that any one individual will evaluate, we are still forced to simplify the task in order to show all the features for all the alternatives in the same choice set. In addition, multiple choice sets impose some demands for consistency, so that respondents prefer a choice strategy that can be reused with little effort, just to get through the questionnaire. On the other hand, costly information search is eliminated, and there is no anticipatory regret or rethinking one's purchase since there is no actual transfer of money. In the end, our choice model is misspecified for two reasons: it does not include the variables that drive purchase in real markets, and the reactive effects of the experimental arrangements create confounding effects that do not occur outside of the study.

Measuring the Forces Shaping Preference

Consumption is not random but structured by situational need and usage occasion. "How do you intend to use your Smartwatch?" is a good question to ask when you begin your shopping, although we will need to be specific because small differences in usage can make a large difference in what is purchased. To be clear, we are not looking for well-formed preferences, for instance, feature importance or contribution to purchase. Instead, we focus on attention, awareness and familiarity that might be precursors or early phases of preference formation. If you own an iPhone, you might never learn about Android Wear. What, if anything, can we learn from the apps on your Smartphone?

I have shown how the R package NMF for nonnegative matrix factorization can uncover these building blocks. We might wish to think of NMF as a form of collaborative filtering, not unlike a recommender system that partitions users into cliques or communities and products into genres or types (e.g., sci-fi enthusiasts and the fantasy/thrillers they watch). An individual pattern of awareness and familiarity is not very helpful unless it is shared by a larger community with similar needs arising from common situations. Product markets evolve over time by appealing to segments willing to pay more for differentiated offerings. In turn, the new offerings solidify customer differences. This separation of products and users into segregated communities forms the scaffolding used to construct preferences, and this is where we should begin our research and statistical analysis.
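To make the factorization concrete, here is a plain-Python sketch of the multiplicative-update rule of Lee and Seung for the squared-error objective, one standard NMF algorithm. It is not the code of the R package NMF, which offers several algorithms of its own, and the small awareness matrix (rows are respondents, columns are products, 1 means "aware of") is invented purely for illustration.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor nonnegative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W'V) / (W'WH), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (VH') / (WHH'), elementwise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def squared_error(V, W, H):
    """Sum of squared differences between V and its reconstruction WH."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, R) for v, r in zip(rv, rr))

# Hypothetical awareness data: two communities, each aware of its own products.
V = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
W, H = nmf(V, rank=2)
for row in W:
    print([round(w, 2) for w in row])  # respondent weights on each building block
```

The rows of H describe the product groupings and the rows of W describe how strongly each respondent belongs to each grouping, so users and products are partitioned at the same time.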


  1. Hi Joel,

    Thanks for this. To be clear, are you advocating feeding the estimated utilities into the NMF, or are you suggesting using NMF prior to (or instead of) choice modeling?

    Furthermore, what type of survey data can NMF accommodate? Interval data (like ratings) or binary (multi-select) or both? Does the data frame need to be just raw data or would it need to be accommodated prior to running the NMF?



    1. Never mind the second set of questions. Your link at the end of the post answers that.

    2. When choice modeling does not describe the decision process, we should search for another model that does a better job (e.g., matrix factorization used by recommender systems).