Friday, April 10, 2015

Modeling Categories with Breadth and Depth

Religion is a categorical variable with followers differentiated by their degree of devotion. Liberals and conservatives check their respective boxes when surveyed, although moderates from each group sometimes seem more alike than their more extreme compatriots. All smartphone users might be classified as belonging to the same segment, yet the infrequent user is distinct from the intense user who cannot take their eyes off the screen. Both of us have the flu, but you are really sick. My neighbor and I belong to a cluster called dog-owners. However, my dog is a pet on a strict allowance and theirs is a member of the family with no apparent limit on expenditures. There seems to be more structure underlying such categorizations than is expressed by a zero-one indicator of membership.

Plutchik's wheel of emotion offers a concrete example illustrating the breadth and depth of affective categories spanning the two dimensions defined by positive vs. negative and active vs. passive. The concentric circles in the diagram below show the depth of the emotion with the most intense toward the center. Boredom, disgust and loathing suggest an increasing activation of a similar type of negative response between contempt and remorse. Breadth is varied as one moves around the circle traveling through the entire range of emotions. When we speak of opposites such as flight (indicated in green as apprehension, fear and terror) or fight (the red emotions of annoyance, anger and rage), we are relating our personal experiences with two contrasting categories of feelings and behaviors. Yet, there is more to this categorization than is expressed by a set of exhaustive and mutually exclusive boxes, even if the boxes are called latent classes.



The R statistical language approaches such structured categories from a number of perspectives. I have written two posts on archetypal analysis. The first focused on the R package archetypes and demonstrated how to use those functions to model customer heterogeneity by identifying the extremes and representing everyone else as convex combinations of those end points. The second post argued that we tend to think in terms of contrasting ideals (e.g., liberal versus conservative) so that we perceive entities to be more different in juxtaposition than they would appear on their own.
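The idea of representing everyone else as a convex combination of extremes can be sketched in a few lines. This is only an illustration written in Python with the standard library, not the R archetypes package: the three archetype profiles, the measure values and the function name convex_combination are all made up for the example, and a real analysis would estimate both the archetypes and the weights from data.

```python
# Hypothetical archetypes (extreme profiles) on three made-up usage measures.
archetypes = {
    "heavy_user": [10.0, 8.0, 9.0],
    "light_user": [1.0, 0.0, 2.0],
    "specialist": [9.0, 0.0, 1.0],
}


def convex_combination(weights, archetypes):
    """Mix archetype profiles using nonnegative weights that sum to one."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    n_measures = len(next(iter(archetypes.values())))
    profile = [0.0] * n_measures
    for name, w in weights.items():
        for j, value in enumerate(archetypes[name]):
            profile[j] += w * value
    return profile


# A respondent who is half heavy user, a quarter light user, a quarter specialist.
respondent = convex_combination(
    {"heavy_user": 0.5, "light_user": 0.25, "specialist": 0.25}, archetypes
)
print(respondent)  # [7.5, 4.0, 5.25]
```

The weights play the role of degree of membership: a respondent sitting exactly on an archetype has weight one on it and zero elsewhere, while everyone in between is a blend.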

Latent variable mixture models provide another approach to the modeling of unobserved constructs with both type and intensity, at least for those interested in psychometrics in R. The intensity measure for achievement tests is item difficulty, and differential item functioning (DIF) is the term used by psychometricians to describe items with different difficulty parameters in different groups of test takers. The same reasoning applies to customers who belong to different segments (types) seeking different features from the same products (preference intensity). These are all mixture models in the sense that we cannot estimate one set of parameters for the entire sample because the data are a mixture of hidden groups with different parameters. R packages with "mix" as a prefix or suffix in their names (e.g., mixRasch or flexmix) signal this kind of mixture modeling.
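A toy calculation may help show why one set of parameters cannot serve a mixture of hidden groups. The Python sketch below is an illustration under strong assumptions, not what mixRasch or flexmix do internally: the item difficulties, the equal class sizes and the respondent's ability (theta) are treated as known and are invented for the example, whereas a real mixture model estimates all of them. It uses the standard Rasch form, where the probability of endorsing an item is the logistic function of ability minus difficulty.

```python
import math


def rasch_prob(theta, difficulty):
    """Rasch model: probability of a positive response to one item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def pattern_likelihood(responses, theta, difficulties):
    """Likelihood of a 0/1 response pattern, items assumed independent."""
    likelihood = 1.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        likelihood *= p if x == 1 else 1.0 - p
    return likelihood


def class_posteriors(responses, theta, class_difficulties, class_priors):
    """Bayes rule: posterior probability of each latent class."""
    joint = [
        prior * pattern_likelihood(responses, theta, diffs)
        for diffs, prior in zip(class_difficulties, class_priors)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Made-up difficulties: the first and third items show DIF across classes.
class_difficulties = [
    [-2.0, 0.0, 2.0],  # class A: item 1 easy, item 3 hard
    [2.0, 0.0, -2.0],  # class B: item 1 hard, item 3 easy
]
class_priors = [0.5, 0.5]  # assumed equal class sizes

posterior = class_posteriors([1, 1, 0], 0.0, class_difficulties, class_priors)
print([round(p, 3) for p in posterior])  # class A gets about 0.98
```

The same response pattern is likely under one class and unlikely under the other, so pooling the two groups would yield difficulty estimates that describe neither.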

R can also capture the underlying organization shaping categories through matrix factorization. This blog is filled with posts demonstrating how the R package NMF (nonnegative matrix factorization) easily decomposes a data matrix into the product of two components: (1) something that looks like factor loadings of the measures in the columns and (2) something similar to a soft or fuzzy clustering of respondents in the rows. Both of these components will be approximately block diagonal, once rows and columns are suitably ordered, when we have the type of separation we have been discussing. You can find examples of such matrix decompositions in a listing of previous posts at the end of my discussion of brand and product representation.
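For readers who want to see the decomposition itself, here is a minimal sketch of nonnegative matrix factorization using the well-known multiplicative update rules for squared error. It is written in plain Python for illustration and is not the R package NMF, which offers several algorithms and many more options. The tiny data matrix, the function names and the settings (rank 2, 200 iterations, a fixed random seed) are all assumptions chosen for the example.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rw in zip(V, WH) for v, wh in zip(rv, rw))


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor V (respondents x measures) into nonnegative W and H."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    initial_error = squared_error(V, W, H)
    for _ in range(n_iter):
        # Update H: H <- H * (W'V) / (W'WH)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        # Update W: W <- W * (VH') / (WHH')
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H, initial_error


# Made-up data: two respondent types, each varying in intensity (depth).
V = [
    [4.0, 2.0, 0.0, 0.0],
    [2.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 6.0],
    [0.0, 0.0, 1.0, 2.0],
]
W, H, initial_error = nmf(V, rank=2)
final_error = squared_error(V, W, H)
print("error before and after:", round(initial_error, 3), round(final_error, 6))
for row in W:
    print([round(w, 2) for w in row])  # respondent weights on each latent feature
```

In this arrangement H plays the part of the loadings for the columns and W the soft clustering of the rows. Within a type, the heavier respondent should receive the larger weight, which is how depth shows up alongside type.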

Consumer segments are structured by their use of common products and features in order to derive similar benefits. This is not an all-or-none process but a matter of degree specified by depth (e.g., usage frequency) and breadth (e.g., the variety and extent of feature usage). You can select almost any product category, and at one end you will find heavy users doing everything that can possibly be done with the product. As usage decreases, it falls off in clumps with clusters of features no longer wanted or needed. These are the latent features of NMF that simultaneously bind together consumers and the features they use. For product categories such as movies or music, the same process applies but now the columns are films seen or recordings heard. All of this may sound familiar to those of you who have studied recommendation systems or topic modeling, both of which can be run with NMF.
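Depth and breadth can be read directly off a usage table before any modeling is done. The short Python sketch below uses invented usage counts for three hypothetical consumers and one simple operationalization, which is an assumption rather than the only reasonable choice: depth as total usage and breadth as the number of distinct features used at least once.

```python
# Hypothetical usage counts for four product features (columns).
usage = {
    "heavy": [12, 9, 7, 5],
    "moderate": [6, 4, 0, 0],
    "light": [2, 0, 0, 0],
}


def breadth_and_depth(counts):
    """Breadth: number of features used at all. Depth: total usage."""
    breadth = sum(1 for c in counts if c > 0)
    depth = sum(counts)
    return breadth, depth


summary = {name: breadth_and_depth(counts) for name, counts in usage.items()}
print(summary)  # {'heavy': (4, 33), 'moderate': (2, 10), 'light': (1, 2)}
```

In this made-up table usage falls off in clumps: the moderate consumer drops the last two features together, and the light consumer keeps only the first.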
