Wednesday, October 28, 2015

Graphical Modeling Mimics Selective Attention: Customer Satisfaction Ratings


As shown by the eye tracking lines and circles, there is more on the above screenshot than we can process simultaneously. Visual perception takes time, and we must track where the eye focuses by recording sequence and duration. The "50% off" and the menu items seem to draw the most attention, suggesting that the viewers were not men.

But what if the screen contained a correlation matrix?

The 23 mobile phone customer satisfaction ratings from an earlier post will serve as an illustration. The R code to access the data, calculate the correlation matrix and produce the graph can be found at the end of this blog entry.


All the correlations are positive, so we might tend to focus on the most highly correlated pairs and then search for triplets with uniformly larger intercorrelations. Although 23x23 is not a particularly big matrix, it contains enough entries that uncovering a pattern is a difficult task.

Factor analysis is an option, yet it might impose more structure than desired. What if we believe that "overall quality" is an abstraction created from perceptions of the reliability and stability of the network and supportive services and not indicators reflecting the hidden presence of a latent quality dimension? That is, we want to maintain all those individual ratings and their separate pairwise connections as shown in the correlation matrix. Well, a graph might assist those of us with a short span of attention, for example, an undirected graph whose nodes are the individual ratings and whose edges represent the correlations between pairs of ratings.
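To make this concrete before looking at the actual graph, here is a minimal sketch (base R only, with toy numbers rather than the mobi data) of how a correlation matrix becomes the weighted adjacency matrix of such an undirected graph, zeroing out weak edges much as the minimum argument to qgraph() does when drawing:

```r
# Toy correlation matrix for three hypothetical ratings
r <- matrix(c(1.0, 0.6, 0.2,
              0.6, 1.0, 0.4,
              0.2, 0.4, 1.0), nrow = 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
adj <- r * (abs(r) >= 0.3)   # keep only edges with |r| >= .3
diag(adj) <- 0               # nodes have no self-loops
adj                          # A-B and B-C edges survive; A-C is dropped
```

Handing a weighted adjacency matrix like this to qgraph() draws one edge per nonzero entry, with thickness proportional to the weight.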


The green edges indicate positive values, and the largest correlations have the thickest paths. Though everything is interconnected, the graph aligns neighbors with the strongest connections. One could track your eye movements as you attempt to discover some spatial organization: overall satisfaction centered amidst quality and service with retention and recommendation pulled toward good value and fair price. Alternatively, you might have noted four regions: quality toward the right, innovative and range of products/services near the top, service and support on the top-right side, and the final word on value and loyalty in the bottom-right corner.

Hopefully, the eye tracking analogy clarifies that the interpretative process involved in making sense of the graph mimics the graphical modeling that factors or decomposes a complex network into groupings of local relationships. There are just too many pairwise relationships for the graph or the person to assimilate them all in a single glance. Selective attention deconstructs the perceptual field, in this case, mobile phone customer experiences with their cellular providers.

Of course, not all decompositions are equally helpful for customers deciding whether or not to continue with their current product and provider. We must remember that our consumer is not alone and that product purchase is not a solitary quest. In order to understand product reviews, marketing communications and user comments, the consumer tends to adopt the prevailing factorization shared by most in the market and shown in the above graph.

Finally, we have not chosen a factor analysis model because we do not believe in a directed graphical representation with hidden latent constructs generating the satisfaction ratings. To be clear, we could have run a factor analysis and identified factors. The factors, however, would be derivative and not generative. The undirected graph preserves the primacy of the separate ratings and represents the factors in the edges as local regions of higher connectivity.


R code to read the data, print the correlation matrix, and plot the correlation network map.

library("semPLS")
data(mobi)
 
# descriptive names for graph nodes
names(mobi)<-c("QualityExp",
               "MeetNeedsExp",
               "GoWrongExp",
               "OverallSat",
               "Fulfilled",
               "IsIdeal",
               "ComplaintHandling",
               "BuyAgain",
               "SwitchForPrice",
               "Recommend",
               "Trusted",
               "Stable",
               "Responsible",
               "Concerned",
               "Innovative",
               "OverallQuality",
               "NetworkQuality",
               "CustomerService",
               "ServiceQuality",
               "RangeProdServ",
               "Reliability",
               "ClearInfo",
               "FairPrice",
               "GoodValue")
 
# prints the correlation matrix (column 9, SwitchForPrice, is excluded)
round(cor(mobi[,-9]),2)
 
# plots the correlation network map
library("qgraph")
qgraph(cor(mobi[,-9]), layout="spring",
       labels=names(mobi[-9]), label.scale=FALSE,
       label.cex=1, node.width=.5, minimum=.3)

Created by Pretty R at inside-R.org

Tuesday, October 13, 2015

The Network Underlying Consumer Perceptions of the European Car Market


The nodes have been assigned a color by the author so that the underlying distinctions are more pronounced. Cars that are perceived as Economical (in aquamarine) are not seen as Sporty or Powerful (in cyan). The red edges connecting these attributes indicate negative relationships. Similarly, a Practical car (in light goldenrod) is not Technically Advanced (in light pink). This network of feature associations replicates both the economical to luxury and the practical to advanced differentiations so commonly found in the car market. North Americans living in the suburbs may need to be reminded that Europe has many older cities with less parking and narrower streets, which explains the inclusion of the city focus feature.

The data come from the R package plfm, as I explained in an earlier post where I ran a correspondence analysis using the same dataset and where I described the study in more detail. The input to the correspondence analysis was a cross tabulation of the number of respondents checking which of the 27 features (the nodes in the above graph) were associated with each of 14 different car models (e.g., Is the VW Golf Sporty, Green, Comfortable, and so on?).

I will not repeat those details, except to note that the above graph was not generated from a car-by-feature table with 14 car rows and 27 feature columns. Instead, as you can see from the R code at the end of this post, I reformatted the original long vector with 29,484 binary entries and created a data frame with 1092 rows, a stacking of the 14 cars rated by each of the 78 respondents. The 27 columns, on the other hand, remain binary yes/no associations of each feature with each car. One can question the independence of the 1092 rows given that respondent and car are grouping factors with nested observations. However, in order to illustrate the technique, we will assume that cars were rated independently and that there is one common structure for the 14-car European market. Now that we have the data matrix, we can move on to the analysis.
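The reshape is easier to trust after checking it on a toy vector; in this sketch the feature names and the 4-by-3 layout are made up, standing in for the actual 1092-by-27 data:

```r
# The long vector is stored attribute-fastest, so filling a matrix with
# nrow = number of attributes and transposing recovers one row per rating
v <- c(1, 0, 1,   0, 1, 1,   1, 1, 0,   0, 0, 1)  # 4 ratings x 3 attributes
rating_toy <- data.frame(t(matrix(v, nrow = 3, ncol = 4)))
names(rating_toy) <- c("Sporty", "Economical", "Practical")  # hypothetical
rating_toy   # 4 rows of binary yes/no associations
```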

As in the last post, we will model the associative net underlying these ratings using the IsingFit R package. I would argue that it is difficult to assert any causal ordering among the car features. Which comes first in consumer perception, Workmanship or High Trade-In Value? Although objectively trade-in value depends on workmanship, it may be more likely that the consumer learns first that the car maintains its value and then infers high quality. A possible resolution is to treat each of the 27 nodes as a dependent variable in its own regression equation with the remaining nodes as predictors. In order to keep the model sparse, IsingFit fits the logistic regressions with the R package glmnet.

For instance, when Economical is the outcome, we estimate the impact of the other 26 nodes including Powerful. Then, when Powerful is the outcome, we fit the same type of model with coefficients for the remaining 26 features, one of which is Economical. There is nothing guaranteeing that the two effects will be the same (i.e., Powerful's effect on Economical = Economical's effect on Powerful, controlling for all the other features). Since an undirected graph needs a symmetric affinity matrix as input, IsingFit checks to determine if both coefficients are nonzero (remember that sparse modeling yields lots of zero weights) and then averages the coefficients when Economical is in the Powerful model and Powerful is in the Economical model (called the AND rule).

Hastie, Tibshirani and Wainwright refer to this approach as "neighborhood-based" in their chapter on graph and model selection. Two nodes are in the same neighborhood when mutual relationships remain after controlling for everything else in the model. The red edge between Economical and Powerful indicates that each was in the other's equation and that their average was negative. IsingFit outputs the asymmetric weights in a matrix called asymm.weights (Res$weiadj is symmetric after averaging). It is always a good idea to check this matrix and determine if we are justified in averaging the upper and lower triangles.
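The AND rule itself takes only a few lines of base R to sketch. The 3-node weight matrix below is hypothetical, with B[i, j] holding node j's coefficient in node i's logistic regression:

```r
B <- matrix(c(0.0, 0.5, 0.0,
              0.3, 0.0, 0.2,
              0.0, 0.0, 0.0), nrow = 3, byrow = TRUE)
# AND rule: keep an edge only when both directed coefficients are
# nonzero, and average the pair; otherwise the edge weight is zero
both <- (B != 0) & (t(B) != 0)
W <- ifelse(both, (B + t(B)) / 2, 0)
W   # symmetric: edge 1-2 kept at 0.4; edges 1-3 and 2-3 dropped
```

The symmetric W is the sort of matrix returned in Res$weiadj and passed on to qgraph().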

It should be noted that the undirected graph is not a correlation network because the weighted edges represent conditional independence relationships and not correlations. You need only go back to the qgraph() function and replace Res$weiadj with cor(rating) or cor_auto(rating) in order to plot the correlation network. The qgraph documentation explains how cor_auto() checks to determine if a Pearson correlation is appropriate and substitutes a polychoric when all the variables are binary.

Sacha Epskamp provides a good introduction to the different types of network maps in his post on Network Model Selection Using qgraph. Larry Wasserman covers similar topics at an advanced level in his course on Statistical Machine Learning. There is a handout on Undirected Graphical Models along with two YouTube video lectures (#14 and #15). Wasserman raises some concerns about our ability to estimate conditional independence graphs when the data do not have just the right dependence structure (not too much and not too little), which is an interesting point of view given that he co-teaches the class with Ryan Tibshirani, whose name is associated with the lasso and sparse modeling.

# R code needed to reproduce the undirected graph
library(plfm)
data(car)
 
# car$data$rating is length 29,484
# 78 respondents x  14 cars x 27 attributes
# restructure as a 1092 row data frame with 27 columns
rating<-data.frame(t(matrix(car$data$rating, nrow=27, ncol=1092)))
names(rating)<-colnames(car$freq1)
 
# fits conditional independence model
library(IsingFit)
Res <- IsingFit(rating, family='binomial', plot=FALSE)
 
# Plot results:
library("qgraph")
# creates grouping of variables to be assigned different colors.
gr<-list(c(1,3,8,20,25), c(2,5,7,23,26), c(4,10,16,17,21,27), 
         c(9,11,12,14,15,18,19,22))
node_color<-c("aquamarine","lightgoldenrod","lightpink","cyan")
qgraph(Res$weiadj, fade = FALSE, layout="spring", groups=gr, 
       color=node_color, labels=names(rating), label.scale=FALSE, 
       label.cex=1, node.width=.5)

Thursday, October 8, 2015

The Graphical Network Associated with Customer Churn

The node representing "Will Not Stay" draws our focus toward the left side of the following undirected graph. Customers of a health care insurance provider were asked about their intentions to renew at the next sign-up period. We focus on those indicating the greatest potential for defection by creating a binary indicator separating those who say they will not stay from everyone else. In addition, before telling us whether or not they intended to switch health care providers, these customers were given a checklist and instructed to check all the events that recently occurred (e.g., price increases, higher prescription costs, provider not covering all expenses, hospital and doctor visits, and customer service contacts).

We should note that all we have are customer perceptions. There is no electronic record of price increases, claim rejections, direct billings by MDs or hospitals, a customer service contact, or doctor and hospital visits. That is, we do not have measures of the event occurrences that are independent of defection intention. Consequently, we have no justification for drawing an arrow from Premium Increases to Will Not Stay because the decision to churn impacts the willingness to check the Premium Up box. For example, everyone in the United States is likely to see some increase in their premiums, yet your willingness to check "yes" may depend on what else has occurred in your relationship with the insurance provider. Those wanting to remain dismiss the price increase as inflation or reframe it as essentially the same price, while those thinking of flight are more likely to take notice and offense. It might help to think of this as a form of cognitive dissonance or simply selective attention. Regardless of the specifics of the cognitive and affective processes, the result is an undirected graph in which every node is both an outcome and a predictor.

The thickness of the lines indicates the strength of the connections. These edges represent the relationship between nodes controlling for all the other nodes in the graph. A checklist was provided so that all we have is a data matrix with either yes (=1) or no (=0) entries. As I explained above, the only rating scale was dichotomized into Will Not Stay versus any other response. The data are proprietary so that all I can tell you is that there were more than a thousand customers, and each row was a profile of 11 binary variables coded zero or one. On the other hand, I can share the four lines of R code needed to run the analysis using the IsingFit R package and a data frame called "events2" with 11 columns and lots of rows containing only zeros and ones (see the end of this post). In addition, I can provide the link to a comprehensive overview of the methodology, A New Method for Constructing Networks from Binary Data. Those seeking more will find the notes from Sacha Epskamp's workshop very helpful.
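Because the data are proprietary, the following sketch only mimics the data preparation described above; the column names, the five-point intention scale, and the cut point are all invented for illustration:

```r
set.seed(42)
n <- 20
intent <- sample(1:5, n, replace = TRUE)   # pretend 5 = "will not stay"
events2_toy <- data.frame(
  WillNotStay  = as.numeric(intent == 5),  # dichotomized intention item
  PremiumUp    = rbinom(n, 1, 0.4),        # checklist items: 1 = checked
  DeductibleUp = rbinom(n, 1, 0.3))
# every column is coded 0/1, the form IsingFit() expects
sapply(events2_toy, function(x) all(x %in% 0:1))
```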

Getting back to our network, it seems that when Premiums go up, so do Deductibles and Co-pays. Cost increases form a clique near the bottom of the graph with edges suggesting that anticipated defection co-varies with price increases. A similar effect can be seen for prescription costs near the top. However, nothing seems to encourage exit more than a provider's failure to pay. Or, at least those who will not stay checked the box associated with the provider not paying. Moreover, we can observe some separation and independence in this undirected graph. Visits to the doctor, a specialist, or the hospital have positive connections to customer churn only through the receipt of a bill or a customer service contact.

Hopefully, this example demonstrates that a lot can be learned from an undirected graphical representation of dichotomous survey data. Bayesian networks, more correctly called directed graphs, seem to attract a good deal of attention in marketing (e.g., BayesiaLab), as do structural equation models (see my previous post on Undirected Graphs When the Causality Is Mutual). In fact, my first post in this blog, Network Visualization of Key Driver Analysis, demonstrates how much can be summarized quickly and clearly in an undirected graph. Another post, Metaphors Matters, compares factor analysis and correlation network maps.

To be clear, a graph displays an adjacency matrix that can contain any measure, often an index of association, affiliation or affinity. Any similarity or distance matrix can be graphed. Thus, we need to be careful when we interpret the resulting graphs. In this case, the adjacency matrix contained the averaged coefficients from a sparse logistic regression with each node as the dependent variable and all the remaining nodes as predictors. This means that our graph is not a correlation network because the adjacency matrix does not contain correlations. It is more like a partial correlation network, except that the adjacency matrix does not contain partial correlations but something that can be interpreted like a partial correlation. Fortunately, you can work with the graph as representing the relationship between two nodes controlling for the rest while you learn the details of Ising discrete data graphing.



### Fit using IsingFit ###
library(IsingFit)
Res <- IsingFit(events2, family='binomial', plot=FALSE)
 
# Plot results:
library("qgraph")
qgraph(Res$weiadj, fade = FALSE, layout="spring", 
       labels=names(events2), label.scale=FALSE, 
       label.cex=1, node.width=.5)


Monday, October 5, 2015

Undirected Graphs When the Causality Is Mutual

Structural equation models impose causal order on a set of observations. We start with a measurement model: a list of theoretical constructs and a table assigning what is observed (manifest) to what is hidden (latent). Although it is possible to think of this assignment as formative rather than reflective, the default is a causal connection with the latent variables responsible for the observed scores. Next, we draw arrows specifying the cause and effect relationships among the latent variables. All of this is shown in great detail with a customer satisfaction example in the very well-written vignette for the R package semPLS, which uses partial least squares (PLS) to fit structural equation models (SEM).

Your focus should be on the causal model and not the estimation technique. PLS is optional, and all the parameters can be estimated using maximum likelihood with the lavaan R package. However, you can get access to the dataset through the semPLS package, and you will not find a better description of this particular example or the steps involved in specifying and testing a SEM.

As always, there are issues. An earlier post raises a number of concerns with this tale of causal links, suggesting that we might be asked to assume too much when we impose a directionality on mutually interacting components. For example, when it requires effort to change product or service providers, it might be easier to believe that all competitors are the same and that it is futile to seek a better deal elsewhere. Here, the decision to Buy Again encourages us to rethink our dissatisfaction and raise the ratings above what would have been given had switching been easier. Such mutual dependencies are represented by undirected graphs, and for social scientists, the R package qgraph provides an introduction.

My goal in this post is a modest one: to demonstrate that one can learn a great deal from a series of customer ratings without needing to force the data into a causal model. This is achieved by examining the following partial correlation network.

You should recall that a graph is a visual display of some adjacency matrix. In this case we define adjacency as the partial correlation between two nodes after controlling for all the other nodes in the graph. Actually, our adjacency matrix is a bit more complicated because we applied the graphical lasso to obtain our estimates. The details are important, yet one can learn a great deal from the graph knowing little more than that the edges show us conditional association after removing the other nodes and that we have made some effort to eliminate as many edges as possible (a sparse undirected graph).
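For those who do want one of the details, the unregularized core of the calculation is short: partial correlations come from standardizing and sign-flipping the inverse of the correlation matrix, and EBICglasso() adds the lasso penalty and EBIC model selection on top of this. The 3-by-3 matrix below is a toy example, not the mobi correlations:

```r
R <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3)
P <- solve(R)                            # precision (inverse) matrix
pcor <- -P / sqrt(diag(P) %o% diag(P))   # standardize rows/columns, flip sign
diag(pcor) <- 1
round(pcor, 2)   # off-diagonal entries are partial correlations
```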

All the R code needed to replicate this analysis appears at the end of this post. One of the original 24 items, #9 SwitchForPrice, was removed because it had no edge to any of the other nodes in this partial correlation network (the semPLS documentation reveals that the question had a unique format).

One way to start is to identify the thickest edges connecting the remaining 23 customer perception, satisfaction and loyalty ratings. Unsurprisingly, good value and fair price "hang together" since endorsing one and rejecting the other would seem to be a contradiction. Similarly, stability is a key component of network quality, reliability defines service quality, and we do not recommend that which we are unwilling to buy again. These single edges connecting two ratings with common meanings may not be that informative.
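One way to make that first pass systematic is to sort the upper triangle of the weight matrix; the matrix and node names below are a toy stand-in for the actual sparse partial correlation matrix:

```r
W <- matrix(c(0.0, 0.6, 0.2,
              0.6, 0.0, 0.5,
              0.2, 0.5, 0.0), nrow = 3,
            dimnames = list(c("FairPrice", "GoodValue", "Recommend"),
                            c("FairPrice", "GoodValue", "Recommend")))
idx <- which(upper.tri(W), arr.ind = TRUE)     # one row per edge
edges <- data.frame(from   = rownames(W)[idx[, 1]],
                    to     = colnames(W)[idx[, 2]],
                    weight = W[idx])
edges[order(-edges$weight), ]   # thickest edges first
```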

What is interesting, however, is that we can read "the customer's mind" from the structure of the undirected graph.  First, all the quality measures form a grouping toward the left of the graph: stable, network quality, reliability, service quality, and overall quality. As we move toward the right, we encounter overall satisfaction along with its companion positive perceptions of trusted and fulfilled. In the region just above fall the product and service attributes with range of products and services, innovative, and customer service. Corporate responsibility is more toward the left with the loyalty measures below (e.g., buy again and recommend).

In general, expectations (go wrong, quality, and meet needs) are toward the top and behaviors near the bottom (complaint handling, recommend, and buy again). The most basic quality indicators are found on the left with the extras, such as good citizenship, appearing on the right (concerned, responsible, fair price, and good value).

Over time, customers form impressions and reach conclusions about the companies providing them goods and services. These attributions are mutually supportive and create a system of interdependencies that seeks an equilibrium. Disturbing that equilibrium anywhere within the system will have its consequences. A company that provides small incentives to current customers in order to encourage them to recruit new customers gets both the new customers and recommending customers with higher satisfaction and improved impressions. Recommendation is more than the result of a sequential causal process with satisfaction as an input. The incentive is an intervention with satisfaction as the outcome. The causality is mutual.


library("semPLS")
data(mobi)
 
# descriptive names for graph nodes
names(mobi)<-c("QualityExp",
              "MeetNeedsExp",
              "GoWrongExp",
              "OverallSat",
              "Fulfilled",
              "IsIdeal",
              "ComplaintHandling",
              "BuyAgain",
              "SwitchForPrice",
              "Recommend",
              "Trusted",
              "Stable",
              "Responsible",
              "Concerned",
              "Innovative",
              "OverallQuality",
              "NetworkQuality",
              "CustomerService",
              "ServiceQuality",
              "RangeProdServ",
              "Reliability",
              "ClearInfo",
              "FairPrice",
              "GoodValue")
 
library("qgraph")
 
# Calculates the sparse partial correlation matrix via the graphical lasso
# (n is the number of observations used to compute the correlations)
sparse_matrix<-EBICglasso(cor(mobi[,-9]), n=250)
 
# Plots results
ug<-qgraph(sparse_matrix, layout="spring", 
           labels=names(mobi[-9]), label.scale=FALSE,
           label.cex=1, node.width=.5)
