Comments on "Engaging Market Research: Identifying Pathways in the Consumer Decision Journey: Nonnegative Matrix Factorization"

Anonymous — 2015-05-05 17:13:
Cool, glad I have it down finally. One thing that bugged me about reading Skillicorn (I asked him about it, but will confess I did not understand the answer): I expected that if I ran NMF on a matrix and then ran it on the transpose of that same matrix, I would find the two factor matrices switched and transposed (as I thought he showed on page 174), but in fact that is not the case, and the matrices are completely different.

Joel Cadwell — 2015-05-05 17:07:
Well said! I have had problems trying to explain this point, and your comment will help future readers. Thanks.
Anonymous — 2015-05-05 16:46:
I have that book actually! How does this sound:

When the input is [person x attributes], W is [person x latent dimensions] and H is [latent dimensions x attributes]. Thus, to get the weightings of each attribute in each latent dimension, look at the columns of H (each attribute is a column, and the rows are the latent dimension strengths). This is what I think you have done in your post(?)

When the input is [terms x documents], W is [terms x latent dimensions (topics)] and H is [latent dimensions x documents]. Thus the columns of W describe the strength of each term for a given latent dimension (each column is a latent dimension), and the columns of H show the strength or mix of each latent dimension for a given document (each document is a column).

Joel Cadwell — 2015-05-05 15:45:
You can find a discussion in Chapter 8.1 of a book by David Skillicorn called Understanding Complex Datasets (2007). You may not see that section in Google Books, but there is a PDF of the book that you can find if you search for it.

Try to remember: W for the rows and H for the columns. The latent features are the columns of W and the rows of H. When the data matrix is person x measures of the person, the transpose of H yields something that looks like a matrix of factor loadings with simple structure, and W looks like a membership matrix from finite mixture models. Of course, you may need to rescale the values of W and H, since W*H reproduces the raw data matrix, which can be counts or any arbitrary non-negative intensity scale.
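The shape bookkeeping in the exchange above is easy to check directly. The blog post itself uses R's NMF package, but the same point can be sketched with scikit-learn in Python (hypothetical random data; only the matrix shapes matter here):

```python
# Shape check for the [person x attributes] case: W holds the row (person)
# profiles, H holds the column (attribute) profiles.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
V = rng.random((100, 12))           # 100 persons x 12 attributes, non-negative

model = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(V)          # persons x latent dimensions
H = model.components_               # latent dimensions x attributes

print(W.shape)  # (100, 3): rows of W look like mixture-model memberships
print(H.shape)  # (3, 12): each column of H gives one attribute's weights
                # on the latent dimensions, like factor loadings
```

Transposing the input swaps the roles: for a [terms x documents] matrix, W becomes terms x topics and H becomes topics x documents, which is the bookkeeping described in the comment.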
Anonymous — 2015-05-05 11:38:
Hey Joel, great stuff! I am very confused about which vectors (rows or columns of which matrix, W or H) to use for the latent factors, depending on whether the input matrix is (for example) documents x terms or terms x documents. I saw your other post, but are there any resources you have found that describe this?

Joel Cadwell — 2014-07-16 06:50:
Yes, NMF can be used as a pre-processing step for subsequent clustering or regression analysis. And yes, one can impose orthogonality constraints on the factors. You can find lots of examples using the basis matrix in the same way as principal component or factor scores. However, orthogonal NMF has been introduced primarily as a means of dealing with the non-uniqueness of the NMF solution, not because the columns of the basis matrix are collinear.

Anonymous — 2014-07-16 02:05:
To build on your example, say I want to look at factors associated with purchase likelihood, using, for example, a logistic regression.
I could have used the individual episodes (the original columns of your data matrix) as independent variables, but this isn't a particularly parsimonious representation, and so much detail may obscure the main story (e.g., online shoppers have a higher purchase likelihood than offline shoppers):

1) Can I use the columns of the basis matrix as independent variables in a regression?
2) When running NMF, can you choose to enforce orthogonality to avoid multicollinearity due to correlations between basis columns?

Thanks in advance!

Anonymous — 2014-07-15 10:52:
Thanks, this is super helpful!

Joel Cadwell — 2014-07-14 05:44:
Let me answer your last two questions first. NMF has become an established technique, as you can see from all the references on the Wikipedia page. The online textbook "The Elements of Statistical Learning" provides a good introduction. Li and Ding have a recent book chapter called "Non-negative Matrix Factorization for Clustering: A Survey" at http://users.cis.fiu.edu/~taoli/pub/home-new.html. Nicolas Gillis's "The Why and How of Nonnegative Matrix Factorization" at http://arxiv.org/pdf/1401.5226v2.pdf might also help. Ankur Moitra has a YouTube video called "New Algorithms for Nonnegative Matrix Factorization and Beyond" with slides on his website.

Now, deciding how many latent components is best depends on your definition of best. It is decided in the same manner as the number-of-clusters or number-of-factors question in cluster and factor analysis: are additional latent components meaningful and interpretable, or are they noise? Let's run several different ranks and take a look.
Section 2.6 of the NMF package's introductory vignette discusses these issues under the heading "Estimating the Factorization Rank," where rank is the number of components.

I hope this helps. Good luck.
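The "run several different ranks and take a look" advice can be sketched in code. The vignette referenced is for R's NMF package; the loop below is an equivalent scikit-learn sketch on simulated data, comparing reconstruction error across ranks:

```python
# Try several factorization ranks and compare reconstruction error.
# Error always falls as rank grows, so look for an elbow and then ask,
# as the comment does, whether the extra components are interpretable.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
V = rng.random((200, 20))  # hypothetical non-negative data matrix

errors = {}
for rank in range(1, 7):
    model = NMF(n_components=rank, init="nndsvda", random_state=0, max_iter=500)
    model.fit(V)
    errors[rank] = model.reconstruction_err_  # Frobenius norm of V - W*H

for rank, err in errors.items():
    print(rank, round(err, 3))
```

In practice one would also inspect the W and H matrices at each rank, since a lower error from an uninterpretable solution is not an improvement.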
Anonymous — 2014-07-14 02:15:
Hi, I would appreciate any comments on:
1) How do you decide how many latent components "best" describe the data?
2) Are there situations in which NMF may be inappropriate? I'm a little leery of applying techniques I don't have a deep mathematical understanding of to new situations.
3) What is the (intuitive) difference between applying NMF to a data matrix and, say, PCA or classical MDS?

Thanks!

Unknown — 2014-06-26 03:51:
Hi, great post as always. You're one of the few people who describe marketing analysis so well using R. Please keep this up! Cheers!

Joel Cadwell — 2014-06-16 06:48:
Correction made, thanks.

Roufa Therrien — 2014-06-15 07:49:
Hey Joel - I am an editor who is studying to become a data scientist, and I noticed a typo in the R program. The fourth-to-last item in the c() command should be "Voucher," not "Vocher." Very interesting article, by the way. Regards.
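The earlier question about using the basis matrix as regression input (the 2014-07-16 exchange) can also be sketched. Everything below is simulated and hypothetical, with scikit-learn standing in for R: the columns of W serve as independent variables in a logistic regression for purchase likelihood.

```python
# Sketch: NMF as pre-processing, with the basis matrix W used as
# predictors of a binary purchase outcome. Data are simulated.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
V = rng.random((300, 15))                 # 300 shoppers x 15 episode intensities
# Toy outcome loosely tied to the first episode column:
purchased = (V[:, 0] + rng.normal(0, 0.2, 300) > 0.5).astype(int)

W = NMF(n_components=4, init="nndsvda", random_state=0,
        max_iter=500).fit_transform(V)    # shoppers x 4 latent pathways

clf = LogisticRegression().fit(W, purchased)  # W columns as independent variables
print(clf.coef_.shape)                        # (1, 4): one coefficient per pathway
```

As Joel's reply notes, the columns of W are generally correlated rather than orthogonal, so the usual multicollinearity diagnostics still apply; orthogonal NMF exists mainly to address the non-uniqueness of the factorization, not to orthogonalize predictors.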