April 28th, 2013
Predictive analytics is frequently used to build decision support systems or to present items that are likely to be of interest to users. Pandora, Netflix, and Last.fm are some prominent examples of predictive analytics in the consumer space. There are primarily two mechanisms to do this: (i) content-based filtering, wherein we look at the internals of the item, or (ii) collaborative filtering, wherein we look at users’ behaviors and recommend what other “similar” users are finding to be of interest. Of course, we can also use a hybrid model that runs both underlying models and combines their results with some combination technique.
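As a toy illustration of the collaborative-filtering idea, here is a minimal user-based sketch in Python: it scores the items a user has not yet rated by the similarity-weighted ratings of other users. The user names, item names, and ratings are made up purely for illustration.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data, purely for illustration.
ratings = {
    "alice": {"item_a": 5, "item_b": 3, "item_c": 4},
    "bob":   {"item_a": 4, "item_b": 3, "item_c": 5, "item_d": 4},
    "carol": {"item_a": 1, "item_b": 5, "item_d": 2},
}

def cosine_similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=1):
    """Rank unseen items by similarity-weighted ratings from other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for item, r in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * r
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

print(recommend("alice", ratings))  # -> ['item_d']
```

Note that this sketch never inspects the items themselves; a content-based filter would instead compare item features, which is exactly the distinction drawn above.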
Recent research has focused on building better information filtering systems (a better content-based filtering system, a better collaborative filtering system, or a better combination model). If collaborative filtering is of interest, you can read my Computing Reviews review of “More reputable recommenders give more accurate recommendations?”
However, an equally important question is whether we can characterize which particular model is more likely to be effective in a given system. Further, can we characterize the systems in which collaborative filtering can be simplified to use just a set of “power users”?
April 22nd, 2013
We looked at the Kappa statistic previously, and I have been evaluating some aspects of it again.
To remind ourselves, the Kappa statistic is a measure of consistency amongst different raters, taking into account the agreement occurring by chance. The standard formula for the Kappa statistic is:

kappa = (total accuracy − random accuracy) / (1 − random accuracy)

where total accuracy is the observed fraction of agreement, and random accuracy is the agreement we would expect by chance from the marginal distributions.
Firstly, an observation that I omitted to make earlier: the value of the kappa statistic can indeed be negative. The total accuracy can be less than the random accuracy, and as the CMAJ letter by Juurlink and Detsky points out, this may indicate genuine disagreement, or it may reflect a problem in the application of a diagnostic test.
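To see this concretely, here is a small made-up 2×2 example in Python in which the two raters always disagree, driving kappa all the way to its floor of −1:

```python
# Hypothetical 2x2 confusion matrix (rows: rater 1, cols: rater 2)
# in which the two raters disagree on every case.
tp, fn, fp, tn = 0, 50, 50, 0
n = tp + fn + fp + tn

total_accuracy = (tp + tn) / n  # observed agreement: 0.0
# chance agreement from the marginal distributions
random_accuracy = ((tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)) / n ** 2

kappa = (total_accuracy - random_accuracy) / (1 - random_accuracy)
print(kappa)  # -1.0: perfect systematic disagreement
```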
Secondly, here is one thing to love about Kappa. Consider the case where one actual class is much more prevalent than the other. In such a case, a classification system that simply outputs the more prevalent class may have a high F1 measure (high precision and high recall), but will have a very low value of kappa. For example, suppose we are asked whether it will rain in Seattle, and consider the following confusion matrix:
This is essentially a null-hypothesis model, in the sense that it almost always predicts the class to be “T”. For this confusion matrix, precision is 0.9008 and recall is 0.9989. The F1 measure is 0.9473, which can give the impression that this is a useful model. The kappa value, however, is very low, at 0.0157, and gives a clear enough warning about the validity of this model.
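The original counts are not reproduced here, but the following assumed confusion matrix is consistent with the precision, recall, and F1 quoted above (its kappa works out to about 0.016, matching the quoted value up to rounding), and shows the computation end to end:

```python
# Assumed counts for a near-constant "it will rain" predictor in Seattle,
# chosen to be consistent with the precision/recall/F1 quoted above.
# tp: actual T, predicted T; fp: actual F, predicted T;
# fn: actual T, predicted F; tn: actual F, predicted F.
tp, fp, fn, tn = 908, 100, 1, 1
n = tp + fp + fn + tn

precision = tp / (tp + fp)                          # 0.9008
recall = tp / (tp + fn)                             # 0.9989
f1 = 2 * precision * recall / (precision + recall)  # 0.9473

total_accuracy = (tp + tn) / n
# chance agreement from the marginal distributions
random_accuracy = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
kappa = (total_accuracy - random_accuracy) / (1 - random_accuracy)
print(round(kappa, 4))  # 0.0156: near zero despite the high F1
```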
In other words, while it may be easy to predict rain in Seattle (or sunshine in Aruba), the kappa statistic tries to take away the bias in the actual class distribution, while the F1 measure does not.