The Kappa Tool is used to evaluate measurement systems that place items in categories, such as pass/fail. Below you'll learn how to use it for your own projects.
Under measurement systems analysis, we have different tools for different occasions, so let's take a look at what we've got. In measurement systems analysis, we've got gauge R&R and the intraclass correlation coefficient, which are well suited to continuous data collected from things like voltmeters, scales, rulers, or digital calipers.
We've also got the Kappa Tool. The Kappa Tool is what you use if you want to know whether you can reliably put things back into the same category. A really good example of that is warranty returns. Most companies will look at their warranty returns, make a Pareto chart of them, and assign engineering resources to fix the problems that are found. Well, if those problem categorizations might as well have been generated by a random number generator, then you can very easily misallocate your problem-solving resources.
So let's go to Kappa and take a look at that. We're going to have some samples; ordinarily we'd like a dozen or so, but this is just a little teaching example, so I'm going to have five samples, and you'll see the grid immediately expands for five samples. Let's leave this at four judges for right now, and let's use, say, four categories. Now, these categories have to be mutually exclusive, so I can put red, blue, yellow, and white, and then I have my judges independently, without any consulting, put things into the categories. So I've got five samples. Let's just say all four judges agree that the first one is red. Then let's say all four judges agree that the second one is blue, and let's say they split here: two say yellow, two say white. Let's say this one for some reason is four and one, and then on this one all four agree again, and then all we have to do is click Calculate.
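The walkthrough above can be sketched numerically. The tool's exact formula isn't stated in the transcript, so as an assumption this sketch uses Fleiss' kappa, a standard agreement statistic for multiple judges placing items into nominal categories. The rating matrix is an assumed reconstruction of the demo: each row is one sample, each column counts how many judges chose that category, and every row must sum to the number of judges (the "rows have to add across" check mentioned below). The sample 4 and sample 5 rows are guesses, since the transcript is ambiguous there.

```python
def fleiss_kappa(counts):
    """counts: one row per sample; each row holds one count per
    category and sums to the number of judges."""
    n_raters = sum(counts[0])   # judges per sample
    n_items = len(counts)       # number of samples
    total = n_raters * n_items  # total ratings given

    # Observed agreement: for each sample, the fraction of judge
    # pairs that agree, averaged over all samples.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the overall category proportions.
    k = len(counts[0])
    p_e = sum((sum(row[j] for row in counts) / total) ** 2 for j in range(k))

    return (p_bar - p_e) / (1 - p_e)

# Assumed reconstruction of the demo data: columns are
# red, blue, yellow, white; four judges per sample.
ratings = [
    [4, 0, 0, 0],  # sample 1: all four judges say red
    [0, 4, 0, 0],  # sample 2: all four say blue
    [0, 0, 2, 2],  # sample 3: split, two yellow / two white
    [0, 0, 1, 3],  # sample 4: one and three (assumed placement)
    [4, 0, 0, 0],  # sample 5: all four agree again (assumed red)
]
print(round(fleiss_kappa(ratings), 3))  # ~0.674: tolerable, not great
```

Under this reconstruction the score lands in the "not very good but tolerable" zone described next, well short of the .9 rule of thumb. If every row is unanimous (all four judges in one column for each sample), the function returns exactly 1.0, matching the perfect-agreement case at the end of the demo.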
This tells us that we've got not a very good system, but a tolerable one. The rule of thumb is that if your kappa score is .9 or above, you've got a system that works reasonably well. Oh, I made a mistake up here. Look at that, I've got a one there; my rows have to add across. Let's see if this changes things a lot. Not very much: we've still got a not-very-good but maybe tolerable system. So our gas gauge is up into the yellow. We'd like to see this at .9 or above. One means perfect agreement, and we don't have that here. If it's .9 or better, that would usually indicate that the measurement system doesn't need a lot of improvement. So let's just make this one and three and see what happens. That does improve it a bit, and then if I hit Delete here and put four here, we should get perfect agreement, and we do. So that's a very simple, easy-to-use tool: put things in categories and look at the kappa score.