Re: Re: Meaning of confusion matrix for OneR

by Polczynski, Mark

>> I applied OneR to Fisher’s iris dataset with 10-fold cross-validation and minBucketSize = 3.  Petal Width was the attribute that OneR chose, with splits at 0.8 and 1.65. The output below the rule says (144/150 instances correct).  When I look at the dataset, this checks out, with 4 virginica classified as versicolor, and 4 versicolor classified as virginica.  But the confusion matrix says that 6 virginica were classified as versicolor, and 6 versicolor as virginica.
>>
>> When I repeat this with minBucketSize = 6, the results are the same except the confusion matrix now says 2 virginica as versicolor and 7 versicolor as virginica.
>>
>> Why might this be?  I’m using Weka 3.7.

> 10-fold cross-validation generates 10 different models. The confusion
> matrix reflects this, the printed model is built on the full training
> data *before* cross-validation is performed (you can turn it off in
> the Explorer in the "More Options" dialog: "Output model"). This model
> is only printed to give the user an idea of what the classifier does
> on the data, it doesn't necessarily reflect the 10 CV models.

> Cheers, Peter
> --
> Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
> http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
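
To make Peter's point concrete, here is a minimal Python sketch. It is not Weka code: the data are synthetic, and `fit_stump` is a hypothetical one-threshold rule standing in for OneR. It only illustrates why a model built on the full data and a cross-validated confusion matrix can disagree: the printed rule is fitted once on all instances, while each CV fold is scored by a separate rule fitted on the other nine folds.

```python
import random

# Synthetic two-class data with one numeric attribute (an assumption made
# for illustration; these are NOT the iris measurements).
rng = random.Random(0)
xs = [round(rng.gauss(1.3, 0.2), 1) for _ in range(50)] + \
     [round(rng.gauss(2.0, 0.27), 1) for _ in range(50)]
ys = [0] * 50 + [1] * 50


def fit_stump(xs, ys):
    """Return the threshold t with the fewest training errors for the rule
    'x < t -> class 0, x >= t -> class 1' (a stand-in for OneR)."""
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum(int(x >= t) != y for x, y in zip(xs, ys))

    return min(candidates, key=errors)


def add_to_confusion(matrix, xs, ys, t):
    """Add predictions of threshold t to matrix[actual][predicted]."""
    for x, y in zip(xs, ys):
        matrix[y][int(x >= t)] += 1


# 1) The model that gets printed: fitted once on ALL the data, and here
#    scored on that same data.
full_threshold = fit_stump(xs, ys)
full_matrix = [[0, 0], [0, 0]]
add_to_confusion(full_matrix, xs, ys, full_threshold)

# 2) 10-fold cross-validation: ten separate models, each scored only on
#    the fold it did not see during fitting.
indices = list(range(len(xs)))
rng.shuffle(indices)
folds = [indices[i::10] for i in range(10)]

cv_matrix = [[0, 0], [0, 0]]
thresholds = []
for fold in folds:
    held_out = set(fold)
    train = [i for i in indices if i not in held_out]
    t = fit_stump([xs[i] for i in train], [ys[i] for i in train])
    thresholds.append(t)
    add_to_confusion(cv_matrix, [xs[i] for i in fold], [ys[i] for i in fold], t)

print("threshold on full data:", full_threshold)
print("thresholds of the 10 CV models:", sorted(set(thresholds)))
print("confusion matrix, full-data model:", full_matrix)
print("confusion matrix, 10-fold CV:", cv_matrix)
```

Both matrices count every instance exactly once, but the CV matrix pools the predictions of ten rules whose thresholds need not match the one fitted on the full data, so the off-diagonal counts can differ.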

*******************************
Thanks, Peter.  Now for a follow-on question.  I modified Fisher's iris dataset to have 4 missing values in each of the four attributes.  I used OneR with 10-fold cross-validation and minBucketSize = 6.  The classifier model says:

Petal Length:
< 2.45 -> setosa
< 4.75 -> versicolor
>= 4.75 -> virginica
? -> virginica

So, what does the last line mean?  Also, just to verify, is it true that OneR automatically replaces all missing values for an attribute with the average value for that attribute?

_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html