I applied OneR to Fisher’s iris dataset with 10-fold cross-validation and minBucketSize = 3. OneR chose petal width as the attribute, with splits at 0.8 and 1.65. The output below the rule says (144/150 instances correct). When I look at the dataset, this checks out, with 4 virginica classified as versicolor and 4 versicolor classified as virginica. But the confusion matrix says that 6 virginica were classified as versicolor and 6 versicolor as virginica.
When I repeat this with minBucketSize = 6, the results are the same, except that the confusion matrix now says 2 virginica were classified as versicolor and 7 versicolor as virginica.
Why might this be? I’m using Weka 3.7.
Thanks,
Mark Polczynski