>> I am using OneR on a version of the weather dataset (outlook/temp/humid/windy/play) which has numeric attributes for temperature and humidity. Is there a way to see the bin ranges that OneR uses to discretize these numeric attributes?
> If OneR chooses a numeric attribute to base its model on (the
> classifier uses only a single attribute!), then the output is as
> follows (UCI dataset "balance-scale"):
> left-weight:
>     < 2.5  -> R
>     >= 2.5 -> L
> For numeric attributes, the generated rule holds the breakpoints (=
> borders between bins). The above example has one breakpoint and
> therefore two bins.
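
To illustrate how breakpoints map to bins, here is a minimal Python sketch (not Weka code). The function names `bin_index` and `bin_labels` are made up for this example, and the label format only approximates what OneR prints. It assumes a value equal to a breakpoint falls into the upper bin, as the ">= 2.5" line in the output suggests.

```python
from bisect import bisect_right


def bin_index(value, breakpoints):
    """Return the 0-based bin that `value` falls into.

    `breakpoints` must be sorted ascending. A value equal to a
    breakpoint goes to the upper bin (the ">=" side).
    """
    return bisect_right(breakpoints, value)


def bin_labels(breakpoints):
    """Build one human-readable label per bin (k breakpoints -> k + 1 bins)."""
    if not breakpoints:
        return ["all values"]
    labels = [f"< {breakpoints[0]}"]
    for lo, hi in zip(breakpoints, breakpoints[1:]):
        labels.append(f">= {lo} and < {hi}")
    labels.append(f">= {breakpoints[-1]}")
    return labels


# The balance-scale rule quoted above has a single breakpoint at 2.5.
breakpoints = [2.5]
labels = bin_labels(breakpoints)
for weight in (1, 2, 3, 5):
    print(weight, "->", labels[bin_index(weight, breakpoints)])
```

With one breakpoint the sketch yields two labels, matching the "one breakpoint and therefore two bins" statement.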
Thank you, Peter. I believe I am asking a different question. It seems that in order for OneR to select the single attribute, it must first discretize the numeric attributes into nominal ones. I'm assuming that it does this using the scheme outlined in the paper by Nevill-Manning, Holmes and Witten. I know that I can specify the minBucketSize in the GenericObjectEditor, which OneR will then use when discretizing the values.

What I would like to see are the bins (or perhaps the term is "buckets") that OneR put the numeric values into in order to select the single attribute. The goal is to compare these with the nominal values in the all-nominal version of the weather dataset, to see how the attribute values for temperature and humidity differ. I would then like to compare the result with equal-width and equal-frequency discretization.
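
The comparison described above could be sketched outside Weka roughly as follows. This is a simplified Python approximation of 1R-style bucketing, not Weka's actual OneR code, so its breakpoints may differ from what Weka computes internally. The function names are hypothetical, `min_bucket_size` merely mirrors the minBucketSize option, and the data values are transcribed from the commonly distributed numeric weather dataset and should be checked against your own copy.

```python
from collections import Counter

# Numeric weather data as commonly distributed (verify against your file).
temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
humidity = [85, 90, 86, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 91]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def one_r_breakpoints(values, classes, min_bucket_size):
    """Simplified 1R-style supervised discretization (an approximation)."""
    pairs = sorted(zip(values, classes))
    n = len(pairs)
    buckets = []  # (majority class, exclusive end index into pairs)
    i = 0
    while i < n:
        counts = Counter()
        # Grow the bucket until some class reaches min_bucket_size.
        while i < n and (not counts or max(counts.values()) < min_bucket_size):
            counts[pairs[i][1]] += 1
            i += 1
        majority = counts.most_common(1)[0][0]
        # Keep absorbing instances that share the majority class.
        while i < n and pairs[i][1] == majority:
            i += 1
        # Never place a boundary between two identical attribute values.
        while i < n and pairs[i][0] == pairs[i - 1][0]:
            i += 1
        buckets.append((majority, i))
    # Adjacent buckets with the same majority class are merged, so a
    # breakpoint is emitted only where the majority class changes.
    cuts = []
    for (cls_a, end_a), (cls_b, _) in zip(buckets, buckets[1:]):
        if cls_a != cls_b:
            cuts.append((pairs[end_a - 1][0] + pairs[end_a][0]) / 2)
    return cuts


def equal_width_breakpoints(values, bins):
    """Split the range [min, max] into `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + k * width for k in range(1, bins)]


def equal_frequency_breakpoints(values, bins):
    """Cut so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = set()
    for k in range(1, bins):
        idx = k * n // bins  # index of the first value in the next bin
        cuts.add((ordered[idx - 1] + ordered[idx]) / 2)
    return sorted(cuts)


print("1R-style, temperature:", one_r_breakpoints(temperature, play, 3))
print("1R-style, humidity:   ", one_r_breakpoints(humidity, play, 3))
print("equal-width, temp:    ", equal_width_breakpoints(temperature, 3))
print("equal-frequency, temp:", equal_frequency_breakpoints(temperature, 3))
```

The sketch uses a bucket size of 3 only to keep the toy example readable; any value can be passed in. Printing the three sets of breakpoints side by side gives a direct way to compare supervised bucketing with the two unsupervised schemes.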
Thanks again,
Mark Polczynski