constant number of training instances per class

Hi,
I'd like to split my dataset in such a way that there are N (which is a constant number) training instances for each class, and all the rest are left for testing. Can I do this using the GUI? What is the easiest way to achieve this?
Thanks. Emre
_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
Re: constant number of training instances per class

2009/10/26 Emre Akbas <eakbas2@...>
Hi,

In the Explorer, on the Preprocess tab, there are two options:

(1) RemoveRange filter
Run it twice: the instances up to N are for training, and the instances from N to the end are for testing. Save each result as an ARFF file.

Before doing this, make sure that your dataset is fully order-randomised with respect to class, i.e. that the order of the instances does not give away which class an instance belongs to.

(2) StratifiedRemoveFolds filter
If your fixed constant can be expressed as a percentage cutoff, this filter is the much easier and more directly applicable way. Again run it twice, once with and once without the invert flag, and save both results as above.

This second option also guarantees the rather key property that the two resulting ARFF files have the same prior distribution of instances over classes, which is generally recommended to ensure 'balanced' output of the classes.
--
-----------------
Harri M.T. Saarikoski
M.A, PhD graduate student
Helsinki University
Finland
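
The split asked about in this thread (N training instances per class, everything else for testing) can also be sketched outside the GUI. The following is a minimal plain-Python illustration, not Weka code and not an implementation of the RemoveRange or StratifiedRemoveFolds filters; the function name `split_n_per_class` and the (features, label) pair layout are assumptions made for the example.

```python
import random
from collections import defaultdict


def split_n_per_class(instances, n, seed=1):
    """Return (train, test) with n randomly chosen instances per class in train.

    `instances` is a list of (features, label) pairs (an assumed layout).
    """
    rng = random.Random(seed)           # fixed seed makes the split repeatable
    by_class = defaultdict(list)
    for inst in instances:
        by_class[inst[1]].append(inst)  # group instances by class label

    train, test = [], []
    for label in sorted(by_class):      # sorted for a deterministic class order
        group = by_class[label]
        if len(group) < n:
            raise ValueError(f"class {label!r} has only {len(group)} instances")
        rng.shuffle(group)              # randomise the order within the class
        train.extend(group[:n])         # first n after shuffling go to training
        test.extend(group[n:])          # everything else goes to testing
    return train, test


if __name__ == "__main__":
    # Toy data: 10 instances of class "a" and 6 of class "b".
    data = [((i,), "a") for i in range(10)] + [((i,), "b") for i in range(6)]
    train, test = split_n_per_class(data, n=3)
    print(len(train), len(test))  # prints "6 10"
```

Shuffling inside each class before taking the first N is what keeps the result from depending on the original file order. Unlike a percentage-based stratified split, the training set here has equal class counts, so the test set does not keep the original class proportions.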
Re: constant number of training instances per class

> Hi,

How can we ensure that the dataset is fully order-randomized? Is there a way or option in Weka to ensure this?

> I'd like to split my dataset in such a way
> that there are N (which is a constant number) training
> instances for each class, and all the rest are left for
> testing. Can I do this using the GUI? What is the easiest
> way to achieve this?
>
> using explorer -> preprocess tab ->
>
> (1) RemoveRange filter
> two runs: instances till N are for training, from constant
> till end are for testing
> save both as arff, and so on
>
> be sure that your dataset is fully order-randomised per
> classes before doing this
> i.e. that order of instances does not give away what class
> the instance is

Regards,
Jitendra
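
What "order-randomised" means in this thread can be illustrated in a few lines: shuffle the instances with a fixed seed, then check that the class labels no longer arrive in long blocks. This is a plain-Python sketch, not Weka code; the helper names `shuffle_instances` and `longest_label_run` are hypothetical and chosen only for this example.

```python
import random


def shuffle_instances(instances, seed=42):
    """Return a shuffled copy of `instances`; the input list is left untouched."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded so the order is repeatable
    return shuffled


def longest_label_run(labels):
    """Length of the longest run of identical consecutive labels."""
    longest = current = 0
    previous = object()  # sentinel that equals no real label
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest


if __name__ == "__main__":
    # A file sorted by class: the row position gives the class away.
    labels = ["a"] * 50 + ["b"] * 50
    print("longest run before shuffling:", longest_label_run(labels))
    print("longest run after shuffling: ", longest_label_run(shuffle_instances(labels)))
```

A long run of identical labels suggests the file is still sorted by class, in which case taking "the first N instances" would pick up mostly one class. A short longest run is only a rough sanity check, not proof of randomness.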