Multiple data sets to Stacking classifier?


Multiple data sets to Stacking classifier?

by Hollis Wright


Is it possible to define a Stacking classifier with input classifiers that have been trained on distinct data sets? I’m basically trying to set up a stack of multiple Bayesian networks that have each been trained on a data set from a different data source, but it seems that the Stacking interface (at least in the KnowledgeFlow) only permits a single dataSet to be input to all of the classifiers; apparently I can’t just generate them separately and hook them into a single Stacking object. Is there a way to do this?

Hollis Wright, MS
PhD Candidate, DMICE
Oregon Health and Science University

_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Multiple data sets to Stacking classifier?

by Peter Reutemann-3


> Is it possible to define a Stacking classifier with input classifiers that
> have been trained on distinct data sets? I’m basically trying to set up a
> stack of multiple Bayesian networks that have each been trained on a data
> set from a different data source, but it seems that the Stacking interface
> (at least in the KnowledgeFlow) only permits a single dataSet to be input to
> all of the classifiers; I can’t just generate them separately and hook them
> into a single Stacking object, apparently.  Is there a way to do this?

No, each classifier in Weka (no matter whether simple or meta) takes
exactly *one* dataset as input. Using the API, you can fake the
training phase for some meta-classifiers (e.g., Vote). But this
doesn't work for Stacking, as it generates a meta-level dataset during
training.
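
To make the last point concrete, here is a rough sketch in plain Java
(no Weka classes) of what one row of a meta-level dataset looks like.
It is an illustration under simplifying assumptions, not Weka's
Stacking code: MetaLevelSketch, BaseModel and metaLevelRow are made-up
names, and the internal cross-validation that Stacking uses to produce
such rows is left out. Each row holds the base models' class
distributions for one training instance plus that instance's true
class, so the rows can only be generated if every base model is
applied to the same training instances.

```java
import java.util.Arrays;

final class MetaLevelSketch {

    /** Hypothetical stand-in for a trained base classifier (not a Weka type). */
    interface BaseModel {
        double[] distributionFor(double[] features);
    }

    /** One meta-level row: all base distributions concatenated, then the true class index. */
    static double[] metaLevelRow(BaseModel[] models, double[] features,
                                 int trueClass, int numClasses) {
        double[] row = new double[models.length * numClasses + 1];
        for (int m = 0; m < models.length; m++) {
            double[] dist = models[m].distributionFor(features);
            System.arraycopy(dist, 0, row, m * numClasses, numClasses);
        }
        row[row.length - 1] = trueClass;
        return row;
    }

    public static void main(String[] args) {
        // Two toy base models that return fixed class distributions.
        BaseModel a = f -> new double[] {0.9, 0.1};
        BaseModel b = f -> new double[] {0.4, 0.6};
        double[] row = metaLevelRow(new BaseModel[] {a, b}, new double[] {1.0, 2.0}, 0, 2);
        System.out.println(Arrays.toString(row)); // prints [0.9, 0.1, 0.4, 0.6, 0.0]
    }
}
```

The meta classifier is then trained on rows like this one, which is
why base models built on separate datasets cannot simply be plugged in
afterwards.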

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174



Re: Multiple data sets to Stacking classifier?

by Hollis Wright


Ok, so something like:

while (i < k)
{
    classifier.train(dataset[i]);
    arrayOfClassifiers[i] = classifier;
    ++i;
}
votingClassifier.train(arrayOfClassifiers);

should work? If so I think that should do what I need. Thanks...

Hollis Wright, MS
PhD Candidate, DMICE
Oregon Health and Science University

On 7/2/09 1:45 PM, "Peter Reutemann" <fracpete@...> wrote:

> No, each classifier in Weka (no matter whether simple or meta) takes
> exactly *one* dataset as input. Using the API, you can fake the
> training phase for some meta-classifiers (e.g., Vote). But this
> doesn't work for Stacking, as it generates a meta-level dataset during
> training.




Re: Multiple data sets to Stacking classifier?

by Peter Reutemann-3


> Ok, so something like:
>
> while (i < k)
> {
>     classifier.train(dataset[i]);
>     arrayOfClassifiers[i] = classifier;
>     ++i;
> }
> votingClassifier.train(arrayOfClassifiers);
>
> should work? If so I think that should do what I need. Thanks...

Yes, something like that.

import weka.core.Instances;
import weka.classifiers.Classifier;
import weka.classifiers.bayes.BayesNet;
import weka.classifiers.meta.Vote;

Instances[] datasets = ... // from somewhere
// test whether datasets are all compatible
// NB: you're not allowed to violate Weka's underlying assumption,
// that all classifiers got trained on the same data. Hence the
// structure of the datasets must be exactly the same. The data
// itself can differ though.
for (int i = 1; i < datasets.length; i++) {
  if (!datasets[0].equalHeaders(datasets[i]))
    throw new IllegalStateException("Training sets not compatible!");
}

// train classifiers
Classifier[] classifiers = new Classifier[datasets.length];
for (int i = 0; i < datasets.length; i++) {
  classifiers[i] = new BayesNet();
  classifiers[i].buildClassifier(datasets[i]);
}

// setup Vote
Vote vote = new Vote();
vote.setClassifiers(classifiers);

// output predictions on test set
Instances test = ... // from somewhere
if (!datasets[0].equalHeaders(test))
  throw new IllegalStateException("Test set not compatible!");
for (int i = 0; i < test.numInstances(); i++) {
  double pred = vote.classifyInstance(test.instance(i));
  System.out.println((i+1) + ". " + pred);
}

NB: this code has never been compiled, but was written from memory.
See wiki article "Use Weka in your Java code" for how to use the Weka
API:
  http://weka.wiki.sourceforge.net/Use+Weka+in+your+Java+code
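
For readers without Weka at hand, the idea behind the snippet above
can also be tried out in plain Java. The following is only a sketch
under assumptions: VoteSketch, TrainedModel and averageDistributions
are names invented for this example and are not part of the Weka API,
and averaging the class probabilities is used as the combination rule
(assumed here to correspond to Vote's default). It shows why no joint
training step is needed: combining the models only requires that each
one can produce a class distribution for the same instance.

```java
final class VoteSketch {

    /** Hypothetical stand-in for a classifier already trained on its own dataset. */
    interface TrainedModel {
        double[] distributionFor(double[] features);
    }

    /** Averages the class distributions of all models for one instance. */
    static double[] averageDistributions(TrainedModel[] models, double[] features,
                                         int numClasses) {
        double[] avg = new double[numClasses];
        for (TrainedModel model : models) {
            double[] dist = model.distributionFor(features);
            for (int c = 0; c < numClasses; c++) {
                avg[c] += dist[c] / models.length;
            }
        }
        return avg;
    }

    /** Index of the largest entry, i.e. the predicted class. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Three toy models standing in for classifiers built on different datasets.
        TrainedModel[] models = {
            f -> new double[] {0.75, 0.25},
            f -> new double[] {0.25, 0.75},
            f -> new double[] {0.0, 1.0}
        };
        double[] avg = averageDistributions(models, new double[] {1.0}, 2);
        System.out.println("predicted class index: " + argMax(avg)); // prints 1
    }
}
```

As with the Weka version, this only makes sense if all models agree
on the number and order of the class labels, which is what the header
check above is meant to guarantee.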

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
