Classification Algorithms

by Curtis Jensen-2

Forgive me for asking a general data mining question.
There are probably over 100 classification algorithms built into Weka.
Are they described in one concise place?  I can look up each one
individually, but I'm hoping for a single table that says this one is
good for this type of data etc.

In general how do I decide which ones to try?
For my data, I need to know the final criteria for the classification.
That leaves many of them out.  I'm basically left with the rule and
tree classifiers.  There is still a large number of algorithms in that
set.  Is there a cheat sheet to help me weed out the list further?

Also, do any of the clustering algorithms produce rule sets or decision trees?

Thanks,
Curtis


_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Classification Algorithms

by Peter Reutemann-3

> Forgive me for asking a general data mining question.
> There are probably over 100 classification algorithms built into Weka.
> Are they described in one concise place?

The Data Mining book by Witten & Frank (based on Weka 3.4.x) describes
quite a few of them.

> I can look up each one
> individually, but I'm hoping for a single table that says this one is
> good for this type of data etc.

Not that I know of.

> In general how do I decide which ones to try?

You run experiments in the Experimenter and pick the one that works best.

> For my data, I need to know the final criteria for the classification.
>  That leaves many of them out.  I'm basically left with the rule and
> tree classifiers.

That might rule out classifiers that work well on your data.

> There is still a large number of algorithms in that
> set.  Is there a cheat sheet to help me weed out the list further?

Assuming that you have a nominal class attribute, I can think of the following ones:
  trees.J48, trees.REPTree, rules.PART

But I'd just chuck them all in an Experimenter setup and compare them
on your dataset. Then, pick the best bunch and optimize the
parameters.
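
For illustration only, the kind of comparison the Experimenter automates
can be sketched in plain Python. This is not Weka code: the toy
weather-style dataset and the names train_majority, train_one_rule and
cross_validate are assumptions made up for this sketch, and the two
learners only loosely resemble Weka's ZeroR and OneR.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data: (outlook, windy, class). The last field is the class.
DATA = (
    [("sunny", "yes", "play")] * 7 + [("sunny", "no", "play")] * 7
    + [("rainy", "yes", "stay")] * 3 + [("rainy", "no", "stay")] * 3
)

def train_majority(rows):
    """Baseline: always predict the most frequent class in the training rows."""
    label = Counter(r[-1] for r in rows).most_common(1)[0][0]
    return lambda features: label

def train_one_rule(rows):
    """Keep the single attribute whose value -> class rule makes the fewest training errors."""
    default = Counter(r[-1] for r in rows).most_common(1)[0][0]
    best = None
    for a in range(len(rows[0]) - 1):
        by_value = defaultdict(Counter)
        for r in rows:
            by_value[r[a]][r[-1]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(rule[r[a]] != r[-1] for r in rows)
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    _, attr, rule = best
    # Attribute values never seen in training fall back to the majority class.
    return lambda features: rule.get(features[attr], default)

def cross_validate(trainer, rows, folds=5, seed=1):
    """Return the accuracy over all rows, each predicted by a model that did not train on it."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    correct = 0
    for k in range(folds):
        test = rows[k::folds]
        train = [r for i, r in enumerate(rows) if i % folds != k]
        model = trainer(train)
        correct += sum(model(r[:-1]) == r[-1] for r in test)
    return correct / len(rows)

candidates = {"majority": train_majority, "one_rule": train_one_rule}
results = {name: cross_validate(trainer, DATA) for name, trainer in candidates.items()}
best = max(results, key=results.get)

for name, accuracy in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {accuracy:.2f}")
print("best:", best)
```

On real data you would swap in the actual candidate classifiers, use a
more careful evaluation (for example repeated cross-validation), and only
then tune the parameters of the best few.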

> Also, do any of the clustering algorithms produce rule sets or decision trees?

Cobweb is the only one that produces a tree, if I'm not mistaken.

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174

Re: Classification Algorithms

by Curtis Jensen-2

On Fri, Oct 23, 2009 at 2:42 PM, Peter Reutemann <fracpete@...> wrote:

> [...]
Trial and error.
Got it.
Thanks.

--
Curtis

Re: Classification Algorithms

by NightlordTW

In general, trees and decision rules are interpretable, but often not the most accurate classifiers. If you would like to improve performance, try support vector machines, stochastic modeling, or classifier ensembles. Well-known ensemble methods include random forests, bagging, boosting, and Bayesian model averaging. They combine a set of base classifiers, which can for example be produced by decision rule or decision tree learning.
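
As a hedged sketch of one of these ensemble ideas, bagging can be written in a few lines of plain Python: train the same base learner on several bootstrap samples and let the models vote. The toy data and the names train_stump and train_bagging are illustrative assumptions, not Weka's Bagging implementation.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data: (outlook, windy, class). The last field is the class.
DATA = (
    [("sunny", "yes", "play")] * 7 + [("sunny", "no", "play")] * 7
    + [("rainy", "yes", "stay")] * 3 + [("rainy", "no", "stay")] * 3
)

def train_stump(rows):
    """Interpretable base learner: a rule on the single best attribute."""
    default = Counter(r[-1] for r in rows).most_common(1)[0][0]
    best = None
    for a in range(len(rows[0]) - 1):
        by_value = defaultdict(Counter)
        for r in rows:
            by_value[r[a]][r[-1]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(rule[r[a]] != r[-1] for r in rows)
        if best is None or errors < best[0]:
            best = (errors, a, rule)
    _, attr, rule = best
    return lambda features: rule.get(features[attr], default)

def train_bagging(rows, n_models=11, seed=1):
    """Train n_models stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(rows) rows with replacement.
        sample = [rng.choice(rows) for _ in rows]
        models.append(train_stump(sample))

    def predict(features):
        votes = Counter(model(features) for model in models)
        return votes.most_common(1)[0][0]

    return predict

ensemble = train_bagging(DATA)
print(ensemble(("sunny", "no")), ensemble(("rainy", "yes")))
```

On this noise-free toy data the stumps all agree, so the vote is trivial. The benefit of bagging shows up with unstable base learners on noisy data, at the cost of losing the single readable rule set.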

--
Thomas Debray | Theoretical Epidemiology | Julius Center | Stratenum 6.131 | University Medical Center Utrecht  | P.O.Box 85500  | 3508 GA Utrecht | The Netherlands | www.juliuscenter.nl | www.thomasdebray.be | www.netstorm.be