Attribute Filter by a simple regular expression


by Nivaldo Vasconcelos-2

Hi,

I need to remove a set of attributes from a dataset, selecting them by name.
Is there a filter (or some other resource) that removes attributes based on regular expressions?
E.g., remove all attributes whose names begin with 'X1_01a'.

I know it is possible to do this manually, using string handling plus methods from the Instances class, but maybe something like this already exists.

Best regards,
--
Nivaldo Vasconcelos
http://jeitosdeveravida.blogspot.com

_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: Attribute Filter by a simple regular expression

by Peter Reutemann-3

> I need to remove a set of attributes from a dataset, selecting them by
> name.
> Is there a filter (or some other resource) that removes attributes based on
> regular expressions?
> E.g., remove all attributes whose names begin with 'X1_01a'.
>
> I know it is possible to do this manually, using string handling plus
> methods from the Instances class, but maybe something like this already
> exists.

I've added a new filter to the developer version:
  weka.filters.unsupervised.attribute.RemoveByName

This filter removes attributes based on a regular expression matched
against their names. Inverting the matching sense is possible as well.
You could then use the following regular expression with this filter
to remove attributes whose names start with "X1_01a":
  ^X1_01a.*
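
To see which names such an expression would select, here is a minimal sketch in plain Python. It is not Weka code and does not call the RemoveByName filter; the function name `remove_by_name`, the `invert` flag, and the attribute names are made up for illustration, and matching against the whole attribute name is an assumption of this sketch.

```python
import re

def remove_by_name(names, expression, invert=False):
    """Return the attribute names that survive the filter.

    A name is dropped when the whole name matches `expression`.
    With invert=True the sense flips: only matching names are kept.
    (Whole-name matching is an assumption of this sketch.)
    """
    pattern = re.compile(expression)
    kept = []
    for name in names:
        matched = pattern.fullmatch(name) is not None
        # Keep non-matching names normally, matching names when inverted.
        if matched == invert:
            kept.append(name)
    return kept

# Hypothetical attribute names, made up for this example.
attributes = ["X1_01a_mean", "X1_01a_std", "X1_01b_mean", "X2_01a_mean", "class"]

print(remove_by_name(attributes, r"^X1_01a.*"))
print(remove_by_name(attributes, r"^X1_01a.*", invert=True))
```

The first call drops the two names that begin with X1_01a; the second call, with the matching sense inverted, keeps only those two.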

Read the FAQ "How do I get the latest bugfixes?" if you want to get this
new functionality. A link to the FAQs is available from the Weka homepage.

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174

Re: Attribute Filter by a simple regular expression

by Nivaldo Vasconcelos-2

OK.
Thank you.

BR,
Nivaldo

On Sun, Aug 9, 2009 at 12:56 AM, Peter Reutemann<fracpete@...> wrote:

> I've added a new filter to the developer version:
>  weka.filters.unsupervised.attribute.RemoveByName

decision tree support updateable function or not

by tgh

Hi,
   I have a question about J48: does it support the updateable interface or
not? Can the classifier be trained incrementally?

Thanks

Re: decision tree support updateable function or not

by Peter Reutemann-3

>   I have a question about J48: does it support the updateable interface or
> not?

No, which is quite easy to tell, since J48 doesn't implement the
weka.classifiers.UpdateableClassifier interface.

> Can the classifier be trained incrementally?

I don't think so, as J48 uses information gain (if I'm not mistaken) and
therefore needs to have access to all the data beforehand to determine
which attributes can be used for splitting.
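
To illustrate why a gain-based split needs the full data, here is a minimal sketch in plain Python that computes information gain for one nominal attribute. It is not Weka's or J48's code, the function names and the toy weather-style values are made up for illustration, and the exact criterion J48 applies is not shown here.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data, made up for this example.
outlook = ["sunny", "sunny", "rain", "rain", "sunny", "sunny", "rain", "rain"]
play = ["no", "no", "yes", "yes", "yes", "yes", "no", "no"]

print(information_gain(outlook[:4], play[:4]))  # gain on the first four rows only
print(information_gain(outlook, play))          # gain on all eight rows
```

On the first four rows the attribute separates the classes perfectly, while on all eight rows it carries no information at all, so a split chosen from a partial view of the data can turn out to be a poor one.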

If you need trees that can handle millions of rows and are
incremental, then have a look at MOA:
  http://www.cs.waikato.ac.nz/~abifet/MOA/

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174

RE: decision tree support updateable function or not

by Andrew Purtell

Recently, on a project, I extracted some of the MOA classifiers into a set of Weka UpdateableClassifiers. I have dropped them into an S3 bucket in case someone finds them useful: http://iridiant.s3.amazonaws.com/weka.classifiers.hoeffding.tar.bz2

Best regards,

   - Andy



From: Peter Reutemann
To: Weka machine learning workbench list.
Sent: Sun, October 11, 2009 2:40:23 PM
Subject: Re: [Wekalist] decision tree support updateable function or not


Re: decision tree support updateable function or not

by Peter Reutemann-3

> Recently on a project I extracted some of the MOA classifiers into a set of
> Weka UpdateableClassifiers. I have dropped it into an S3 bucket if someone
> might find it useful:
> http://iridiant.s3.amazonaws.com/weka.classifiers.hoeffding.tar.bz2

The latest release of MOA already allows you to use the MOA
classifiers within WEKA (via the MOA meta-classifier) and the Weka
classifiers within MOA.

http://www.cs.waikato.ac.nz/~abifet/MOA/

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174

some variables dropped out, could not be fully loaded

by tgh

Hi,
        I use Weka on Windows. I generated a data set on Linux and
saved it as a CSV file. The file has 342 columns, that is, the data set
has 342 variables, and it contains 75 instances.
        On Windows, however, Excel could not load all the variables: it
loaded only 256 columns. Weka on Windows also loaded only 256 variables
and dropped the rest.

How can I get the dropped variables back on Windows? What should I set in
Weka?
Could you help me?
Thank you



Re: some variables dropped out, could not be fully loaded

by Peter Reutemann-3

>        I use Weka on Windows. I generated a data set on Linux and
> saved it as a CSV file. The file has 342 columns, that is, the data set
> has 342 variables, and it contains 75 instances.
>        On Windows, however, Excel could not load all the variables: it
> loaded only 256 columns. Weka on Windows also loaded only 256 variables
> and dropped the rest.
>
> How can I get the dropped variables back on Windows? What should I set in
> Weka?

Weka does *not* have a limit on the number of columns (which version of
Weka are you using, anyway?). Something else must have gone wrong there.
Maybe you accidentally saved the CSV file with Excel and that deleted all
the other columns?

BTW OpenOffice 3 handles up to 1024 columns:
  http://en.wikipedia.org/wiki/OpenOffice.org_Calc#Specifications
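
One way to check whether the file itself was truncated is to count the fields per row outside of any spreadsheet. The following is a minimal sketch in plain Python, not a Weka feature; the function name `column_counts` and the generated sample data are made up for illustration, and a comma delimiter is assumed.

```python
import csv
import io
from collections import Counter

def column_counts(lines, delimiter=","):
    """Count how many rows have each number of fields (blank rows are skipped)."""
    reader = csv.reader(lines, delimiter=delimiter)
    return Counter(len(row) for row in reader if row)

# Stand-in for the real file: 1 header row plus 75 data rows of 342 columns.
header = ",".join(f"v{i}" for i in range(342))
row = ",".join("0" for _ in range(342))
sample = io.StringIO("\n".join([header] + [row] * 75) + "\n")

print(column_counts(sample))  # Counter({342: 76})
```

For a real file, pass an open file object instead of the generated sample, for example `with open("data.csv", newline="") as f: print(column_counts(f))`, where `data.csv` is a placeholder name. If every row reports 256 fields, the columns were already lost when the file was saved.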

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174
