String attributes

by Ícaro Medeiros

I have been reading emails and web pages about string attributes in WEKA, but I have not been able to figure out how to use them correctly.

My problem is that I want each instance in the ARFF file to carry a reference to its document and term (this is a text mining application). Obviously this is not information the classifier should handle, so I want to store it in string attributes in the ARFF file and filter those attributes out when building the classifier.

Does anybody have an example of how to create a string Attribute object? Also, how are Instance values set, and how do I use filters to remove the string attributes before passing the instances read from the ARFF file to the classifier?

--
Ícaro Rafael da Silva Medeiros (+351 91 447 0877)
Researcher at INESC-ID Lisbon (www.inesc-id.pt)
MSc Candidate - Federal University of Pernambuco (www.cin.ufpe.br)
Blog: http://kirux.wordpress.com/

Linux User #212604

== Scientia Vincere Tenebras ==

_______________________________________________
Wekalist mailing list
Send posts to: Wekalist@...
List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Re: String attributes

by Peter Reutemann-3

If you want to track instances in your data, then use ID attributes.
See FAQ "How do I use ID attributes?". Link to the FAQs available from
the Weka homepage.

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/           Ph. +64 (7) 858-5174

Re: String attributes

by Ícaro Medeiros

But how can I do this programmatically during Instance creation?

Re: String attributes

by Ícaro Medeiros

And I want a string attribute, not a numeric one.

Regards,

Re: String attributes

by Peter Reutemann-3

> But how can I do this programmatically during Instance creation?

Just add another attribute for storing the ID (and increase your
counter whenever you generate an Instance).

See wiki article "Creating an ARFF file" on how to generate
weka.core.Instances objects on the fly:
  http://weka.wikispaces.com/Creating+an+ARFF+file
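
The suggestion above (one extra attribute for the ID, with a counter that increases for every generated instance) can be sketched as follows. This is an illustration only, not code from the thread or from the wiki article: it uses plain Java without the Weka library and simply writes ARFF text in which the ID, document, and term are declared as string attributes. The class name, the Row fields, the "inst-" ID prefix, and the relation and attribute names are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds ARFF text by hand, no Weka dependency.
// All names here (Row, tfidf, "inst-" prefix, relation name) are hypothetical.
class ArffStringIdSketch {

    /** One example from a text mining task: a document, a term, a score, a class label. */
    static class Row {
        final String document;
        final String term;
        final double tfidf;
        final String label;

        Row(String document, String term, double tfidf, String label) {
            this.document = document;
            this.term = term;
            this.tfidf = tfidf;
            this.label = label;
        }
    }

    /** Wraps a value in single quotes, escaping backslashes and single quotes. */
    static String quote(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    /** Returns ARFF text with three string attributes used only for tracking. */
    static String buildArff(List<Row> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation terms\n\n");
        sb.append("@attribute id string\n");
        sb.append("@attribute document string\n");
        sb.append("@attribute term string\n");
        sb.append("@attribute tfidf numeric\n");
        sb.append("@attribute class {yes,no}\n\n");
        sb.append("@data\n");
        int counter = 0; // increased once per generated instance
        for (Row r : rows) {
            counter++;
            sb.append(quote("inst-" + counter)).append(',')
              .append(quote(r.document)).append(',')
              .append(quote(r.term)).append(',')
              .append(r.tfidf).append(',')
              .append(r.label).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("doc one.txt", "weka", 0.5, "yes"),
            new Row("b.txt", "it's", 1.25, "no"));
        System.out.print(buildArff(rows));
    }
}
```

The three string columns are there for bookkeeping and would have to be removed before training, since most classifiers cannot handle string attributes. With Weka on the classpath, the Remove filter (weka.filters.unsupervised.attribute.Remove) with attribute indices "1-3" is one way to do that. In the Weka 3.6 API, a string attribute is created in code by passing a null FastVector to the Attribute constructor, as in new Attribute("id", (FastVector) null).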

Cheers, Peter

Re: String attributes

by Peter Reutemann-3

> And I want a string attribute, not a numeric one.

Just add a STRING attribute when you're generating the
weka.core.Instances object.

Cheers, Peter