Hi there,
On Mon, Jun 29, 2009 at 11:32 AM, Quanzhi LIU <vickyquanzhi@...> wrote:
> Can anyone help?
I've attached a short explanation of the GSP algorithm (it is now
archived for future reference).
Hope this helps a bit.
Best regards,
Sebastian
--------------------
Consider the following minimalistic data set:
-----
@relation sequential_test_set
@attribute day {1, 2, 3}
@attribute 'power consumption' {power=base, power=peak}
@attribute 'wind speed' {wind=calm, wind=breeze}
@data
1,power=base,wind=calm
1,power=peak,wind=breeze
2,power=base,wind=calm
2,power=peak,wind=breeze
3,power=base,wind=calm
3,power=peak,wind=breeze
----
There are 3 subsets (one per day), each describing the power
consumption and wind speed over that day; e.g. the first line could be
interpreted as "on day one the power consumption corresponded to a
base load level and the wind speed was calm". The algorithm extracts
these three subsets as data sequences, deletes the attribute
identifying the subsets (in this case "day"), and extracts all
sequences that meet a previously set minimum support threshold. The
output looks like the following (the number in parentheses is the
support count of each sequence):
---------------------------------------------------------------------------------------------------------
- Number of cycles performed: 4
- Total number of frequent sequences: 15
- Algorithm started at: 2007-04-23, 18:25:48
- Algorithm ended at: 2007-04-23, 19:27:50
- Frequent Sequences Details (filtered)
-----------------------------------------------------------------
- 1-sequences ------------------------------------------
[1] <{power=base}> (3)
[2] <{power=peak}> (3)
[3] <{wind=calm}> (3)
[4] <{wind=breeze}> (3)
- 2-sequences ------------------------------------------
[1] <{power=base}{power=peak}> (3)
[2] <{power=base}{wind=breeze}> (3)
[3] <{power=base,wind=calm}> (3)
[4] <{wind=calm}{power=peak}> (3)
[5] <{wind=calm}{wind=breeze}> (3)
[6] <{power=peak,wind=breeze}> (3)
- 3-sequences ------------------------------------------
[1] <{power=base}{power=peak,wind=breeze}> (3)
[2] <{power=base,wind=calm}{power=peak}> (3)
[3] <{power=base,wind=calm}{wind=breeze}> (3)
[4] <{wind=calm}{power=peak,wind=breeze}> (3)
- 4-sequences ------------------------------------------
[1] <{power=base,wind=calm}{power=peak,wind=breeze}> (3)
---------------------------------------------------------------------------------------------------------
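To make the counts in the listing above easier to check, here is a minimal Python sketch. It is not Weka's implementation: it groups the rows by "day", drops that attribute, and then simply enumerates every possible candidate by brute force instead of using GSP's level-wise candidate generation. All function names are made up for this illustration, and the threshold of 3 is an assumption chosen to match the support counts shown.

```python
from itertools import combinations, product

# Rows of the example data set: (day, power item, wind item).
ROWS = [
    (1, "power=base", "wind=calm"),
    (1, "power=peak", "wind=breeze"),
    (2, "power=base", "wind=calm"),
    (2, "power=peak", "wind=breeze"),
    (3, "power=base", "wind=calm"),
    (3, "power=peak", "wind=breeze"),
]


def build_data_sequences(rows, seq_id_index=0):
    """Group rows by the sequence ID attribute and drop that attribute."""
    sequences = {}
    for row in rows:
        key = row[seq_id_index]
        element = frozenset(v for i, v in enumerate(row) if i != seq_id_index)
        sequences.setdefault(key, []).append(element)
    return list(sequences.values())


def contains(data_sequence, candidate):
    """True if the candidate's elements occur in order, each as a subset
    of a distinct element of the data sequence (no time constraints)."""
    pos = 0
    for element in candidate:
        while pos < len(data_sequence) and not element <= data_sequence[pos]:
            pos += 1
        if pos == len(data_sequence):
            return False
        pos += 1
    return True


def support(data_sequences, candidate):
    """Number of data sequences that contain the candidate."""
    return sum(contains(seq, candidate) for seq in data_sequences)


def nonempty_subsets(items):
    items = sorted(items)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


data = build_data_sequences(ROWS)
all_items = set().union(*(element for seq in data for element in seq))
subsets = nonempty_subsets(all_items)
max_elements = max(len(seq) for seq in data)
min_count = 3  # assumed threshold: a sequence must occur on all three days

# Brute-force enumeration, only feasible for a toy data set like this one.
frequent = [cand
            for n in range(1, max_elements + 1)
            for cand in product(subsets, repeat=n)
            if support(data, cand) >= min_count]

print(len(frequent))
```

On this data set the sketch finds 15 frequent sequences, the same total as reported above. It also shows why something like <{power=base,wind=breeze}> is missing from the output: those two items never occur in the same row.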
In general, a sequence looks like
<{event1,event2}{event3}{event4,event5}>
where each pair of curly braces delimits one element (itemset) of the
sequence.
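As the listing shows, the k in "k-sequences" counts items, not elements: <{power=base}{power=peak,wind=breeze}> has two elements but is listed as a 3-sequence. A small Python sketch of this notation follows; the list-of-lists representation and the function names are assumptions made for illustration only, not Weka's internal data structure.

```python
def format_sequence(sequence):
    """Render a sequence (a list of elements, each a list of events)
    in the <{...}{...}> notation."""
    return "<" + "".join("{" + ",".join(element) + "}" for element in sequence) + ">"


def sequence_length(sequence):
    """The k of a k-sequence: the total number of events over all elements."""
    return sum(len(element) for element in sequence)


example = [["event1", "event2"], ["event3"], ["event4", "event5"]]
print(format_sequence(example))   # <{event1,event2}{event3}{event4,event5}>
print(sequence_length(example))   # 5
```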
At the moment, there are 3 options available:
---
dataSeqID -- The attribute number representing the data sequence ID
(in the example "day").
filterAttributes -- The attribute numbers (e.g. "0, 1") used for
result filtering. Only sequences containing the specified attributes
in each of their elements/itemsets will be output. -1 prints all.
minSupport -- Minimum support threshold.
---
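One possible reading of the filterAttributes description can be sketched in Python as follows. This is only an illustration under assumptions: it identifies attributes by the name in front of "=" rather than by attribute number, the function names are hypothetical, and Weka's actual filtering may differ in detail.

```python
def attribute_of(item):
    # Assumes items are written as "attribute=value", as in the output above.
    return item.split("=", 1)[0]


def passes_filter(sequence, required_attributes):
    """Keep a sequence only if every element contains an item of each
    required attribute. An empty set keeps everything (like -1)."""
    return all(
        set(required_attributes) <= {attribute_of(item) for item in element}
        for element in sequence
    )


seq = [["power=base", "wind=calm"], ["power=peak"]]
print(passes_filter(seq, {"power"}))          # True
print(passes_filter(seq, {"power", "wind"}))  # False, second element has no wind item
print(passes_filter(seq, set()))              # True
```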