Question on Association Rules
Hi Everyone,
Is it correct that association rules, once generated, cannot be saved as a model and used to predict missing attribute values in a test dataset? I searched the list archive for this but found no clue.
_______________________________________________ Wekalist mailing list Send posts to: Wekalist@... List info and subscription status: https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html |
|
|
Re: Question on Association Rules
> Is it correct that any association rules generated cannot be saved as a
> model and used to predict any missing attribute value that I have in testing
> dataset ?

Correct.

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ http://www.cs.waikato.ac.nz/~fracpete/ Ph. +64 (7) 858-5174
|
|
Re: Question on Association Rules
Hi Peter,
Thanks for the confirmation. Is there a roadmap for implementing this feature?
--
Regards, Feris Thia http://pentaho.phi-integration.com
|
|
Re: Question on Association Rules
> Thanks for the confirmation. Is there any roadmap that this feature will be
> implemented ?

I don't know, I'm not the Weka maintainer. But since this feature hasn't been added in the past 10 years, I doubt that it will be added any time soon (unless somebody contributes it).

Cheers, Peter
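Until such a feature exists, a rule read off the associator's output can be applied by hand. The following is a minimal sketch in plain Java, assuming a rule is copied manually into attribute/value pairs. `RuleFiller`, its methods, and the weather-style rule are all made up for illustration and are not part of Weka.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Weka: applies one hand-copied rule
// "antecedent => attribute=value" to a record that has a missing value.
class RuleFiller {

    /** True if every antecedent attribute=value pair is present in the record. */
    static boolean matches(Map<String, String> antecedent, Map<String, String> record) {
        for (Map.Entry<String, String> e : antecedent.entrySet()) {
            if (!e.getValue().equals(record.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Fills the consequent attribute only if it is missing and the antecedent matches. */
    static boolean fill(Map<String, String> antecedent, String attr, String value,
                        Map<String, String> record) {
        if (record.get(attr) == null && matches(antecedent, record)) {
            record.put(attr, value);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up rule: outlook=sunny, humidity=high => play=no
        Map<String, String> antecedent = new LinkedHashMap<>();
        antecedent.put("outlook", "sunny");
        antecedent.put("humidity", "high");

        // Test record in which "play" is missing
        Map<String, String> record = new HashMap<>();
        record.put("outlook", "sunny");
        record.put("humidity", "high");

        boolean filled = fill(antecedent, "play", "no", record);
        System.out.println(filled + " -> " + record.get("play"));
    }
}
```

With several rules, some policy for conflicts (for example, preferring the rule with the highest confidence) would be needed; the sketch leaves that out.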
|
|
Re: locally weighted regression problem
Hello,

2.5 hours ago I sent a mail with questions about LWL, but I have now found the error, so the admin does not need to forward my previous mail (it was > 40 KB and therefore not forwarded automatically).

Now I have another question. I want to learn a model, or more precisely a state prediction, for a particular real physical process with locally weighted regression. I therefore started playing with LWL in Weka, generating a simple data set (20 instances) from a simple quadratic function y = x^2 with a little noise. This works fine: when I generate a test set (50 instances) and predict each instance, I get a nice "approximation" over the data set. Plot 1 (red dots: known data, green dots: predictions) shows the result using all nearest neighbors (the whole data set, lwl.setKNN(0)) and a linear weighting kernel (lwl.setWeightingKernel(LINEAR)). The result makes sense. When using only the nearest 5 dots for local regression (lwl.setKNN(5)), I get plot 2, with more realistic predictions.

The problem is that with a Gaussian weighting kernel (lwl.setWeightingKernel(GAUSS)) the result is independent of the number of nearest neighbors I use. It is always the same result, and it looks very similar to plot 1. So it seems that with a Gaussian weighting kernel it is always as if I had set lwl.setKNN(0). I know that is the default value, but of course I tried to override it with lwl.setKNN(2/5/whatever), with no effect. A bug? I don't think so... any help?

best
Richard
--
Richard Cubek, Dipl.-Ing.(FH) University of Applied Sciences Ravensburg-Weingarten Intelligent Mobile Robotics Laboratory Phone: (0049) (0)751 501 9838 Mobile: (0049) (0)163 88 39 529
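The experiment described above can be reproduced without Weka as a rough sanity check. This is a minimal sketch of one-dimensional locally weighted linear regression in plain Java. It is not Weka's LWL class: `LocalLinearSketch`, `predict`, the random seed, and the noise level are assumptions chosen for illustration, and the linear kernel here is simply `1.0001 - d/h`, clipped at zero, with `h` the distance to the k-th nearest point.

```java
import java.util.Arrays;
import java.util.Random;

// Plain-Java sketch of 1-D locally weighted linear regression.
// NOT Weka's LWL; all names and constants are illustrative assumptions.
class LocalLinearSketch {

    /**
     * Predicts y at query q with a weighted least-squares line.
     * k <= 0 uses all points; otherwise the bandwidth h is the
     * distance from q to its k-th nearest training point.
     */
    static double predict(double[] x, double[] y, double q, int k) {
        int n = x.length;
        double[] d = new double[n];
        for (int i = 0; i < n; i++) {
            d[i] = Math.abs(x[i] - q);
        }
        double[] sorted = d.clone();
        Arrays.sort(sorted);
        double h = (k <= 0 || k > n) ? sorted[n - 1] : sorted[k - 1];
        if (h == 0.0) {
            h = 1.0; // all selected points coincide with the query
        }

        double sw = 0, swx = 0, swy = 0, swxx = 0, swxy = 0;
        for (int i = 0; i < n; i++) {
            // Linear kernel on the scaled distance, clipped at zero.
            double w = Math.max(0.0, 1.0001 - d[i] / h);
            sw += w;
            swx += w * x[i];
            swy += w * y[i];
            swxx += w * x[i] * x[i];
            swxy += w * x[i] * y[i];
        }
        double denom = sw * swxx - swx * swx;
        if (Math.abs(denom) < 1e-12) {
            return swy / sw; // degenerate case: fall back to the weighted mean
        }
        double slope = (sw * swxy - swx * swy) / denom;
        double intercept = (swy - slope * swx) / sw;
        return intercept + slope * q;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // arbitrary seed
        int n = 20;
        double[] x = new double[n];
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = -2.0 + 4.0 * i / (n - 1);
            y[i] = x[i] * x[i] + 0.1 * rnd.nextGaussian(); // y = x^2 plus noise
        }
        for (double q = -2.0; q <= 2.0; q += 1.0) {
            System.out.printf("q=%.1f  all=%.3f  k5=%.3f  true=%.3f%n",
                    q, predict(x, y, q, 0), predict(x, y, q, 5), q * q);
        }
    }
}
```

With all points in play the fit near the vertex is pulled away from the parabola, while a small k follows it more closely, which matches the difference between plot 1 and plot 2 described above.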
|
|
Re: Re: locally weighted regression problem
On 27/10/09 8:35 AM, Richard Cubek wrote:
> The problem is, when using a gaussian weighting kernel
> (lwl.setWeightingKernel(GAUSS)), the result is independent from the
> number of nearest neighbors i use, it's always the same result, it looks
> very similar to plot1. So, it seems, that when using a gaussian
> weighting kernel, it's always as i would have set lwl.setKNN(0). I know,
> it's the default value, but of course, i tried to overwrite it with
> lwl.setKNN(2/5/whatelse), but there is no effect. A bug? I don't think
> so...any help?

setKNN sets the bandwidth for scaling the weighting function. Gaussian weighting is not bounded and all instances receive non-zero weight. The idea is to try and be consistent regarding the set of k neighbours returned and the weighting functions.

It doesn't seem right that there is no way to scale the gaussian (and inverse for that matter as well) weighting functions though. Perhaps in those cases all instances should be weighted, but only the user selected kth instance used for scaling.

Using these kernels will always be more expensive than the others since all the instances need to be used for building the model (unless we try and prune instances with very small relative weights).

Cheers,
Mark.
--
Mark Hall Senior Developer/Consultant, Pentaho Open Source Business Intelligence Citadel International, Suite 340, 5950 Hazeltine National Dr., Orlando, FL 32822, USA +64 7 348-7099 office, +64 21 399-132 mobile, +1 815 550-8637 fax, Skype: mark.andrew.hall, Yahoo: mark_andrew_hall Download the latest release today <http://www.sourceforge.net/projects/pentaho>
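The suggestion above (weight every instance, but scale by the distance to the k-th neighbour, and optionally prune tiny weights) might look roughly like the following plain-Java sketch. It is not Weka code: `ScaledGaussianSketch`, its method names, and the relative pruning threshold passed to `countKept` are assumptions for illustration.

```java
import java.util.Arrays;

// Sketch only: Gaussian weighting scaled by the k-th neighbour distance.
// NOT Weka code; names and the pruning threshold are illustrative assumptions.
class ScaledGaussianSketch {

    /** Gaussian weights exp(-(d/h)^2), with h the distance to the k-th nearest instance. */
    static double[] weights(double[] dist, int k) {
        double[] sorted = dist.clone();
        Arrays.sort(sorted);
        double h = sorted[Math.min(Math.max(k, 1), sorted.length) - 1];
        if (h == 0.0) {
            h = 1.0; // avoid division by zero when the k nearest coincide with the query
        }
        double[] w = new double[dist.length];
        for (int i = 0; i < dist.length; i++) {
            double s = dist[i] / h;
            w[i] = Math.exp(-s * s);
        }
        return w;
    }

    /** Counts instances whose weight is at least minRelative times the largest weight. */
    static int countKept(double[] w, double minRelative) {
        double max = 0.0;
        for (double v : w) {
            max = Math.max(max, v);
        }
        int kept = 0;
        for (double v : w) {
            if (v >= minRelative * max) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] dist = {0.1, 0.2, 0.3, 0.4, 0.5, 1.0};
        for (int k : new int[] {2, 5}) {
            double[] w = weights(dist, k);
            System.out.println("k=" + k + " weights=" + Arrays.toString(w)
                    + " kept(>=1% of max)=" + countKept(w, 0.01));
        }
    }
}
```

Every weight stays strictly positive, so k no longer selects a subset; it only controls how quickly the weights fall off, and pruning is what recovers the cost saving.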
|
|
Re: Re: locally weighted regression problem
Mark Hall wrote:

Well, first of all, before starting "criticizing": I have tried it for only one day now and I'm impressed. I like Weka and I'm sure we will use it :-)

I think I didn't understand setKNN. From the source: "Sets the number of neighbours used for kernel bandwidth setting. The bandwidth is taken as the distance to the kth neighbour." Does 1 neighbour mean 1 instance here? With setKNN(3), what is the kth neighbour? The 3rd? In the end, can I understand the method as "setting the number of nearest instances taken into account for the local regression"?

> setKNN sets the bandwidth for scaling the weighting function. Gaussian
> weighting is not bounded and all instances receive non-zero weight.
> The idea is to try and be consistent regarding the set of k neighbours
> returned and the weighting functions.

Why would I not be consistent when scaling the Gaussian? I'm afraid I didn't understand this block.

> It doesn't seem right that there is no way to scale the gaussian (and
> inverse for that matter as well) weighting functions though.

Indeed. If we talk about locally weighted regression, IMO it doesn't make sense to scale the Gaussian weighting kernel over the whole dataset in every case; it is "globally" weighted regression then, not locally.

> Perhaps in those cases all instances should be weighted, but only the
> user selected kth instance used for scaling.

Hmm, I don't understand. Didn't we say that we are not able to scale when weighting with a Gaussian? Weighting instances in this case IMO doesn't make sense; the weighting, in LWL, should differ for each prediction...

> Using these kernels will always be more expensive than the others
> since all the instances need to be used for building the model (unless
> we try and prune instances with very small relative weights).
>
> Cheers,
> Mark.

Hmm, I still don't see why the Gaussian should not be scalable.

In the end,

    /** The available kernel weighting methods. */
    protected static final int LINEAR = 0;
    protected static final int EPANECHNIKOV = 1;
    protected static final int TRICUBE = 2;
    protected static final int INVERSE = 3;
    protected static final int GAUSS = 4;
    protected static final int CONSTANT = 5;

in LWL should naturally all be public! I can't do

    LWL lwl = new LWL();
    lwl.setWeightingKernel(LWL.LINEAR)

but I would like to :-) (instead of looking in the LWL sources to see which integer stands for which method).

Thanks for the fast answer
best
Richard
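As long as the constants are protected, one possible stopgap is a small holder class in user code that mirrors the integer codes quoted above. This is only a sketch: `LwlKernelCodes` and `nameOf` are hypothetical names, not part of Weka, and the values are copied from the source excerpt in this message, so they would need rechecking against whatever Weka version is actually in use.

```java
// Hypothetical stopgap, NOT part of Weka: mirrors the kernel codes quoted
// from the LWL source so that calling code can avoid magic numbers.
final class LwlKernelCodes {
    static final int LINEAR = 0;
    static final int EPANECHNIKOV = 1;
    static final int TRICUBE = 2;
    static final int INVERSE = 3;
    static final int GAUSS = 4;
    static final int CONSTANT = 5;

    private LwlKernelCodes() {
        // constants holder, no instances
    }

    /** Returns a readable name for a kernel code, or "UNKNOWN" for anything else. */
    static String nameOf(int code) {
        switch (code) {
            case LINEAR: return "LINEAR";
            case EPANECHNIKOV: return "EPANECHNIKOV";
            case TRICUBE: return "TRICUBE";
            case INVERSE: return "INVERSE";
            case GAUSS: return "GAUSS";
            case CONSTANT: return "CONSTANT";
            default: return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        // Intended use would be e.g. lwl.setWeightingKernel(LwlKernelCodes.GAUSS)
        System.out.println(GAUSS + " = " + nameOf(GAUSS));
    }
}
```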
|
|
Re: Re: locally weighted regression problem
On 28/10/09 1:51 AM, Richard Cubek wrote:
> Well, first of all, before starting "criticizing" - i tried it only for
> one day now and I'm impressed - I like weka and I'm sure we will use it :-)
>
> I think, i didn't understand setKNN:
>
> From source: "Sets the number of neighbours used for kernel bandwidth
> setting. The bandwidth is taken as the distance to the kth neighbour."
>
> 1 Neighbour means 1 Instance here? If setting setKNN(3), what is the kth
> neighbour? The 3rd? At the end, i can understand the method as "setting
> the amount of nearest instances taken into account for the local
> regression?".

Yes, neighbour means instance. If k = 3, then the three nearest neighbours (according to the distance function) are returned. Actually, more than three might get returned as ties in distance are counted as one instance.

LWL passes a weighted version of the training data to the base learning algorithm. Each training instance is assigned a weight according to the selected weighting function (which in turn uses the distance of the training instance to the current test instance - hence the "local" part of locally weighted learning). Typically, instances further from the test instance receive a lower weight.

In the case of functions with bounded support, k determines the support, which, in turn, has the effect that some training instances receive zero weight (and can effectively be ignored). The linear kernel is bounded and gives a weight of 1.0001 - distance[i], for training instance i. Since distances are normalized to 0 - 1 range, you can see that this function decreases to zero. Setting a support value based on k, and then scaling the distances by this value, effectively means that the kth closest instance will receive the lowest weight and all instances further away can be ignored (i.e. weights <= 0).

The Gaussian kernel is not bounded (exp(-1 * 1) = 0.3679). So, scaling by the kth nearest distance is not going to have the effect that some instances receive zero weight. All instances will always be used. However, at the moment no scaling is done for the Gaussian, and I should change it to allow the k parameter to scale the Gaussian.

> At the end,
>
> /** The available kernel weighting methods. */
> protected static final int LINEAR = 0;
> protected static final int EPANECHNIKOV = 1;
> protected static final int TRICUBE = 2;
> protected static final int INVERSE = 3;
> protected static final int GAUSS = 4;
> protected static final int CONSTANT = 5;
>
> in LWL should naturally all be public!

Absolutely! Good call.

Cheers,
Mark.
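The bounded-support behaviour of the linear kernel described above can be checked with a few lines of plain Java. This is a sketch under assumptions, not the LWL source: `LinearKernelSupportSketch` and its methods are made-up names, the distances are arbitrary, and ties in distance are not treated specially here.

```java
import java.util.Arrays;

// Sketch of the bounded linear kernel: w = 1.0001 - d/h, where h is the
// distance to the k-th nearest instance. NOT Weka code; names are made up.
class LinearKernelSupportSketch {

    /** Linear-kernel weights on distances scaled by the k-th smallest distance. */
    static double[] linearWeights(double[] dist, int k) {
        double[] sorted = dist.clone();
        Arrays.sort(sorted);
        double h = sorted[Math.min(Math.max(k, 1), sorted.length) - 1];
        if (h == 0.0) {
            h = 1.0; // avoid division by zero
        }
        double[] w = new double[dist.length];
        for (int i = 0; i < dist.length; i++) {
            // Weights <= 0 mean "outside the support", so clip them to zero.
            w[i] = Math.max(0.0, 1.0001 - dist[i] / h);
        }
        return w;
    }

    /** Number of instances that still carry a positive weight. */
    static int countPositive(double[] w) {
        int c = 0;
        for (double v : w) {
            if (v > 0.0) {
                c++;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] dist = {0.1, 0.2, 0.3, 0.4, 0.5, 1.0};
        for (int k : new int[] {3, 5}) {
            double[] w = linearWeights(dist, k);
            System.out.println("k=" + k + " positive=" + countPositive(w)
                    + " weights=" + Arrays.toString(w));
        }
        // For contrast: a Gaussian at the same scaled distance of 1 is far from zero.
        System.out.println("gaussian at scaled distance 1: " + Math.exp(-1.0));
    }
}
```

The k-th neighbour ends up with a weight of about 0.0001 and everything beyond it drops out, whereas the Gaussian at the same scaled distance still gives exp(-1), roughly 0.3679.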
|
|
Re: Re: locally weighted regression problem
> this function decreases to zero. Setting a support value based on k, and
> then scaling the distances by this value, effectively means that the kth
> closest instance will receive the lowest weight and all instances further
> away can be ignored (i.e. weights <= 0). The Gaussian kernel is not bounded
> (exp(-1 * 1) = 0.3679). So, scaling by the kth nearest distance is not going
> to have the effect that some instances receive zero weight. All instances
> will always be used. However, at the moment no scaling is done for the
> Gaussian, and I should change it to allow the k parameter to scale the
> Gaussian.

I reckon what most users would expect to happen is that simply every instance further away than the kth one will be ignored, i.e. not included in the subset being passed on to the base learner; kind of like setting all these instances' weights to zero. That should be straightforward to implement.

And in general the reweighting should probably make sure that the total weight of the subset equals k, the size of the subset, as a lot of learners in Weka work on the assumption that on average an instance weight is 1.0.

Bernhard
---------------------------------------------------------------------
Bernhard Pfahringer, Dept. of Computer Science, University of Waikato http://www.cs.waikato.ac.nz/~bernhard +64 7 838 4041
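Both points (drop everything beyond the k-th neighbour, then rescale so the average kept weight is 1.0) can be sketched in plain Java as follows. This is an illustration under assumptions rather than a patch for LWL: `TruncateAndRescaleSketch` and `truncateAndRescale` are hypothetical names, and instances tied with the k-th distance are kept, so the target total is the number of kept instances.

```java
import java.util.Arrays;

// Sketch: ignore instances beyond the k-th neighbour, then rescale the
// remaining weights so they sum to the subset size. NOT Weka code.
class TruncateAndRescaleSketch {

    /**
     * Zeroes the weight of every instance further away than the k-th nearest one
     * and rescales the kept weights so that their average is 1.0.
     */
    static double[] truncateAndRescale(double[] dist, double[] w, int k) {
        double[] sorted = dist.clone();
        Arrays.sort(sorted);
        double cutoff = sorted[Math.min(Math.max(k, 1), sorted.length) - 1];

        double[] out = new double[w.length];
        double sum = 0.0;
        int kept = 0;
        for (int i = 0; i < w.length; i++) {
            if (dist[i] <= cutoff) { // ties with the k-th distance are kept
                out[i] = w[i];
                sum += w[i];
                kept++;
            }
        }
        if (sum > 0.0) {
            double scale = kept / sum;
            for (int i = 0; i < out.length; i++) {
                out[i] *= scale;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] dist = {0.1, 0.2, 0.3, 0.4};
        double[] w = new double[dist.length];
        for (int i = 0; i < dist.length; i++) {
            w[i] = Math.exp(-dist[i] * dist[i]); // unscaled Gaussian weights
        }
        System.out.println(Arrays.toString(truncateAndRescale(dist, w, 2)));
    }
}
```

The relative ordering of the kept weights is preserved, so nearer instances still count for more; only the overall scale changes.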