Announcing statistical inference package 'stats'

Dear all,

I have just committed the new package 'stats' to CVS, together with help files in English and Spanish and a test file. This is the list of included procedures, which should grow in the future:

* mean_test
* dif_means_test
* variance_test
* variance_ratio_test
* sign_test
* signed_rank_test
* rank_sum_test (or Wilcoxon-Mann-Whitney test)
* shapiro_wilk_test (to check for normality)
* simple_linear_reg

This package also defines the Maxima object 'inference_result', which stores the results of all the computations, although only a subset of them (the most commonly needed) is displayed by default. See the help files to learn how to obtain the complete set of results.

I have been inspired to some extent by the R statistical package: the idea of the 'inference_result' was taken from R, and some purely numerical algorithms are the same as those used by R (translated from C or Fortran to Lisp; in fact, some C code in R is itself a translation of earlier algorithms written in Fortran). I have not implemented the R 'frame' concept, but the door is open to do so in the future. Samples can be stored in Maxima lists or matrices and given as arguments to the functions of package 'stats'.

I'm not sure I made the correct decisions while writing this package; I'm open to comments (including comments about the help file, since I'm not a good English writer).

Tests passed in clisp, cmucl and sbcl.

Files are in http://maxima.cvs.sourceforge.net/maxima/maxima/share/contrib/stats/

Best wishes
Mario

--
Mario Rodriguez Riotorto
www.biomates.net
_______________________________________________
Maxima mailing list
Maxima@...
http://www.math.utexas.edu/mailman/listinfo/maxima
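A minimal usage sketch for the package announced above. It assumes a Maxima release in which the package loads as "stats" and the test functions carry the test_* names discussed in the replies (test_mean rather than mean_test); the sample data and option values are made up for illustration.

```maxima
/* Assumption: the package is on Maxima's load path as "stats". */
load("stats")$

/* Illustrative sample stored in a plain Maxima list. */
data : [78, 64, 35, 45, 45, 75, 43, 74, 42, 42]$

/* One-sample test on the mean; only the most commonly needed
   results are displayed. Assumed name: test_mean (mean_test in
   the announcement). */
z : test_mean(data, 'conflevel = 0.9, 'alternative = 'less, 'mean = 50);

/* List every item stored in the inference_result object, then
   extract a single one by name. */
items_inference(z);
take_inference('p_value, z);
```

The last two calls are how the complete set of results, not just the displayed subset, can be reached.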
Re: Announcing statistical inference package 'stats'

Mario,

Thanks a lot for writing the stats package. I appreciate your dedication to the project. As the major aspects of the package are very good, I'll restrict myself to some minor quibbling.

(1) I don't think it's appropriate to modify global variables by loading the stats package. This can lead to bad surprises, e.g. interactions with other packages, or unexpected results.

(2) About numer in particular, numer : true defeats one of Maxima's major features. We really shouldn't discourage people from exploiting Maxima's capability to do exact integer and rational arithmetic.

If some functions in the stats package need to convert non-floats to floats, then (LET (($NUMER T)) ..) or block([numer : true], ...) is the way to go.

(3) I think it is a good idea to present the results in an inference_result object. I like the way the results are presented in a nice format by a display function.

A possibility here is to use the existing (though not quite finished) defstruct code to construct the inference_result objects. Then the methods for accessing fields within a structure don't need to be duplicated.

(4) I think the written documentation is very good; every share package should have such nice documentation. I'll make some minor revisions to the texinfo file.

(5) I recommend renaming shapiro_wilk_test --> test_normality and making shapiro_wilk an option (since there are other normality tests)

(6) I recommend renaming dif_means_test --> means_difference_test or means_diff_test

(7) I recommend renaming simple_linear_reg --> linear_regression or simple_linear_regression

(8) (MAYBE) Test functions could be renamed in big-endian style, to give these similar functions names which are more similar. It's a minor point.

    mean_test             --> test_mean
    means_difference_test --> test_means_difference
    variance_test         --> test_variance
    variance_ratio_test   --> test_variance_ratio
    sign_test             --> test_sign
    signed_rank_test      --> test_signed_rank
    normality_test        --> test_normality

Thanks again, & all the best,
Robert
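Point (2) can be sketched as follows: the float conversion is bound locally inside a function, so the global numer keeps its default value and exact arithmetic remains available at top level. The name float_mean is hypothetical and is not part of the stats package.

```maxima
/* Hypothetical helper, not taken from the stats package. */
float_mean(xs) :=
  block([numer : true],          /* numer is true only inside this block */
    apply("+", xs) / length(xs))$

float_mean([1, 2, 4]);   /* evaluated with numer : true */
numer;                   /* global setting is restored after the call */
1/3 + 1/6;               /* top-level arithmetic stays exact */
```

Because the binding lives in the block's local variable list, it is undone when the block exits, which avoids the interactions with other packages mentioned in point (1).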
Re: Announcing statistical inference package 'stats'

Hello Robert,

> (1) I don't think it's appropriate to modify global variables by
> loading the stats package. This can lead to bad surprises,
> e.g. interactions with other packages, or unexpected results.
>
> (2) About numer in particular, numer : true defeats one of Maxima's
> major features. We really shouldn't discourage people from
> exploiting Maxima's capability to do exact integer and rational
> arithmetic.
>
> If some functions in the stats package need to convert non-floats
> to floats, then (LET (($NUMER T)) ..) or block([numer : true], ...)
> is the way to go.

I understand objection (1). But I think that a typical user of this package will be mostly interested in floating point results; if these are given in rational form, the user is obliged to write '%,numer' most of the time. Also, nobody needs a p-value with sixteen digits, which is why I restrict fpprintprec to 7. Moreover, with the global variables 'numer' and 'fpprintprec' set to their default values, the displayed inference_result object is very ugly.

I propose a third alternative: let's define two new global variables, 'stats_numer' (default true) and 'stats_fpprint' (default 7), and not change the other two globally.

> (3) I think it is a good idea to present the results in an
> inference_result object. I like the way the results are presented
> in a nice format by a display function.

I like it too, but the original idea of porting this from R to Maxima is not mine ;)

> A possibility here is to use the existing (though not quite
> finished) defstruct code to construct the inference_result objects.
> Then the methods for accessing fields within a structure don't
> need to be duplicated.

A side note not related to the stats package: months ago, I was studying how to use 'defstruct' in the distrib package, to make it similar to the Mathematica style of defining distributions. For example, the idea was to write something like

    cdf(1/2, normal_distribution(0,1));

instead of

    cdf_normal(1/2,0,1);

but I wasn't sure about the benefits of this syntax, and gave up.

> (4) I think the written documentation is very good; every share package
> should have such nice documentation. I'll make some minor revisions
> to the texinfo file.

Please do.

> (5) I recommend renaming shapiro_wilk_test --> test_normality and
> making shapiro_wilk an option (since there are other normality
> tests)
>
> (6) I recommend renaming dif_means_test --> means_difference_test
> or means_diff_test
>
> (7) I recommend renaming simple_linear_reg --> linear_regression
> or simple_linear_regression
>
> (8) (MAYBE) Test functions could be renamed in big-endian style,
> to give these similar functions names which are more similar.
> It's a minor point.
>
> mean_test --> test_mean
> means_difference_test --> test_means_difference
> variance_test --> test_variance
> variance_ratio_test --> test_variance_ratio
> sign_test --> test_sign
> signed_rank_test --> test_signed_rank
> normality_test --> test_normality

OK, I'll put changes (5), (6), (7), and (8) on my to-do list.

Thanks for your comments. I'm interested in reading these and other opinions before writing more tests.

Mario

--
Mario Rodriguez Riotorto
www.biomates.net
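The "third alternative" proposed in this reply could look roughly like the sketch below: the package owns two switches, converts to floats only when its own flag says so, and binds fpprintprec only while it prints. This is an illustration of the idea, not the package's code; show_statistic is a hypothetical helper, and only the two variable names and their defaults come from the message.

```maxima
/* Package-level switches with the defaults proposed in the message. */
stats_numer   : true$
stats_fpprint : 7$

/* Hypothetical helper: prints one result under the package's own
   settings. fpprintprec is bound only inside the block, and the
   global numer is never assigned. */
show_statistic(label, value) :=
  block([fpprintprec : stats_fpprint],
    print(label, "=", if stats_numer then float(value) else value))$

show_statistic("p_value", 1/3)$
[numer, fpprintprec];    /* globals as they were before the call */
```

A user who prefers exact results would set stats_numer : false once, without affecting how the rest of the session evaluates or displays numbers.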