after PCA, the pc values are so large, wrong?

rm(list=ls())
yx.df<-read.csv("c:/MK-2-72.csv",sep=',',header=T,dec='.')
dim(yx.df)
#get X matrix
y<-yx.df[,1]
x<-yx.df[,2:643]
#convert to matrix
mat<-as.matrix(x)
#get row number
rownum<-nrow(mat)
#remove the constant parameters
mat1<-mat[,apply(mat,2,function(.col)!(all(.col[1]==.col[2:rownum])))]
dim(yx.df)
dim(mat1)
#remove columns where the fraction of zeros is >0.95
mat2<-mat1[,apply(mat1,2,function(.col)!(sum(.col==0)/rownum>0.95))]
dim(yx.df)
dim(mat2)
#remove columns with sd<0.5
mat3<-mat2[,apply(mat2,2,function(.col)!all(sd(.col)<0.5))]
dim(yx.df)
dim(mat3)
#PCA analysis
mat3.pr<-prcomp(mat3,cor=T)
summary(mat3.pr,loading=T)
pre.cmp<-predict(mat3.pr)
cmp<-pre.cmp[,1:3]
cmp
DF<-cbind(y,cmp)
DF<-as.data.frame(DF)
names(DF)<-c('y','p1','p2','p3')
DF
summary(lm(y~p1+p2+p3,data=DF))
mat3.pr<-prcomp(DF,cor=T)
summary(mat3.pr)
pre<-predict(mat3.pr)
pre1<-pre[,1:3]
pre1
colnames(pre1)<-c("x1","x2","x3")
pre1
pc<-cbind(y,pre1)
pc<-as.data.frame(pc)
lm.pc<-lm(y~x1+x2+x3,data=pc)
summary(lm.pc)

Above is my PCA code. After running it, the first three PCs are very large. Why? The R-squared of the fit is also poor. Below are my values for the first three PCs.
> pre1
              PC1          PC2          PC3
 [1,] -15181.5190  1944.392700 -1074.326182
 [2,] -32152.4533  1007.113729  3201.361408
 [3,] -15836.5362  2117.988273  -555.799383
 [4,]  -1618.5561  1481.020337   255.530132
 [5,]  -5407.5030  1975.779398   -84.646283
 [6,]  -9662.1949  2611.220928  -417.435782
 [7,] -30488.2102   577.385588  1853.420297
 [8,]  -2135.2563 -4506.112873  1382.413284
 [9,]  -1584.2796 -4645.142062   929.146895
[10,]   -668.7664 -4876.250486   177.691446
[11,]  -2188.5914 -4495.203080  1432.428127
[12,] -19633.9581  2159.000138 -1598.710872
[13,] -26849.1088  -515.574085 -2683.552623
[14,]  -9492.9503 -4868.648205  1236.986097
[15,] -13857.6517 -4810.228193  1296.342199
[16,] -11596.5097 -8181.631403   462.913210
[17,] -25948.6564  -746.442386 -3415.426682
[18,]  15386.4477   709.974524   555.160973
[19,]  21642.7516  1163.456075  -609.437740
[20,]  22236.7094   675.562564  -136.992578
[21,]  14354.9927   611.996274    -4.867054
[22,]  12569.9493  1111.842240   585.540985
[23,]  20739.0219  3078.679745  1662.902248
[24,]   9472.0249   648.769910   381.487034
[25,]  17299.5307  1424.712428  1522.311676
[26,]  13231.2735   587.761915   170.448061
[27,]  10843.5590   705.485396   -79.931518
[28,]   9402.8803 -1978.216853 -1534.244078
[29,]  13094.9525   212.042937  -363.941664
[30,]   9337.3522   537.885230   189.558999
[31,]   7747.1347  -141.004825 -1664.082447
[32,]   4640.1161 -1489.652284 -3584.574135
[33,]  13241.5054   175.630689  -486.250927
[34,]   3867.2204   814.830143  1584.358007
[35,]   8614.5030   708.274447   814.295587
[36,] -18815.6774  -480.311541  1248.369916
[37,]  -1860.0810  1195.557861   269.322703
[38,]   7172.0057     4.216905 -1191.448702
[39,]  -7233.2271 -2361.951658  -235.293358
[40,]   1841.3548  1187.225488   632.116420
[41,]  12465.2336   367.822405   160.751014
[42,] -39021.7259  1972.333778  3167.504098
[43,]  13098.7736  -424.152058  -567.846037
[44,]   9793.7729  -559.084900  -210.696126
[45,]  13111.1861    22.772626  -318.242722
[46,]  13169.0604     7.808885  -363.995563
[47,]   3306.6293  -694.908211  -642.996604
[48,]  10779.8582  -989.175596 -1619.861931
[49,]  10872.6913  -747.979343 -1375.317959
[50,]  -3057.5633  1838.449143  1454.886518
[51,]  -6854.9316  2338.753165  1113.510561
[52,] -15077.1823  1917.776905 -1158.158633
[53,] -45862.8305  1173.157521 -1707.293955
[54,] -14294.1553  1716.708462 -1794.064434
[55,]  24645.0508  2519.904889  1424.233563
[56,]  23303.5998  2250.088386   839.587354
[57,]  18865.5231   897.566446    36.240598
[58,]    227.2659 -6582.661199  -712.892569
[59,]  15336.8371   722.953549   593.903314
[60,]  13030.8715   228.509670  -312.933654
[61,]   5826.0388   331.077814   -53.417878
[62,]  13150.4446  -437.612023  -608.342969
[63,]  11728.3897   -83.151510   569.007995
[64,]  11021.5720  -869.425283 -1216.724017
[65,]   9625.3142   137.388994   138.735249
[66,] -15905.2704  3735.547166   421.846379
[67,] -15539.7628  3331.399648   104.886572
[68,]  -2294.9924  1648.164750   822.075221
[69,] -10120.0153  1558.766306  -333.378256
[70,] -24241.4554  -533.700229  1516.603088
[71,]  -1036.6022 -4782.136067   475.195011
[72,] -24575.2244  2655.599986 -1965.946921

The fit result is below:

Call:
lm(formula = y ~ x1 + x2 + x3, data = pc)

Residuals:
     Min       1Q   Median       3Q      Max
-1.29638 -0.47622  0.01059  0.49268  1.69335

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.613e+00  8.143e-02  68.932  < 2e-16 ***
x1          -3.089e-05  5.150e-06  -5.998 8.58e-08 ***
x2          -4.095e-05  3.448e-05  -1.188    0.239
x3          -8.106e-05  6.412e-05  -1.264    0.210
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.691 on 68 degrees of freedom
Multiple R-squared: 0.3644,  Adjusted R-squared: 0.3364
F-statistic: 12.99 on 3 and 68 DF,  p-value: 8.368e-07

x2 and x3 are not significant. In principle, after PCA the PCs should be significant, but with my data they are not. Why?
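
On the size of the scores, one thing that may be worth checking: `cor` is an argument of `princomp()`, whereas `prcomp()` uses `scale.` to request a PCA on standardized columns. If the columns are only centered, the scores stay in the units of the original variables and can be very large. The sketch below uses simulated data (the matrix `X` and its column scales are made up for illustration and are not the poster's file) to compare the two settings.

```r
set.seed(1)
n <- 72

# Hypothetical predictors on very different scales (not the poster's data)
X <- cbind(a = rnorm(n, sd = 10000),
           b = rnorm(n, sd = 100),
           c = rnorm(n, sd = 1))

pca_raw    <- prcomp(X)                 # centered only (the default)
pca_scaled <- prcomp(X, scale. = TRUE)  # centered and scaled to unit variance

# Standard deviations of the component scores
sd_raw    <- pca_raw$sdev
sd_scaled <- pca_scaled$sdev
print(round(sd_raw, 2))     # dominated by the large-scale column
print(round(sd_scaled, 2))  # all of order 1

# With scaling, the component variances sum to the number of columns
total_var_scaled <- sum(sd_scaled^2)
print(total_var_scaled)
```

Whether scaling is appropriate depends on the variables; it is a modelling choice, not a correction that always applies.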
Re: after PCA, the pc values are so large, wrong?

bbslover <dluthm <at> yeah.net> writes:

> [snip]
> the fit result below:
> Call:
> lm(formula = y ~ x1 + x2 + x3, data = pc)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -1.29638 -0.47622  0.01059  0.49268  1.69335
>
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)
> (Intercept)  5.613e+00  8.143e-02  68.932  < 2e-16 ***
> x1          -3.089e-05  5.150e-06  -5.998 8.58e-08 ***
> x2          -4.095e-05  3.448e-05  -1.188    0.239
> x3          -8.106e-05  6.412e-05  -1.264    0.210
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.691 on 68 degrees of freedom
> Multiple R-squared: 0.3644,  Adjusted R-squared: 0.3364
> F-statistic: 12.99 on 3 and 68 DF,  p-value: 8.368e-07
>
> x2,x3 is not significance. by pricipal, after PCA, the pcs should
> significance, but my data is not, why?

Why is it necessary that the first few principal components should
have significant relationships with some other response values?
The strength, and weakness, of PCA is that it is calculated
*without regard* to a response variable, so it does not constitute
"data snooping" ...

I may of course have misinterpreted your question, but at a quick
look, I don't see anything obviously wrong here.

______________________________________________
R-help@... mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
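
To illustrate the point that PCA is computed without regard to the response, here is a minimal sketch on simulated data. The variables `x1`, `x2`, and `y` are hypothetical: `y` is built to depend only on the low-variance predictor, so the first component (which captures the most variance) explains almost none of `y`, while the second explains most of it.

```r
set.seed(42)
n <- 200

# Hypothetical predictors: x1 has large variance, x2 has small variance
x1 <- rnorm(n, sd = 10)
x2 <- rnorm(n, sd = 1)

# The response depends only on the low-variance predictor
y <- 2 * x2 + rnorm(n, sd = 0.5)

# Unscaled PCA: PC1 is essentially x1, PC2 is essentially x2
pca <- prcomp(cbind(x1, x2))
pc1 <- pca$x[, "PC1"]
pc2 <- pca$x[, "PC2"]

# R-squared of y regressed on each component separately
r2_pc1 <- summary(lm(y ~ pc1))$r.squared
r2_pc2 <- summary(lm(y ~ pc2))$r.squared
print(c(PC1 = r2_pc1, PC2 = r2_pc2))
```

The ordering of components reflects variance in the predictors only, so a leading component need not be a useful regressor.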
Re: after PCA, the pc values are so large, wrong?

OK, I understand what you mean. Maybe PLS is better suited to my aim, but I have already tried it and the results were also poor. My main question is how to select a smaller set of independent variables to fit the dependent variable. A GA may be a good way to do this, but I have not learned it well.
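
As one possible starting point for selecting fewer variables, the sketch below uses plain forward stepwise selection with base R's `step()`. This is neither PLS nor a genetic algorithm, just a simpler technique swapped in for illustration. The data are simulated (the names `v1` to `v20` and the true model are assumptions), and with many more candidate columns than rows a procedure like this can overfit, so any selected set should be checked on held-out data.

```r
set.seed(7)
n <- 72
p <- 20

# Hypothetical predictors v1..v20; only v3 and v11 truly affect y
X <- as.data.frame(matrix(rnorm(n * p), n, p))
names(X) <- paste0("v", 1:p)
y <- 1.5 * X$v3 - 2 * X$v11 + rnorm(n, sd = 0.5)
dat <- cbind(y = y, X)

# Start from the intercept-only model and add terms by AIC
null_fit <- lm(y ~ 1, data = dat)
full_formula <- reformulate(names(X), response = "y")
fwd <- step(null_fit,
            scope = list(lower = ~ 1, upper = full_formula),
            direction = "forward", trace = 0)

# Names of the variables that were kept
selected <- attr(terms(fwd), "term.labels")
print(selected)
```

AIC-based selection may also keep a few noise variables, which is one reason to validate the result separately.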