disaggregate frequency table into flat file


by maiya

I apologise for the triviality of this post, but I've been searching the forum without luck, probably simply because it's late and my brain is starting to go.

I have a frequency table as a matrix:

orig<-matrix(c(40,5,30,25), c(2,2))
orig
     [,1] [,2]
[1,]   40   30
[2,]    5   25

I basically need a random sample of, say, 10 from the 100:

     [,1] [,2]
[1,]    5    2
[2,]    0    3

I got as far as

orig<-as.data.frame.table(orig)
orig
 Var1 Var2 Freq
1    A    A   10
2    B    A    5
3    A    B   30
4    B    B   25

and then perhaps

individ<-rep(1:4, times=orig$Freq)

which gives a vector of the 100 individuals, each labelled with its group (cell) from 1 to 4, but I'm
(a) stuck here and
(b) afraid this is a very roundabout way of getting to what I want, i.e. I can now sample(individ, 10), but then I'll have a heck of a time getting the result back into the original matrix form.

Sorry again, please just tell me the simple solution that I've missed.

thanks!

maja

Re: disaggregate frequency table into flat file

by jholtman

Not exactly clear what you are asking for.  Your data.frame.table does not
seem related to the original 'orig'.  What exactly are you expecting as
output?

On Wed, May 21, 2008 at 10:16 PM, maiya <maja.zaloznik@...> wrote:

> [snip]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

______________________________________________
R-help@... mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: disaggregate frequency table into flat file

by maiya

Sorry, my mistake!
The data frame should read:
orig<-as.data.frame.table(orig)
orig
 Var1 Var2 Freq
1    A    A   40
2    B    A    5
3    A    B   30
4    B    B   25

But basically I would simply like a sample of the original matrix (which is a frequency table/contingency table/cross-tabulation).

Hope this is clearer now!

maja

Re: disaggregate frequency table into flat file

by Charilaos Skiadas-3

On May 22, 2008, at 8:56 AM, maiya wrote:

>
> sorry, my mistake!
> the data frame should read:
> orig<-as.data.frame.table(orig)
> orig
>  Var1 Var2 Freq
> 1    A    A   40
> 2    B    A    5
> 3    A    B   30
> 4    B    B   25
>
> but basically i would simply like a sample of the original matrix (which is
> a frequency table/contingency table/crosstabulation)
>
> hope this is clearer now!
>
This should get you started:

with(list(x=sample(1:4, 10, prob=orig$Freq, replace=TRUE)),
     sapply(1:4, function(k) sum(x==k)))

Or you can break it up in two steps (at the cost of creating a new variable):

x <- sample(1:4, 10, prob=orig$Freq, replace=TRUE)
sapply(1:4, function(k) sum(x==k))
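
Note that sample() with prob= and replace=TRUE draws the ten cell labels with replacement, i.e. a multinomial sample, which is close to but not the same as picking 10 of the 100 individuals without replacement. Below is a minimal sketch of the same idea that also folds the counts back into the 2 x 2 layout. It assumes the cells are stored column by column, as as.data.frame.table() produces them; the names orig_mat, counts and samp are made up for this illustration and the seed is arbitrary.

```r
# Assumed setup, rebuilt from the posts above so the snippet runs on its own.
orig_mat <- matrix(c(40, 5, 30, 25), nrow = 2, ncol = 2)
orig <- as.data.frame.table(orig_mat)

set.seed(1)  # arbitrary seed, only so the draw can be repeated

# One multinomial draw of 10 individuals; prob is rescaled internally,
# so the raw frequencies can be passed as they are.
counts <- rmultinom(1, size = 10, prob = orig$Freq)

# The four counts are in column-major cell order, so they can be
# folded straight back into the shape of the original matrix.
samp <- matrix(counts, nrow = nrow(orig_mat), ncol = ncol(orig_mat))
samp
```

Because the draw is with replacement, a cell of the sample could in principle exceed the corresponding cell of the original table when the sample size is large relative to the counts.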

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College


Re: disaggregate frequency table into flat file

by Marc Schwartz


Is this what you want?

> xtabs(Freq ~ Var1 + Var2, data = orig)
    Var2
Var1  A  B
   A 40 30
   B  5 25

See ?xtabs
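
To see that xtabs() undoes as.data.frame.table(), here is a small round-trip check. It is only a sketch: orig_mat, flat, back and back_mat are illustrative names, not objects from the posts above.

```r
# Frequency table as a plain matrix, as in the original question.
orig_mat <- matrix(c(40, 5, 30, 25), nrow = 2, ncol = 2)

# Long format: one row per cell, with columns Var1, Var2 and Freq.
flat <- as.data.frame.table(orig_mat)

# Back to a two-way table by summing Freq within each Var1/Var2 cell.
back <- xtabs(Freq ~ Var1 + Var2, data = flat)

# Drop the table attributes so the values can be compared cell by cell.
back_mat <- matrix(as.vector(back), nrow = 2, ncol = 2)
all(back_mat == orig_mat)
```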


Or is this what you want?

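expand.dft <- function(x, na.strings = "NA", as.is = FALSE, dec = ".")
{
   # Repeat row i of the frequency table x$Freq[i] times, one row per case
   DF <- lapply(seq_len(nrow(x)),
                function(i) x[rep(i, each = x$Freq[i]), ])

   # Stack the pieces into one data frame and drop the count column
   DF <- subset(do.call("rbind", DF), select = -Freq)

   # Re-derive each column's type from its character representation
   for (i in seq_len(ncol(DF)))
   {
     DF[[i]] <- type.convert(as.character(DF[[i]]),
                             na.strings = na.strings,
                             as.is = as.is, dec = dec)
   }

   DF
}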



DF <- expand.dft(orig)

> str(DF)
'data.frame': 100 obs. of  2 variables:
  $ Var1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
  $ Var2: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...


HTH,

Marc Schwartz



Re: disaggregate frequency table into flat file

by maiya

Marc, it's the second, "expansion" type of transformation I was after, although your expand.dft looks quite complicated. Here's what I finally came up with; I think the lines that build flat correspond to what expand.dft does?


> orig<-matrix(c(40,5,30,25), c(2,2))
> orig
     [,1] [,2]
[1,]   40   30
[2,]    5   25
> flat<-as.data.frame.table(orig)
> ind<-rep(1:nrow(flat), times=flat$Freq)
> flat<-flat[ind,-3]

> sample<-matrix(table(flat[sample(1:length(ind),10),]), c(2,2))
> sample
     [,1] [,2]
[1,]    4    2
[2,]    1    3

So I get from the orig matrix to the sample matrix by expanding it and then contracting it back again!

It's just that I was hoping there was a more direct way of doing it!
Thanks!
maja
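
One somewhat more direct route, sketched here on the assumption that sampling without replacement is what is wanted, is to expand only a vector of cell indices (not a data frame) and count the sampled indices with tabulate(). The names cells, picked and samp are made up for this illustration, and the seed is arbitrary. The result is called samp rather than sample so that the base function is not masked.

```r
# Frequency table as a plain matrix, as in the original question.
orig <- matrix(c(40, 5, 30, 25), nrow = 2, ncol = 2)

set.seed(42)  # arbitrary seed, only so the draw can be repeated

# One cell index (1..4, column-major) per individual: 100 values in all.
cells <- rep(seq_along(orig), times = as.vector(orig))

# Draw 10 individuals without replacement.
picked <- sample(cells, size = 10)

# Count how often each cell was drawn (nbins keeps empty cells as 0)
# and fold the counts back into the shape of the original matrix.
samp <- matrix(tabulate(picked, nbins = length(orig)),
               nrow = nrow(orig), ncol = ncol(orig))
samp
```

Since the draw is without replacement, no cell of samp can exceed the corresponding cell of orig.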