diff in a dataframe

by Vishal Belsare

I have a dataframe say:

date   price_g   price_s
         0.34        0.56
         0.36        0.76
           .              .
           .              .
           .              .

and so on, say 1000 rows.

Is it possible to add two columns to this dataframe, by computing say
diff(log(price_g)) and diff(log(price_s))?

The elements in the first row of these columns cannot be computed, but
can I coerce this to happen and assign a missing value there? It would
be really great if I could do that, because in this case I don't have
to re-index my transformed series to the dates again in a new
dataframe.

Thanks in anticipation.


Vishal Belsare

______________________________________________
R-help@... mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: diff in a dataframe

by Henrique Dallazuanna

Perhaps you can do this:

cbind(df, sapply(rbind(c(NA, NA),log(df)), diff))

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" W

Re: diff in a dataframe

by bartjoosen


Maybe:
cbind(df,rbind(NA,apply(log(df),2,diff)))
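
Both one-liners so far call log() on the whole data frame, so they seem to assume that every column is numeric. With the date column from the question present, log(df) would stop with an error. A minimal sketch that restricts the calculation to the price columns; the sample dates, the third row and the dlog_ prefix are made up for illustration:

```r
# Hypothetical data in the shape described in the question
df <- data.frame(
  date    = as.Date(c("2008-01-07", "2008-01-08", "2008-01-09")),
  price_g = c(0.34, 0.36, 0.35),
  price_s = c(0.56, 0.76, 0.70)
)

# Keep only the numeric price columns for the log/diff step
num <- df[c("price_g", "price_s")]

# apply() returns an (n - 1) x 2 matrix of log differences;
# rbind(NA, ...) adds a leading row of NA so the row count matches df
dlog <- rbind(NA, apply(log(num), 2, diff))
colnames(dlog) <- paste0("dlog_", colnames(num))

out <- cbind(df, dlog)
print(out)
```

The date column passes through untouched, and the new columns get distinct names instead of repeating price_g and price_s.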


Bart


Re: diff in a dataframe

by Don MacQueen

So, what's the easiest way to add a column to a dataframe? Just do it.

Here is a really simple example to illustrate:

> foo <- data.frame(x=1:4,y=2:5)
> foo
  x y
1 1 2
2 2 3
3 3 4
4 4 5
> foo$z <- c(NA,diff(foo$x))
> foo
  x y  z
1 1 2 NA
2 2 3  1
3 3 4  1
4 4 5  1


Solutions using apply and sapply are overly complicated for what you requested.
If you had to do this for many, many columns a looping solution would
be worth it, but for just two columns, it's not.
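
Applied to the columns in the question, the same idea might look like the sketch below. The data frame, the third row and the names dlog_g, dlog_s and dlog_<column> are assumptions for illustration; the loop is the variant one might reach for with many columns.

```r
# Hypothetical stand-in for the data frame in the question
df <- data.frame(price_g = c(0.34, 0.36, 0.35),
                 price_s = c(0.56, 0.76, 0.70))

# Two columns: one plain assignment each, with NA for the first row
two <- df
two$dlog_g <- c(NA, diff(log(two$price_g)))
two$dlog_s <- c(NA, diff(log(two$price_s)))

# Many columns: the same assignment inside a loop over column names
many <- df
for (col in names(df)) {
  many[[paste0("dlog_", col)]] <- c(NA, diff(log(many[[col]])))
}

print(two)
print(many)
```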

-Don



--
--------------------------------------
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062

Re: diff in a dataframe

by silcha

Hello Vishal,
Maybe this is what you want?

x <- c(0.34, 0.36, 3)
y <- c(0.56, 0.76, 4)
a <- data.frame(x, y)
> a
     x    y
1 0.34 0.56
2 0.36 0.76
3 3.00 4.00

a$diff <- c(NA, diff(log(a$x)))
a$diff2 <- c(NA, diff(log(a$y)))
> a
     x    y       diff     diff2
1 0.34 0.56         NA        NA
2 0.36 0.76 0.05715841 0.3053816
3 3.00 4.00 2.12026354 1.6607312

The leading NA fills the first row of the new columns, where no
difference can be computed. If you want to coerce that first row to
some other value, assign to the new columns only, for example
a[1, c("diff", "diff2")] <- 0


Cheers
A
 


----- Original message -----
From: Vishal Belsare <shoot.spam@...>
To: r-help@...
Sent: Wednesday, 9 January 2008, 23:16:38
Subject: [R] diff in a dataframe

Re: diff in a dataframe

by Gabor Grothendieck

Represent this as a time series.  Using
the zoo package:

> library(zoo)
> z <- zoo(cbind(price_g = c(0.34, 0.36), price_s = c(0.56, 0.76)), as.Date(c("2000-01-01", "2000-01-05")))
> diff(log(z))
              price_g   price_s
2000-01-05 0.05715841 0.3053816
> diff(log(z), na.pad = TRUE)
              price_g   price_s
2000-01-01         NA        NA
2000-01-05 0.05715841 0.3053816


See the two zoo vignettes:
vignette("zoo")
vignette("zoo-quickref")
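
If the data already sit in a data frame with a date column, one possible route into zoo and back is sketched below. The sample values, the third row and the dlog_ column names are assumptions for illustration, and it requires the zoo package to be installed.

```r
library(zoo)

# Hypothetical data frame in the shape described in the question
df <- data.frame(
  date    = as.Date(c("2000-01-01", "2000-01-05", "2000-01-06")),
  price_g = c(0.34, 0.36, 0.35),
  price_s = c(0.56, 0.76, 0.70)
)

# Build a zoo series from the numeric columns, indexed by the dates
z <- zoo(as.matrix(df[c("price_g", "price_s")]), order.by = df$date)

# Log differences, padded with NA so the index still covers every date
r <- diff(log(z), na.pad = TRUE)
colnames(r) <- paste0("dlog_", colnames(z))

# Prices and log differences side by side on the shared date index
both <- merge(z, r)

# Back to a plain data frame with an explicit date column
out <- data.frame(date = index(both), coredata(both))
print(out)
```

Because the differences stay attached to their dates, nothing has to be re-indexed by hand afterwards.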

Re: diff in a dataframe

by Vishal Belsare

Thanks Henrique, Don and Gabor. I did come around to solving it by
using the zoo library, which is very useful for handling a bunch
of long irregular time series.

Gabor, thanks for your present and previous responses. The quickref
was indeed helpful. I do have another question regarding zoo, however.
Say I have a zoo object named X that holds 200 time series, each with
a unique column name. I tried to retrieve just one column (time
series) from X, variously trying X$columnname, X["columnname"] and
X[["columnname"]], in the hope that I would get that one time series,
but the only form which seemed to work is
X[,"columnname"]

Is that the 'correct' way to retrieve a single time series from a zoo
object holding multiple time series? I would think that it'd be cooler
if we could simply write X$columnname. Please enlighten me. Thanks
much!

Cheers,

Vishal


Re: diff in a dataframe

by Gabor Grothendieck

This is consistent with how matrices and ts series in R work: they all
use x[,j] only.
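
The matrix side of that comparison can be checked in base R without zoo. A small sketch, using a made-up two-column matrix with the column names from the thread; a current zoo release may behave differently from what is described here for the list-style forms.

```r
# A plain numeric matrix with named columns
m <- cbind(price_g = c(0.34, 0.36), price_s = c(0.56, 0.76))

# Column selection by name uses the [row, column] form
col_g <- m[, "price_g"]
print(col_g)

# The list-style forms do not select a matrix column:
# $ and [[ ]] by name raise errors, caught here so the script keeps running
res_dollar <- tryCatch(m$price_g, error = function(e) "error")
res_double <- tryCatch(m[["price_g"]], error = function(e) "error")

# Single-bracket indexing by name treats m as a plain vector, which has
# no element called "price_g", so the result is NA
res_single <- m["price_g"]

print(res_dollar)
print(res_double)
print(res_single)
```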
