|
View:
New views
20 Messages
—
Rating Filter:
Alert me
|
| < Prev | 1 - 2 | Next > |
|
|
Staging area for data before read into RI am wondering if there is a better alternative than Excel for data
storage that does not require database knowledge (I will eventually have to learn this, but it is not on my immediate todo list). I need something that is not limited to 256 columns... I don't need any of the built in functions in excel just a spreadsheet like program with cells that hold data in a data.frame format for a staging area before I get it into R. Any help would be greatly appreciated. This is not a direct r question, but all of you folks have more experience than I do and I am having a time finding what I need with google. thanks in advance -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RYou didn't say which version of Excel you are using but Excel 2007
allows 16,384 columns. On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> wrote: > I am wondering if there is a better alternative than Excel for data > storage that does not require database knowledge (I will eventually > have to learn this, but it is not on my immediate todo list). I need > something that is not limited to 256 columns... I don't need any of > the built in functions in excel just a spreadsheet like program with > cells that hold data in a data.frame format for a staging area before > I get it into R. Any help would be greatly appreciated. This is not > a direct r question, but all of you folks have more experience than I > do and I am having a time finding what I need with google. > thanks in advance > > -- > Stephen Sefick > Research Scientist > Southeastern Natural Sciences Academy > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into Rsorry excel 2003 with no immediate update in the future.
On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck <ggrothendieck@...> wrote: > You didn't say which version of Excel you are using but Excel 2007 > allows 16,384 columns. > > On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> wrote: >> I am wondering if there is a better alternative than Excel for data >> storage that does not require database knowledge (I will eventually >> have to learn this, but it is not on my immediate todo list). I need >> something that is not limited to 256 columns... I don't need any of >> the built in functions in excel just a spreadsheet like program with >> cells that hold data in a data.frame format for a staging area before >> I get it into R. Any help would be greatly appreciated. This is not >> a direct r question, but all of you folks have more experience than I >> do and I am having a time finding what I need with google. >> thanks in advance >> >> -- >> Stephen Sefick >> Research Scientist >> Southeastern Natural Sciences Academy >> >> Let's not spend our time and resources thinking about things that are >> so little or so large that all they really do for us is puff us up and >> make us feel like gods. We are mammals, and have not exhausted the >> annoying little problems of being mammals. >> >> -K. Mullis >> >> ______________________________________________ >> R-help@... mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RDefine "better".
Really, it depends on what you need to do (are all your data appropriately represented in a 2D array?) and what resources are available. If all your data can be represented using a 2D array, then Excel is probably your best bet for th enear term. If not, you might as well bite the bullit and learn to use an RDBMS, as there are few other data management options that can cope with relational or hierarchical or object oriented data. I use a number of different RDBMS (ranging from MS SQL to PostgreSQL and MySQL). I also use Excel on occasion, and plain text editors (like Emacs), to create CSV files. Which I use depends on the details of the particular problem I am facing. While I have not yet explored them, I did notice that R includes a number of facilities for editing data (and the list of options is all the longer when I use help.search("edit"). It may be a bit quicker for you to study up on basic use of something like PostgreSQL, combined with pl/r (something I wish MySQL had), than it would be to diligently examine all the different options open to you using R. (I have a couple books I could recommend that would likely be sufficient for you to figure out what you need to do with either PostgreSQL or MySQL in a matter of a week or two). HTH Ted
|
|
|
|
|
|
Re: Staging area for data before read into RHi Stephen,
You don't say what "staging" is - do you mean for data entry or loading a data file for review, or ... ? In general, I keep away from Excel for data transfer purposes. It tends to make "intelligent" decisions on data types leading to strange & bizarre results (unless you explicitly type each column - which most users don't do). Integers are interpreted as dates, high order zeros are stripped off of ZIP codes, and the like. HTH, Jim Porzak TGN.com San Francisco, CA http://www.linkedin.com/in/jimporzak useR Group SF: http://ia.meetup.com/67/ On Mon, Oct 20, 2008 at 11:27 AM, stephen sefick <ssefick@...> wrote: > I am wondering if there is a better alternative than Excel for data > storage that does not require database knowledge (I will eventually > have to learn this, but it is not on my immediate todo list). I need > something that is not limited to 256 columns... I don't need any of > the built in functions in excel just a spreadsheet like program with > cells that hold data in a data.frame format for a staging area before > I get it into R. Any help would be greatly appreciated. This is not > a direct r question, but all of you folks have more experience than I > do and I am having a time finding what I need with google. > thanks in advance > > -- > Stephen Sefick > Research Scientist > Southeastern Natural Sciences Academy > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RThere is a list of free spreadsheets with their row and column limits
at this link: http://en.wikipedia.org/wiki/OpenOffice.org_Calc On Mon, Oct 20, 2008 at 3:13 PM, stephen sefick <ssefick@...> wrote: > sorry excel 2003 with no immediate update in the future. > > On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck > <ggrothendieck@...> wrote: >> You didn't say which version of Excel you are using but Excel 2007 >> allows 16,384 columns. >> >> On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> wrote: >>> I am wondering if there is a better alternative than Excel for data >>> storage that does not require database knowledge (I will eventually >>> have to learn this, but it is not on my immediate todo list). I need >>> something that is not limited to 256 columns... I don't need any of >>> the built in functions in excel just a spreadsheet like program with >>> cells that hold data in a data.frame format for a staging area before >>> I get it into R. Any help would be greatly appreciated. This is not >>> a direct r question, but all of you folks have more experience than I >>> do and I am having a time finding what I need with google. >>> thanks in advance >>> >>> -- >>> Stephen Sefick >>> Research Scientist >>> Southeastern Natural Sciences Academy >>> >>> Let's not spend our time and resources thinking about things that are >>> so little or so large that all they really do for us is puff us up and >>> make us feel like gods. We are mammals, and have not exhausted the >>> annoying little problems of being mammals. >>> >>> -K. Mullis >>> >>> ______________________________________________ >>> R-help@... mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> > > > > -- > Stephen Sefick > Research Scientist > Southeastern Natural Sciences Academy > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RWell, I am going to type in ever value because the data sheets are of
counts of insects that I identified, so I should be okay with accuracy... I really just need something that allows for more than 256 columns as I have encounter over 256 species of insects in even small streams. I think calc with it's 1000ish columns will do the trick... thanks everbody for your help. On Mon, Oct 20, 2008 at 8:25 PM, Gabor Grothendieck <ggrothendieck@...> wrote: > There is a list of free spreadsheets with their row and column limits > at this link: > http://en.wikipedia.org/wiki/OpenOffice.org_Calc > > On Mon, Oct 20, 2008 at 3:13 PM, stephen sefick <ssefick@...> wrote: >> sorry excel 2003 with no immediate update in the future. >> >> On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck >> <ggrothendieck@...> wrote: >>> You didn't say which version of Excel you are using but Excel 2007 >>> allows 16,384 columns. >>> >>> On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> wrote: >>>> I am wondering if there is a better alternative than Excel for data >>>> storage that does not require database knowledge (I will eventually >>>> have to learn this, but it is not on my immediate todo list). I need >>>> something that is not limited to 256 columns... I don't need any of >>>> the built in functions in excel just a spreadsheet like program with >>>> cells that hold data in a data.frame format for a staging area before >>>> I get it into R. Any help would be greatly appreciated. This is not >>>> a direct r question, but all of you folks have more experience than I >>>> do and I am having a time finding what I need with google. >>>> thanks in advance >>>> >>>> -- >>>> Stephen Sefick >>>> Research Scientist >>>> Southeastern Natural Sciences Academy >>>> >>>> Let's not spend our time and resources thinking about things that are >>>> so little or so large that all they really do for us is puff us up and >>>> make us feel like gods. We are mammals, and have not exhausted the >>>> annoying little problems of being mammals. >>>> >>>> -K. Mullis >>>> >>>> ______________________________________________ >>>> R-help@... mailing list >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>> >> >> >> >> -- >> Stephen Sefick >> Research Scientist >> Southeastern Natural Sciences Academy >> >> Let's not spend our time and resources thinking about things that are >> so little or so large that all they really do for us is puff us up and >> make us feel like gods. We are mammals, and have not exhausted the >> annoying little problems of being mammals. >> >> -K. Mullis >> > -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into R> Well, I am going to type in ever value because the data sheets are of
> counts of insects that I identified, so I should be okay with > accuracy... I really just need something that allows for more than > 256 columns as I have encounter over 256 species of insects in even > small streams. ... Oh, ugh, that sounds difficult and prone to entry errors. You might be better off organizing in the 'long' format with 4-5 columns: siteID, subSiteID, species, count, comments You can then reshape(), or use the reshape package, or even the 'pivot table' available in excel and other spreadcheats. Glad you have an answer. Enjoy your day. cur -- Curt Seeliger, Data Ranger Raytheon Information Services - Contractor to ORD seeliger.curt@... 541/754-4638 r-help-bounces@... wrote on 10/20/2008 06:01:43 PM: > Well, I am going to type in ever value because the data sheets are of > counts of insects that I identified, so I should be okay with > accuracy... I really just need something that allows for more than > 256 columns as I have encounter over 256 species of insects in even > small streams. I think calc with it's 1000ish columns will do the > trick... thanks everbody for your help. > > On Mon, Oct 20, 2008 at 8:25 PM, Gabor Grothendieck > <ggrothendieck@...> wrote: > > There is a list of free spreadsheets with their row and column limits > > at this link: > > http://en.wikipedia.org/wiki/OpenOffice.org_Calc > > > > On Mon, Oct 20, 2008 at 3:13 PM, stephen sefick <ssefick@...> > >> sorry excel 2003 with no immediate update in the future. > >> > >> On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck > >> <ggrothendieck@...> wrote: > >>> You didn't say which version of Excel you are using but Excel 2007 > >>> allows 16,384 columns. > >>> > >>> On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> wrote: > >>>> I am wondering if there is a better alternative than Excel for data > >>>> storage that does not require database knowledge (I will eventually > >>>> have to learn this, but it is not on my immediate todo list). I need > >>>> something that is not limited to 256 columns... I don't need any of > >>>> the built in functions in excel just a spreadsheet like program with > >>>> cells that hold data in a data.frame format for a staging area before > >>>> I get it into R. Any help would be greatly appreciated. This is not > >>>> a direct r question, but all of you folks have more experience than I > >>>> do and I am having a time finding what I need with google. > >>>> thanks in advance > >>>> > >>>> -- > >>>> Stephen Sefick > >>>> Research Scientist > >>>> Southeastern Natural Sciences Academy > >>>> > >>>> Let's not spend our time and resources thinking about things that are > >>>> so little or so large that all they really do for us is puff us up and > >>>> make us feel like gods. We are mammals, and have not exhausted the > >>>> annoying little problems of being mammals. > >>>> > >>>> -K. Mullis > >>>> > >>>> ______________________________________________ > >>>> R-help@... mailing list > >>>> https://stat.ethz.ch/mailman/listinfo/r-help > >>>> PLEASE do read the posting guide http://www.R-project. > org/posting-guide.html > >>>> and provide commented, minimal, self-contained, reproducible code. > >>>> > >>> > >> > >> > >> > >> -- > >> Stephen Sefick > >> Research Scientist > >> Southeastern Natural Sciences Academy > >> > >> Let's not spend our time and resources thinking about things that are > >> so little or so large that all they really do for us is puff us up > >> make us feel like gods. We are mammals, and have not exhausted the > >> annoying little problems of being mammals. > >> > >> -K. Mullis > >> > > > > > > -- > Stephen Sefick > Research Scientist > Southeastern Natural Sciences Academy > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RHow about simply a text editor, typing in your data, separated by commas
or tabs or spaces? One row for each case/subject/observation? R can read that in easily. A good, open-source, free data entry program is EpiData. www.epidata.dk. It is simple to use but probably more than you need for task. --Chris Christopher W. Ryan, MD SUNY Upstate Medical University Clinical Campus at Binghamton 40 Arch Street, Johnson City, NY 13790 cryanatbinghamtondotedu PGP public keys available at http://home.stny.rr.com/ryancw/ "If you want to build a ship, don't drum up the men to gather wood, divide the work and give orders. Instead, teach them to yearn for the vast and endless sea." [Antoine de St. Exupery] stephen sefick wrote: > I am wondering if there is a better alternative than Excel for data > storage that does not require database knowledge (I will eventually > have to learn this, but it is not on my immediate todo list). I need > something that is not limited to 256 columns... I don't need any of > the built in functions in excel just a spreadsheet like program with > cells that hold data in a data.frame format for a staging area before > I get it into R. Any help would be greatly appreciated. This is not > a direct r question, but all of you folks have more experience than I > do and I am having a time finding what I need with google. > thanks in advance > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RStephen,
One of the big problems with spreadsheets (other than the column limit in some) is that the standard entry mode allows too much flexibility which does nothing to help you avoid data entry errors. The Webpage: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html has some examples of this going wrong, including one that happened to my group where the column for dates was not preformatted, the dates were entered using European format, and Excel did 2 different wrong things with them making it very difficult to do anything with the data without major extra work. If you are going to stick with a spreadsheet, then at a minimum you should start by naming all your columns, then formatting each column based on the type of data you expect to be entered there. Going the database route is not that much to learn to get started. You can use MSAccess, or the OpenOffice database, create a new table and enter the names of each column along with the data type (this is a big advantage in that it will not allow you to enter character data where numbers are expected, forces dates to look like dates, etc.). It is not that much extra work to enter valid levels for what will become factors (e.g. "Male" and "Female" for sex, so that those are the only values allowed, my current record for datasets entered by others using spreadsheets is 9 sexes). Then you can pick up more as you go along, but setting up the first database to enter data should only take you an hour or so to learn the basics. Another option is to just use R, the following code gives one approach that could get you started entering data: tmp <- rep( list(character(0), numeric(0)), c(2,5) ) names(tmp) <- c( 'ID','Sex', paste('Stream', 1:5, sep='') ) tmp <- as.data.frame(tmp) levels(tmp$Sex) <- c("Female","Male") tmp$ID <- as.character(tmp$ID) mydata <- edit(tmp) Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow@... 801.408.8111 > -----Original Message----- > From: r-help-bounces@... [mailto:r-help-bounces@r- > project.org] On Behalf Of stephen sefick > Sent: Monday, October 20, 2008 7:02 PM > To: R Help > Subject: Re: [R] Staging area for data before read into R > > Well, I am going to type in ever value because the data sheets are of > counts of insects that I identified, so I should be okay with > accuracy... I really just need something that allows for more than > 256 columns as I have encounter over 256 species of insects in even > small streams. I think calc with it's 1000ish columns will do the > trick... thanks everbody for your help. > > On Mon, Oct 20, 2008 at 8:25 PM, Gabor Grothendieck > <ggrothendieck@...> wrote: > > There is a list of free spreadsheets with their row and column limits > > at this link: > > http://en.wikipedia.org/wiki/OpenOffice.org_Calc > > > > On Mon, Oct 20, 2008 at 3:13 PM, stephen sefick <ssefick@...> > wrote: > >> sorry excel 2003 with no immediate update in the future. > >> > >> On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck > >> <ggrothendieck@...> wrote: > >>> You didn't say which version of Excel you are using but Excel 2007 > >>> allows 16,384 columns. > >>> > >>> On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> > wrote: > >>>> I am wondering if there is a better alternative than Excel for > data > >>>> storage that does not require database knowledge (I will > eventually > >>>> have to learn this, but it is not on my immediate todo list). I > need > >>>> something that is not limited to 256 columns... I don't need any > of > >>>> the built in functions in excel just a spreadsheet like program > with > >>>> cells that hold data in a data.frame format for a staging area > before > >>>> I get it into R. Any help would be greatly appreciated. This is > not > >>>> a direct r question, but all of you folks have more experience > than I > >>>> do and I am having a time finding what I need with google. > >>>> thanks in advance > >>>> > >>>> -- > >>>> Stephen Sefick > >>>> Research Scientist > >>>> Southeastern Natural Sciences Academy > >>>> > >>>> Let's not spend our time and resources thinking about things that > are > >>>> so little or so large that all they really do for us is puff us up > and > >>>> make us feel like gods. We are mammals, and have not exhausted > the > >>>> annoying little problems of being mammals. > >>>> > >>>> -K. > Mullis > >>>> > >>>> ______________________________________________ > >>>> R-help@... mailing list > >>>> https://stat.ethz.ch/mailman/listinfo/r-help > >>>> PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > >>>> and provide commented, minimal, self-contained, reproducible code. > >>>> > >>> > >> > >> > >> > >> -- > >> Stephen Sefick > >> Research Scientist > >> Southeastern Natural Sciences Academy > >> > >> Let's not spend our time and resources thinking about things that > are > >> so little or so large that all they really do for us is puff us up > and > >> make us feel like gods. We are mammals, and have not exhausted the > >> annoying little problems of being mammals. > >> > >> -K. > Mullis > >> > > > > > > -- > Stephen Sefick > Research Scientist > Southeastern Natural Sciences Academy > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. > Mullis > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into Rthis is also the sort of thing that EpiData does very well. That's what
it was designed for: data entry with minimal errors. Also simplifies double data entry for error checking, if you need/want that. --Chris Christopher W. Ryan, MD SUNY Upstate Medical University Clinical Campus at Binghamton 40 Arch Street, Johnson City, NY 13790 cryanatbinghamtondotedu PGP public keys available at http://home.stny.rr.com/ryancw/ Greg Snow wrote: > Stephen, > > . . . . . > > Going the database route is not that much to learn to get started. You can use MSAccess, or the OpenOffice database, create a new table and enter the names of each column along with the data type (this is a big advantage in that it will not allow you to enter character data where numbers are expected, forces dates to look like dates, etc.). It is not that much extra work to enter valid levels for what will become factors (e.g. "Male" and "Female" for sex, so that those are the only values allowed, my current record for datasets entered by others using spreadsheets is 9 sexes). Then you can pick up more as you go along, but setting up the first database to enter data should only take you an hour or so to learn the basics. > > > -- > Gregory (Greg) L. Snow Ph.D. > Statistical Data Center > Intermountain Healthcare > greg.snow@... > 801.408.8111 > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RExcel has a data validation facility and also has data input forms to
facilitate data entry. On Tue, Oct 21, 2008 at 1:45 PM, Greg Snow <Greg.Snow@...> wrote: > Stephen, > > One of the big problems with spreadsheets (other than the column limit in some) is that the standard entry mode allows too much flexibility which does nothing to help you avoid data entry errors. The Webpage: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html has some examples of this going wrong, including one that happened to my group where the column for dates was not preformatted, the dates were entered using European format, and Excel did 2 different wrong things with them making it very difficult to do anything with the data without major extra work. If you are going to stick with a spreadsheet, then at a minimum you should start by naming all your columns, then formatting each column based on the type of data you expect to be entered there. > > Going the database route is not that much to learn to get started. You can use MSAccess, or the OpenOffice database, create a new table and enter the names of each column along with the data type (this is a big advantage in that it will not allow you to enter character data where numbers are expected, forces dates to look like dates, etc.). It is not that much extra work to enter valid levels for what will become factors (e.g. "Male" and "Female" for sex, so that those are the only values allowed, my current record for datasets entered by others using spreadsheets is 9 sexes). Then you can pick up more as you go along, but setting up the first database to enter data should only take you an hour or so to learn the basics. > > Another option is to just use R, the following code gives one approach that could get you started entering data: > > tmp <- rep( list(character(0), numeric(0)), c(2,5) ) > names(tmp) <- c( 'ID','Sex', paste('Stream', 1:5, sep='') ) > tmp <- as.data.frame(tmp) > levels(tmp$Sex) <- c("Female","Male") > tmp$ID <- as.character(tmp$ID) > > mydata <- edit(tmp) > > Hope this helps, > > -- > Gregory (Greg) L. Snow Ph.D. > Statistical Data Center > Intermountain Healthcare > greg.snow@... > 801.408.8111 > > >> -----Original Message----- >> From: r-help-bounces@... [mailto:r-help-bounces@r- >> project.org] On Behalf Of stephen sefick >> Sent: Monday, October 20, 2008 7:02 PM >> To: R Help >> Subject: Re: [R] Staging area for data before read into R >> >> Well, I am going to type in ever value because the data sheets are of >> counts of insects that I identified, so I should be okay with >> accuracy... I really just need something that allows for more than >> 256 columns as I have encounter over 256 species of insects in even >> small streams. I think calc with it's 1000ish columns will do the >> trick... thanks everbody for your help. >> >> On Mon, Oct 20, 2008 at 8:25 PM, Gabor Grothendieck >> <ggrothendieck@...> wrote: >> > There is a list of free spreadsheets with their row and column limits >> > at this link: >> > http://en.wikipedia.org/wiki/OpenOffice.org_Calc >> > >> > On Mon, Oct 20, 2008 at 3:13 PM, stephen sefick <ssefick@...> >> wrote: >> >> sorry excel 2003 with no immediate update in the future. >> >> >> >> On Mon, Oct 20, 2008 at 3:12 PM, Gabor Grothendieck >> >> <ggrothendieck@...> wrote: >> >>> You didn't say which version of Excel you are using but Excel 2007 >> >>> allows 16,384 columns. >> >>> >> >>> On Mon, Oct 20, 2008 at 2:27 PM, stephen sefick <ssefick@...> >> wrote: >> >>>> I am wondering if there is a better alternative than Excel for >> data >> >>>> storage that does not require database knowledge (I will >> eventually >> >>>> have to learn this, but it is not on my immediate todo list). I >> need >> >>>> something that is not limited to 256 columns... I don't need any >> of >> >>>> the built in functions in excel just a spreadsheet like program >> with >> >>>> cells that hold data in a data.frame format for a staging area >> before >> >>>> I get it into R. Any help would be greatly appreciated. This is >> not >> >>>> a direct r question, but all of you folks have more experience >> than I >> >>>> do and I am having a time finding what I need with google. >> >>>> thanks in advance >> >>>> >> >>>> -- >> >>>> Stephen Sefick >> >>>> Research Scientist >> >>>> Southeastern Natural Sciences Academy >> >>>> >> >>>> Let's not spend our time and resources thinking about things that >> are >> >>>> so little or so large that all they really do for us is puff us up >> and >> >>>> make us feel like gods. We are mammals, and have not exhausted >> the >> >>>> annoying little problems of being mammals. >> >>>> >> >>>> -K. >> Mullis >> >>>> >> >>>> ______________________________________________ >> >>>> R-help@... mailing list >> >>>> https://stat.ethz.ch/mailman/listinfo/r-help >> >>>> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> >>>> and provide commented, minimal, self-contained, reproducible code. >> >>>> >> >>> >> >> >> >> >> >> >> >> -- >> >> Stephen Sefick >> >> Research Scientist >> >> Southeastern Natural Sciences Academy >> >> >> >> Let's not spend our time and resources thinking about things that >> are >> >> so little or so large that all they really do for us is puff us up >> and >> >> make us feel like gods. We are mammals, and have not exhausted the >> >> annoying little problems of being mammals. >> >> >> >> -K. >> Mullis >> >> >> > >> >> >> >> -- >> Stephen Sefick >> Research Scientist >> Southeastern Natural Sciences Academy >> >> Let's not spend our time and resources thinking about things that are >> so little or so large that all they really do for us is puff us up and >> make us feel like gods. We are mammals, and have not exhausted the >> annoying little problems of being mammals. >> >> -K. >> Mullis >> >> ______________________________________________ >> R-help@... mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RThere are tradeoffs no matter what route you take.
I worked on a project a few years ago, repairing an MS Access DB that had been constructed, data entry forms and all, by one of the consulting engineers. They supported that development because they found that even with all the power and utility of Excel, in support of data entry, errors were still much too common, requiring significant time and therefore money to address. The more data you have, the more costly it is to find and repair the data when something goes awry. Their problem was that it was put together in haste by a man who knew nothing about RDBMS. He was learning as he went. While what he produced was adequate for a single site with no turnover in staff, and would have been fine if it was intended for his own use in his consulting practice, it would inevitably have broken the moment the service was extended to more than one site or the moment there was any turnover in staff at all. The client he delivered it to was a large mining company that wanted to deploy it on all their mines and ore processing facilities. Yes the error rate in data entry went way down, but the mistake was in trying to deliver a software product to a client without the input of an experienced software engineer. You can do validation in Access as you can in Excel, but Excel is not designed to manage data where Access is, and both are crippled by their dependance on VB (a seriouusly broken language: fine for scripting MS Office, but not what you want to develop a real application - not that the OP wants that anyway). I don't want to beat on Excel, as it is a useful tool when used for what it is designed for; and others have pointed out some hazards when using it. Dr. Snow is right in recommending going the route of using an RDBMS and in saying that it isn't that hard to get started. I'd be recommending PostgreSQL, though, since it is relatively easy to use, and it has pl/r (which lets you run R code within stored procedures in the DB) which carries obvious advantages. The bottom line is that the best option depends on your objectives and what you need to do. You can use Excel quite effectively if you are careful and know what you're doing. If you are going to manage data that will require significant effort to enter, you may want something a bit more robust and better designed to manage your data. If you are going to deliver services based on your software, you need a software engineer to ensure it doesn't break on your client (that could be quite costly). Since the OP is apparently using it for his own purposes, and unlikely to be selling the data or services based on it, the services of an engineer aren't needed, though they can be useful if there are concerns about administering the DB so as to guarantee the security of the data. I could tell you tales of how data that cost millions of dollars to collect were almost lost because a consultant was careless in this regard and made mistakes in handling the data. Fortunately, recovery was quick in these cases because my colleagues were diligent in maintaining backups. But you get the point. Murphy's law says that whatever can go wrong, will. There are plenty of options, and the OP will need to do what he's most confortable doing. If I were in his place, I'd say my data is sacred, and can not be replaced (just as you can't step into the same stream twice); and therefore I'd use a RDBMS to manage it, and the very moment it is all entered, I'd make a backup of both the data (e.g. in MySQL I'd use mysqldump) AND the software, and copy both backups to two CDs or DVDs. And, if the data were originally recorded on paper, I'd be scanning the pages and copying those images onto a couple CDs or DVDs also: with two copies on optical media, one copy can be stored in a fireproof vault while the other is in the office ready to be used should a HDD fail, or some other disaster interrupt my work. OK, so I'm paranoid about my data, but I'd rather go the extra mile than risk losing it. Cheers, Ted
|
|
|
Re: Staging area for data before read into ROn 22/10/2008, at 8:18 AM, Ted Byers wrote: <snip> > ... even with all the power and utility of Excel ... <snip> Is this some kind of joke? cheers, Rolf Turner ###################################################################### Attention:\ This e-mail message is privileged and confid...{{dropped:9}} ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into ROn Tue, Oct 21, 2008 at 3:18 PM, Ted Byers <r.ted.byers@...> wrote:
> There are tradeoffs no matter what route you take. > You can do validation in Access as you can in Excel, but Excel is not > designed to manage data where Access is, and both are crippled by their > dependance on VB (a seriouusly broken language: fine for scripting MS Excel can do validation without VB. For example, you can restrict data to a certain range of dates, limit choices by using a list, or make sure that only positive whole numbers are entered all without any VB. ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RI wasn't suggesting that the validation requires VB.
Creating forms and handling form events does (unless MS has introduced new utilities to hide all that since last I used it). Some of the most interesting things I have seen done with Excel did involve VB, and there are better tools to do most of those things.
|
|
|
Re: Staging area for data before read into RNo. Excel, like most spreadsheets, does what it designed for reasonably well. It is easy to find fault, but not so easy to satisfy all one's critics. There is no doubt that Excel has faults, but it provides significant modelling and analysis capability to users with no programming expertise or limited experience using IT.
I have used it as a teaching tool for very basic modelling to undergraduate students who would not have been able to do any modelling without it. In a one session course, there just isn't time to teach students enough programming in any language for them to have a hope of producing an interesting model. But they can produce an interesting model with some guidance using Excel. Similarly, they can do elementary data analysis entering their data into Excel and using it to analyse it. Excel was designed primarily for business people, and I have seen them use it effectively, doing things I don't fully understand (as I am not a businessman). But these same people would go into a catatonic state the moment a discussion becomes technical or mathematical. They describe Excel as powerful, and until I become an expert MBA type, I won't knock them for that. If they find it useful, why would I argue with them. Don't get me wrong, I do not normally use it, and for 99% of the work I do, it provides no value to me, so I do not have it installed on my own systems. I am better served by C++, Java, and the related tools specific to my work. But that it isn't useful to me, or apparently you, is not sufficient grounds to question its utility for others (neither is the existance of bugs, as ALL software has bugs: MS makes for an easy target, but I try to be as fair to them as I am to an independant developer who works alone - lets not have this degenerate into an attack on MS, please). As a software engineer myself, I won't knock the work of another just because what he's produced isn't particularly useful for me. I won't even knock him if I don't agree with the design decisions he's made. When that happens, it is likely I was not part of his intended market: nothing more can be implied.
|
|
|
Re: Staging area for data before read into RYou can create data entry forms without VB in Excel too.
On Tue, Oct 21, 2008 at 5:09 PM, Ted Byers <r.ted.byers@...> wrote: > > I wasn't suggesting that the validation requires VB. > > Creating forms and handling form events does (unless MS has introduced new > utilities to hide all that since last I used it). > > Some of the most interesting things I have seen done with Excel did involve > VB, and there are better tools to do most of those things. > > Gabor Grothendieck wrote: >> >> On Tue, Oct 21, 2008 at 3:18 PM, Ted Byers <r.ted.byers@...> wrote: >>> There are tradeoffs no matter what route you take. >>> You can do validation in Access as you can in Excel, but Excel is not >>> designed to manage data where Access is, and both are crippled by their >>> dependance on VB (a seriouusly broken language: fine for scripting MS >> >> Excel can do validation without VB. For example, you can restrict >> data to a certain range of dates, limit choices by using a list, or >> make sure that only positive whole numbers are entered all without >> any VB. >> >> ______________________________________________ >> R-help@... mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > > -- > View this message in context: http://www.nabble.com/Staging-area-for-data-before-read-into-R-tp20075962p20099445.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help@... mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
|
Re: Staging area for data before read into RAh, OK. That is new since I used Excel last.
Thanks On Tue, Oct 21, 2008 at 5:52 PM, Gabor Grothendieck <ggrothendieck@...> wrote: > You can create data entry forms without VB in Excel too. > > On Tue, Oct 21, 2008 at 5:09 PM, Ted Byers <r.ted.byers@...> wrote: >> >> I wasn't suggesting that the validation requires VB. >> >> Creating forms and handling form events does (unless MS has introduced new >> utilities to hide all that since last I used it). >> >> Some of the most interesting things I have seen done with Excel did involve >> VB, and there are better tools to do most of those things. >> >> Gabor Grothendieck wrote: >>> >>> On Tue, Oct 21, 2008 at 3:18 PM, Ted Byers <r.ted.byers@...> wrote: >>>> There are tradeoffs no matter what route you take. >>>> You can do validation in Access as you can in Excel, but Excel is not >>>> designed to manage data where Access is, and both are crippled by their >>>> dependance on VB (a seriouusly broken language: fine for scripting MS >>> >>> Excel can do validation without VB. For example, you can restrict >>> data to a certain range of dates, limit choices by using a list, or >>> make sure that only positive whole numbers are entered all without >>> any VB. >>> >>> ______________________________________________ >>> R-help@... mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> >> -- >> View this message in context: http://www.nabble.com/Staging-area-for-data-before-read-into-R-tp20075962p20099445.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help@... mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > ______________________________________________ R-help@... mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
| < Prev | 1 - 2 | Next > |
| Free embeddable forum powered by Nabble | Forum Help |