Check whether 9 string variables are identical over some 70 "respondents"

by Ruben van den Berg

Dear all,
 
I merged 9 data files with ADD FILES. However, to make sure that the variable labels are identical over the 9 files, I made a table with a single column of variable names and the corresponding variable labels for each of the 9 files (so 10 string variables in total). Since the original files had a set of some 70 variables in common, my 'variable label table' has some 70 lines. Ideally, all variable labels should be identical but on visual inspection I've already spotted some slight differences.
 
What I was thinking of is counting the number of distinct values within 'respondents' (i.e. rows) across my 9 string variables, in order to identify the variables whose labels differ between files. I thought about FLIPping the data and using OMS and FREQUENCIES, but I think FLIP doesn't work with strings.
 
Does anybody have an idea whether/how this is possible? I've Python installed but virtually no experience with it.
 
Thanks a lot!
 
Ruben van den Berg

Re: Check whether 9 string variables are identical over some 70 "respondents"

by Jon K Peck


The easiest way to get a table of variable labels across files would be to use the GATHERMD extension command.  You give it a file specification, and it reads all the files and collects variable names and labels.  (The original motivation was to catalog a lot of datasets).  From that, you could just do FREQUENCIES on the label column after filtering by the set of variable names of interest.
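
The tabulation step can be sketched in plain Python. This is a rough illustration only: it does not call GATHERMD, and the three-column layout (file, variable name, label), the sample rows, and the function name `label_frequencies` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical metadata rows: (file name, variable name, variable label).
# The real GATHERMD output may be laid out differently.
rows = [
    ("file1.sav", "age", "Age in years"),
    ("file2.sav", "age", "Age in years"),
    ("file3.sav", "age", "Age (years)"),
    ("file1.sav", "sex", "Sex of respondent"),
    ("file2.sav", "sex", "Sex of respondent"),
    ("file3.sav", "sex", "Sex of respondent"),
    ("file1.sav", "id", "Case number"),
]

def label_frequencies(rows, names_of_interest):
    """Tabulate how often each label occurs, per variable name of interest."""
    freq = defaultdict(Counter)
    for _file, name, label in rows:
        if name in names_of_interest:
            freq[name][label] += 1
    return dict(freq)

freq = label_frequencies(rows, {"age", "sex"})
for name in sorted(freq):
    # More than one distinct label means the files disagree on this variable.
    flag = "DIFFERS" if len(freq[name]) > 1 else "ok"
    print(name, flag, dict(freq[name]))
```

Any variable with more than one entry in its frequency table has labels that differ between files.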

This extension command will work with V17 or 18 and probably works with V16, too.  Of course it requires the Python plugin and the extension command, both of which can be downloaded from SPSS Developer Central, www.spss.com/DevCentral.

HTH,

Jon Peck
SPSS, an IBM Company
peck@...
312-651-3435

SOLVED: Check whether 9 string variables are identical over some 70 "respondents"

by Ruben van den Berg

Dear Jon,
 
GATHERMD is lovely and I'll surely use it more often. Especially the ability to get an overview of all SPSS files in a single folder is great!
 
But honestly, it didn't really solve the problem I posted. The easiest, if inelegant, solution was to save the entire file as XLS, transpose it in Excel, and reopen it in SPSS (essentially FLIP with string variables). Then I could use OMS -> FREQUENCIES -> AGGREGATE -> MATCH FILES to add the number of distinct string values to the original table. I realized that the structure of the original table (variable name and 9 labels in a single row) made comparing the labels much easier.
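
The same route can be sketched in plain Python, where transposing strings is not a problem. This is only an illustration under assumed data: the table contents and the names `table`, `flipped` and `n_distinct` are hypothetical, and only three label columns are shown instead of nine.

```python
# Hypothetical label table: one row per variable, holding the variable
# name followed by its label in each file (three files here, not nine).
table = [
    ["age", "Age in years", "Age in years", "Age (years)"],
    ["sex", "Sex", "Sex", "Sex"],
    ["inc", "Income", "income", "Income "],
]

# Step 1: "FLIP" the table so each variable becomes a column.
flipped = list(zip(*table))
names, label_rows = flipped[0], flipped[1:]

# Step 2: count the distinct labels in each column of the flipped table.
columns = zip(*label_rows)
n_distinct = {name: len(set(col)) for name, col in zip(names, columns)}

# Step 3: merge the counts back onto the original one-row-per-variable table.
merged = [row + [n_distinct[row[0]]] for row in table]

for row in merged:
    print(row)
```

In Python the flip is redundant, since `len(set(row[1:]))` gives the same count directly from each original row; it is kept here only to mirror the steps described.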
 
Kind regards!
 
Ruben van den Berg

Re: Check whether 9 string variables are identical over some 70 "respondents"

by Albert-Jan Roskam

Hi,

Just for fun, and in case you have a version < 16, I created the code below. It loops over all .sav files in a given directory and, for each specified variable, prints the list of unique variable labels as well as the number of unique labels. It's case-sensitive, and it will even nag about differences in leading and trailing blanks.

* sample code to generate some files.
begin program.
import spss, random
for fileno in range(20):
  suffix1, suffix2 = random.randint(0, 20), random.randint(20, 40)
  spss.Submit("""
data list free / respondent (a5) somevar (a10).
begin data
'blah' 'qwerty'
end data.
variable label respondent 'mylabel %02d' / somevar 'somelabel %02d'.
save outfile = 'd:/temp2/file_%02d.sav'.
new file.
""" % (suffix1, suffix2, fileno))
end program.

* actual code.
begin program.
import os, spss, spssaux
def func(var, path):
    # Collect the label of `var` from every .sav file found in `path`.
    savs = [os.path.join(path, sav) for sav in os.listdir(path) if sav.lower().endswith(".sav")]
    labels = []
    for sav in sorted(savs):
        spssaux.OpenDataFile(sav)
        for v in spssaux.VariableDict(var):
            labels.append(v.VariableLabel)
    # print() with a single pre-formatted string runs under Python 2 and 3.
    # var.upper() is used so the report still works if no label was collected.
    print("%s - there are %d unique variable labels out of a total of %d:" % (var.upper(), len(set(labels)), len(labels)))
    for unique_label in sorted(set(labels)):
        print("\t" + unique_label)
def checkvars(vars_to_be_checked, path="d:/temp"):
    for var in vars_to_be_checked:
        func(var, path)
        print("\n" + 70 * "*")
checkvars(vars_to_be_checked=["respondent", "somevar"], path="d:/temp2")
end program.
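
Because the check is case-sensitive and whitespace-sensitive, it can help to see which differences survive once case and blanks are ignored. The following is a minimal standalone sketch in plain Python, not part of the program above; the helper names `normalize` and `classify` and the sample labels are assumptions for illustration.

```python
def normalize(label):
    """Ignore case and surrounding blanks, and collapse runs of inner blanks."""
    return " ".join(label.split()).lower()

def classify(labels):
    """Return 'identical', 'cosmetic' (case/blanks only) or 'substantive'."""
    if len(set(labels)) == 1:
        return "identical"
    if len({normalize(label) for label in labels}) == 1:
        return "cosmetic"
    return "substantive"

# Hypothetical labels for one variable as found in three files.
print(classify(["Age in years", "Age in years", "Age in years"]))
print(classify(["Age in years", "age in years ", " Age  in years"]))
print(classify(["Age in years", "Age (years)", "Age in years"]))
```

Labels classed as cosmetic differ only in case or blanks, so they may need less attention than substantive differences.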


Cheers!!
Albert-Jan

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before you criticize someone, walk a mile in their shoes, that way
when you do criticize them, you're a mile away and you have their shoes!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@... (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

Re: Check whether 9 string variables are identical over some 70 "respondents"

by Ajay Ohri Decisionstats

Could we use some data visualization technique, since this is just some 70 rows?

need help with spss 17 for mac

by Sandra Sigmon

Hi,

Where do I get the patch for spss 17 (that fixed the recoding problem)?
Thanks,
Sandy

Sandra T. Sigmon, Ph.D.
Professor, Department of Psychology
Senior Scientist, Maine Institute of Human Genetics & Health
376 Little Hall
University of Maine, Orono, ME  04469
phone: 207-581-2049
fax: 207-581-6128

Re: need help with spss 17 for mac

by SPSS Support

Hi Sandra,
   The 17.0.2 patch resolved the problem of limits in the Recode into Same Variable dialog. You can find it on the Support web site at
http://support.spss.com . Once you enter the site, click the Statistics link at the left side of the page, then click the Patches link that appears under Statistics. You will see links for the 17.0.2 and earlier patches as well as for the 17.0.3 patch.
  The Resolution below my signature is also available at the support web site. Click the Knowledgebase Search link on the main page of the site. You can find this resolution by entering "patch" and "recode" (quotes not necessary) into the search terms box.

David Matheson
Statistical Support
SPSS, an IBM company

********************

Resolution number: 81717  Created on: Jan 26 2009  Last Reviewed on: Aug 7 2009

Problem Subject:  Problem with Transform and the Recode into same Variables selection

Problem Description:  In cleaning a data file, I wanted to Recode values that had been partially entered as lower-case into upper-case only, but found that the procedure "Recode into same Variable" has a bug - only 6 changed values can be added. This window should change to have scroll bars on the right but doesn't. It works perfectly in V16 and also in V17 when using "Recode into a different Variable". How can I get around this?

Resolution Subject: Issue resolved in 17.0.2 patch.

Resolution Description:
This problem has been corrected in version 17.0.2 patch. Please visit support.spss.com, register and log in to download patches. We apologize for any inconvenience.


Re: Check whether 9 string variables are identical over some 70 "respondents"

by Bruce Weaver

Hi Ruben.  If I understand, something like this might work for you.


DATASET DECLARE  vinfo.
OMS
  /SELECT TABLES
  /IF COMMANDS=['Sysfile Info'] SUBTYPES=['Variable Information']
  /DESTINATION FORMAT=SAV NUMBERED=FileNo  OUTFILE='vinfo'.

SYSFILE INFO 'C:\MyFolder\file1.sav'.
SYSFILE INFO 'C:\MyFolder\file2.sav'.
* etc .
SYSFILE INFO 'C:\MyFolder\file9.sav'.

OMSEND.

dataset activate vinfo window = front.

* Keep only the needed variables in VINFO .
* For now, I assume that is FileNo, Var1 and Label .

match files
 file = * /
 keep = FileNo Var1 Label .
exe.

* Now restructure to move all variable labels onto a single row .
sort cases by Var1 FileNo.
CASESTOVARS
  /ID=Var1
  /INDEX=FileNo
  /GROUPBY=VARIABLE
  /AUTOFIX = NO
.

If you set AUTOFIX to YES (the default), then in the event that all files have exactly the same variable labels, you'll end up having a single LABEL variable, not LABEL.1, LABEL.2, ... LABEL.9.
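
For readers more comfortable with Python, the restructuring step can be sketched as a plain long-to-wide pivot. This is an illustration only, not a replacement for CASESTOVARS: the sample rows are invented, and the column names FileNo, Var1 and Label simply follow the assumed layout of the OMS dataset above.

```python
# Hypothetical long-format rows as (FileNo, Var1, Label), mimicking the
# assumed layout of the OMS dataset; three files are shown instead of nine.
long_rows = [
    (1, "age", "Age in years"),
    (2, "age", "Age in years"),
    (3, "age", "Age (years)"),
    (1, "sex", "Sex"),
    (2, "sex", "Sex"),
    (3, "sex", "Sex"),
]

file_numbers = sorted({file_no for file_no, _var, _label in long_rows})

# Pivot: one entry per variable, one slot per file (None if the file
# does not contain that variable).
wide = {}
for file_no, var, label in long_rows:
    slots = wide.setdefault(var, dict.fromkeys(file_numbers))
    slots[file_no] = label

# Flag the variables whose labels are not identical across all files.
differing = sorted(
    var for var, slots in wide.items() if len(set(slots.values())) > 1
)

for var in sorted(wide):
    print(var, [wide[var][n] for n in file_numbers])
print("labels differ for:", differing)
```

Keeping one slot per file for every variable corresponds to the AUTOFIX = NO choice: the wide layout stays the same even when all labels happen to agree.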

--
Bruce Weaver
bweaver@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."

NOTE:  My Hotmail account is not monitored regularly.  
To send me an e-mail, please use the address shown above.