{"id":126,"date":"2018-04-23T05:29:10","date_gmt":"2017-04-23T06:27:57","guid":{"rendered":""},"modified":"2022-03-22T09:01:48","modified_gmt":"2022-03-22T07:01:48","slug":"spss-batch-process-files-with-python","status":"publish","type":"post","link":"https:\/\/www.spss-tutorials.com\/spss-batch-process-files-with-python\/","title":{"rendered":"SPSS &#8211; Batch Process Files with Python"},"content":{"rendered":"<!--body-->\n\n<p>Running syntax over several <a href=\"https:\/\/www.spss-tutorials.com\/spss-what-is-it\/\">SPSS<\/a> data files in one go is fairly easy. If we use <a href=\"https:\/\/www.spss-tutorials.com\/python-for-spss-what-is-it\/\">SPSS with Python<\/a> we don't even have to type in the file names. The Python os (for <strong>o<\/strong>perating <strong>s<\/strong>ystem) module will do it for us.<br>\nTry it for yourself by downloading <a href=\"https:\/\/www.spss-tutorials.com\/downloads\/spssfiles.zip\">spssfiles.zip<\/a>. Unzip these files into <code>d:\\spssfiles<\/code> as shown below and you're good to go.<\/p>\n\n<span class = \"img w720\"> \n    <img src='https:\/\/spss-tutorials.com\/img\/spss-data-and-syntax-files-in-folder.png' alt = \"SPSS Data And Syntax Files In Folder\"> \n<\/span> \n \n \n \n<h2>Find All Files and Folders in Root Directory<\/h2>\n\n<p>The syntax below creates a <a href=\"https:\/\/www.spss-tutorials.com\/overview-python-object-types\/#list\">Python list<\/a> of files and folders in <code>rDir<\/code>, our root directory. 
Prefixing the path with an r as in <code>r'D:\\spssfiles'<\/code> makes it a raw string, so the <a href=\"https:\/\/www.spss-tutorials.com\/overview-python-operators\/#backslash\">backslash<\/a> is taken literally instead of starting an escape sequence.<\/p>\n\n<div class='code'><strong>*Find all files and folders in root directory.<br><\/strong><br>begin program.<br>import os<br>rDir = r&#39;D:\\spssfiles&#39;<br>print os.listdir(rDir)<br>end program.<\/div><!--class='code'-->\n\n<h2>Result<\/h2>\n\n<span class = \"img w720\"> \n    <img src='https:\/\/spss-tutorials.com\/img\/python-list-of-all-files-in-folder.png' alt = \"Python List Of All Files In Folder\"> \n<\/span> \n \n\n \n<h2>Filter Out All .Sav Files<\/h2>\n \n<p>As we see, <code>os.listdir()<\/code> creates a list of all files and folders in <code>rDir<\/code> but we only want SPSS data files. To filter them out, we first create an empty list with <code>savs = []<\/code>. Next, we'll add each file to this list if it <code>endswith(\".sav\")<\/code>.<\/p>\n \n<div class='code'><strong>*Add all .sav (SPSS data) files to Python list.<br><\/strong><br>begin program.<br>import os<br>rDir = r&#39;D:\\spssfiles&#39;<br>savs = []<br>for fil in os.listdir(rDir):<br>&nbsp;&nbsp;&nbsp;&nbsp;if fil.endswith(&#34;.sav&#34;):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;savs.append(fil)<br>print savs<br>end program.<\/div><!--class='code'--> \n \n<h2>Using Full Paths for SPSS Files<\/h2>\n\n<p>To do anything with our data files, we first need to open them. For that, SPSS needs to know in which folder they are located. 
We could simply set a default directory in SPSS with CD as in\n<span class='code'>CD \"d:\\spssfiles\".<\/span>\nHowever, having Python create full paths to our files with <code>os.path.join()<\/code> is a more foolproof approach.\n<\/p>\n\n<div class='code'><strong>*Create full paths to all .sav files.<br><\/strong><br>begin program.<br>import os<br>rDir = r&#39;D:\\spssfiles&#39;<br>savs = []<br>for fil in os.listdir(rDir):<br>&nbsp;&nbsp;&nbsp;&nbsp;if fil.endswith(&#34;.sav&#34;):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;savs.append(os.path.join(rDir,fil))<br>for sav in savs:<br>&nbsp;&nbsp;&nbsp;&nbsp;print sav<br>end program.<\/div><!--class='code'-->\n<h2>Result<\/h2>\n  \n<span class = \"img w720\"> \n    <img src='https:\/\/spss-tutorials.com\/img\/spss-full-paths-to-sav-files-in-output.png' alt = \"SPSS Full Paths To Sav Files In Output\"> \n<\/span> \n \n\n<h2>Have SPSS Open Each Data File<\/h2>\n\n<p>Generally, we open a data file in SPSS with something like\n<span class='code'>GET FILE \"d:\\spssfiles\\mydata.sav\".<\/span>\nIf we replace the file name with each of the paths in our Python list, we'll open each data file, one by one. We could then add some syntax we'd like to run on each file. Finally, we could save our edits with\n<span class='code'>SAVE OUTFILE \"...\".<\/span>\nand that'll batch process multiple files. 
In this example, however, we'll simply look up which variables each file contains with <code>spssaux.GetVariableNamesList()<\/code>.<\/p>\n\n<div class='code'><strong>*Open all SPSS data files and print the variables they contain.<br><\/strong><br>begin program.<br>import os,spss,spssaux<br>rDir = r&#39;D:\\spssfiles&#39;<br>savs = []<br>for fil in os.listdir(rDir):<br>&nbsp;&nbsp;&nbsp;&nbsp;if fil.endswith(&#34;.sav&#34;):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;savs.append(os.path.join(rDir,fil))<br>for sav in savs:<br>&nbsp;&nbsp;&nbsp;&nbsp;spss.Submit(&#34;GET FILE &#39;%s&#39;.&#34;%sav)<br>&nbsp;&nbsp;&nbsp;&nbsp;print sav,spssaux.GetVariableNamesList()<br>end program.<\/div><!--class='code'--> \n\n<h2>Result<\/h2>\n \n<span class = \"img w720\"> \n    <img src='https:\/\/spss-tutorials.com\/img\/spss-python-find-file-names-and-variables-in-files.png' alt = \"SPSS File Names And Variable Names With Python\"> \n<\/span> \n \n \n \n<h2>Inspect which Files Contain &ldquo;Salary&rdquo;<\/h2>\n\n<p>Now suppose we'd like to know which of our files contain some variable &ldquo;salary&rdquo;. 
We'll simply check whether it's present in our variable names list and, if so, print the name of the data file.<\/p>\n \n<div class='code'><strong>*Report all .sav files that contain a variable &#34;salary&#34; (case sensitive).<br><\/strong><br>begin program.<br>import os,spss,spssaux<br>rDir = r&#39;D:\\spssfiles&#39;<br>findVar = &#39;salary&#39;<br>savs = []<br>for fil in os.listdir(rDir):<br>&nbsp;&nbsp;&nbsp;&nbsp;if fil.endswith(&#34;.sav&#34;):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;savs.append(os.path.join(rDir,fil))<br>for sav in savs:<br>&nbsp;&nbsp;&nbsp;&nbsp;spss.Submit(&#34;get file &#39;%s&#39;.&#34;%sav)<br>&nbsp;&nbsp;&nbsp;&nbsp;if findVar in spssaux.GetVariableNamesList():<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print sav<br>end program.<\/div><!--class='code'-->\n\n\n<h2>Result<\/h2>\n \n<span class = \"img w720\"> \n    <img src='https:\/\/spss-tutorials.com\/img\/spss-find-variable-across-files-with-python.png' alt = \"SPSS Find Variable Across Files With Python\"> \n<\/span> \n \n \n<h2>Circumvent Python&rsquo;s Case Sensitivity<\/h2>\n\n<p>There's one more point I'd like to cover: since we search for &ldquo;salary&rdquo;, Python won't detect &ldquo;Salary&rdquo; or &ldquo;SALARY&rdquo; because its string comparisons are case sensitive. If you don't like that, the simple solution is to convert all variable names for all files to <code>lower()<\/code>case.<br>\nA basic way to <strong>change all items in a Python list<\/strong> is\n<span class='code'>[i... for i in list]<\/span>\nwhere <code>i...<\/code> is a modified version of <code>i<\/code>, in our case <code>i.lower()<\/code>. 
This technique is known as a Python list comprehension and the syntax below uses it to lowercase all variable names (line 13).<\/p>\n\n\n\n<div class='code'><strong>*Report all .sav files that contain a variable &#34;salary&#34; (case insensitive).<br><\/strong><br>begin program.<br>import os,spss,spssaux<br>rDir = r&#39;D:\\spssfiles&#39;<br>findVar = &#39;salary&#39;<br>savs = []<br>for fil in os.listdir(rDir):<br>&nbsp;&nbsp;&nbsp;&nbsp;if fil.endswith(&#34;.sav&#34;):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;savs.append(os.path.join(rDir,fil))<br>for sav in savs:<br>&nbsp;&nbsp;&nbsp;&nbsp;spss.Submit(&#34;get file &#39;%s&#39;.&#34;%sav)<br>&nbsp;&nbsp;&nbsp;&nbsp;if findVar.lower() in [varNam.lower() for varNam in spssaux.GetVariableNamesList()]:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;print sav<br>end program.<\/div><!--class='code'-->\n<p>Note: since I usually avoid all uppercasing in SPSS variable names, the result is identical to our case sensitive search.<br><br>\nThanks for reading.<\/p>\n\n\n\n","protected":false},"excerpt":{"rendered":"<p>Running syntax on several SPSS data files in one go is fairly easy with Python. 
This tutorial walks you through.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[290],"tags":[],"class_list":["post-126","post","type-post","status-publish","format-standard","hentry","category-spss-python-ii-examples"],"_links":{"self":[{"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/posts\/126","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/comments?post=126"}],"version-history":[{"count":0,"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/posts\/126\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/media?parent=126"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/categories?post=126"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.spss-tutorials.com\/wp-json\/wp\/v2\/tags?post=126"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}