addition in dictionary item caused issue in the downloaded data

Posted: September 9th, 2020, 12:36 pm
by asif
Hi there,
I added a 4-digit numeric field to my dictionary while my CAPI application
was already running in the field, and I updated the application on the server.
A few interviewers didn't update the application and conducted a few
interviews. Now when I download the data, it is shuffled: the name appears
in the next field (address), the address appears in the field after that,
and so on. Can anybody suggest how to correct this issue?

This is a bit urgent.

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 9th, 2020, 2:31 pm
by josh
Unfortunately once the data is mixed up like this it is very difficult to fix it. You need to separate out the cases using the old dictionary and the ones using the new dictionary into two separate data files. You can do this with a batch edit application. Then use the Reformat Data tool to reformat the data that uses the old dictionary to match the new dictionary. Finally combine the data that was captured with the new dictionary with the reformatted data to form the complete data file.

In the future, NEVER add a new field in the middle of a record or change the size of an existing field once you have started collecting data. Instead, add new fields at the END of the record and you won't have this issue. Also with CSPro 7.5/CSWeb 7.5 this will no longer be a problem.
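
To illustrate why the position of a new field matters, here is a minimal Python sketch. It assumes each record is fixed-width text read by position, which is what the advice above implies. The layouts, item names (HH_ID, NAME, ADDRESS, NEW_ITEM), and widths are hypothetical, and this is not how CSPro itself is implemented.

```python
# Hypothetical layouts: (item name, width) in record order.
OLD_LAYOUT = [("HH_ID", 3), ("NAME", 10), ("ADDRESS", 12)]
# The new dictionary inserts a 4-digit item in the middle of the record.
NEW_LAYOUT = [("HH_ID", 3), ("NEW_ITEM", 4), ("NAME", 10), ("ADDRESS", 12)]
# Safer alternative: the new item is appended at the end of the record.
APPENDED_LAYOUT = OLD_LAYOUT + [("NEW_ITEM", 4)]


def write_record(values, layout):
    """Pack values into one fixed-width line, left-justified and space-padded."""
    return "".join(
        str(values.get(name, "")).ljust(width)[:width] for name, width in layout
    )


def read_record(line, layout):
    """Slice a fixed-width line by cumulative offsets and strip the padding."""
    fields, pos = {}, 0
    for name, width in layout:
        fields[name] = line[pos:pos + width].strip()
        pos += width
    return fields


# A case captured by an interviewer still running the old application.
old_line = write_record(
    {"HH_ID": "001", "NAME": "Asif", "ADDRESS": "12 Main St"}, OLD_LAYOUT
)

shifted = read_record(old_line, NEW_LAYOUT)    # item inserted in the middle
safe = read_record(old_line, APPENDED_LAYOUT)  # item appended at the end

print(shifted)  # every item after HH_ID is read from the wrong position
print(safe)     # existing items are intact, NEW_ITEM is simply blank
```

With the item inserted in the middle, the first four characters of the name are read as NEW_ITEM and everything after it slides over. With the item appended, the old record still reads correctly.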

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 10th, 2020, 2:33 am
by asif
Thank you josh for your reply. I swear I will not add anything in the middle in future :)
I need more info about the steps below, with step-by-step details if possible:
  • How to separate the two data files
  • How to reformat the data
How can I get the two separate data files into SPSS? It is easier for me to combine them later in SPSS.

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 10th, 2020, 8:10 am
by josh
If you are good at SPSS then it may be easier to fix the data in SPSS as this is not easy to do in CSPro.

To separate the files in CSPro, create a batch edit application. In that application you need to find a field (or multiple fields) that has an invalid value in the cases that were captured using the old dictionary but a valid value in the cases collected with the new dictionary. Then you can use the value of that field to determine which output file to write the case to. You can use the setoutput command (https://www.csprousers.org/help/CSPro/s ... ction.html) in the batch application to change the output file for each case. So you would have logic like:

if SOME_FIELD in 1:10 then
  setoutput("goodcases.csdb");
else
  setoutput("problemcases.csdb");
endif;

In the above example the idea is that in the data collected with the new dictionary the value of SOME_FIELD will always be in the range 1 to 10 but in the cases that were collected with the old dictionary the value of SOME_FIELD will be some other value. The field and the condition you use will depend on your data.
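
For readers who want to see the whole split, reformat, and combine sequence in one place, here is a rough Python sketch of the same idea on plain fixed-width text lines. It is an illustration only: a .csdb file is not a plain text file, so this does not replace the batch edit application or the Reformat Data tool. The layouts, the item names (HH_ID, NEW_ITEM, NAME, REGION), and the REGION check standing in for SOME_FIELD are all hypothetical.

```python
# Hypothetical layouts: (item name, width). NEW_ITEM is the item added later.
OLD_LAYOUT = [("HH_ID", 3), ("NAME", 10), ("REGION", 2)]
NEW_LAYOUT = [("HH_ID", 3), ("NEW_ITEM", 4), ("NAME", 10), ("REGION", 2)]


def read_record(line, layout):
    """Slice a fixed-width line into raw (unstripped) item values."""
    fields, pos = {}, 0
    for name, width in layout:
        fields[name] = line[pos:pos + width]
        pos += width
    return fields


def write_record(fields, layout):
    """Pack items into a fixed-width line; missing items are written as blanks."""
    return "".join(
        fields.get(name, "").ljust(width)[:width] for name, width in layout
    )


def is_new_format(line):
    """Stand-in for `SOME_FIELD in 1:10`: REGION must be 1..10 under the new layout."""
    region = read_record(line, NEW_LAYOUT)["REGION"].strip()
    return region.isdigit() and 1 <= int(region) <= 10


def repair(lines):
    """Split the cases, reformat the old-layout ones, and combine the results."""
    good = [ln for ln in lines if is_new_format(ln)]
    problem = [ln for ln in lines if not is_new_format(ln)]
    reformatted = [
        write_record(read_record(ln, OLD_LAYOUT), NEW_LAYOUT) for ln in problem
    ]
    return good + reformatted


mixed = [
    # Captured with the new dictionary.
    write_record(
        {"HH_ID": "001", "NEW_ITEM": "1234", "NAME": "Asif", "REGION": "07"},
        NEW_LAYOUT,
    ),
    # Captured with the old dictionary (no NEW_ITEM in the record).
    write_record({"HH_ID": "002", "NAME": "Josh", "REGION": "03"}, OLD_LAYOUT),
]

for line in repair(mixed):
    print(read_record(line, NEW_LAYOUT))
```

The check only works if no old-format case happens to produce a valid value at the new position, so it is worth inspecting both output files before combining them.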

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 10th, 2020, 8:54 am
by asif
Thank you josh.
I have just deleted the field that I had added from the dictionary; it was a 4-digit field, the first field in the first record. Then I created a new data file (.csdb) using the batch edit application. I thought the new data file would be correct, but it's not. So how can I reformat it now, assuming that field is no longer required?
What are the exact steps to reformat using the Reformat Data tool in this scenario?

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 10th, 2020, 10:05 am
by asif
Hi Josh,
In addition to the above: I have separated the two files. What should I do next? Here is what I did. I put the bad data file in a separate folder and put the dictionary (with the additional field) in the same folder. I made another copy of the dictionary, deleted the additional field from it, and ran the Reformat Data tool. For the input I provided the dictionary with the additional field and the bad data file, and for the output I provided the dictionary copy from which I had deleted the additional field. But when I ran it and checked the new data file, there was no change :(
Please suggest what I did wrong.

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 10th, 2020, 11:15 am
by josh
That sounds like the right approach. Maybe there were other changes to the dictionary, or your criteria for separating the data were incorrect.

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 11th, 2020, 12:08 am
by asif
The files are correctly separated; I can see the bad and clean records in their respective files. There are no other changes in the dictionary; exactly 4 digits/characters are shifted into the previous field. Is there any other approach to resolve this issue?

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 11th, 2020, 8:13 am
by josh
That is the approach. You can send the data files and dictionaries to cspro@lists.census.gov and we can take a look. Please let us know which item is the new one that you added.

Re: addition in dictionary item caused issue in the downloaded data

Posted: September 16th, 2020, 4:42 am
by asif
Hi,
As instructed, I have sent the files.
In addition, what is the right approach if you are conducting a survey in waves, with one wave each month? Right now I make a copy of the CSPro CAPI application with the Save As command in a new folder and give it a new name. I save the dictionary under a new name, and likewise the data file. I then delete the old wave's data and application from the server, deploy the new wave to the server, and ask all interviewers to install the new app. There are two things I am unable to understand. First, after deploying, the old dictionary name still shows on CSWeb. Second, the case counts are very confusing: right now it shows 107 cases, but if I download the data, Data Viewer shows a strange figure (1242 cases, 8 deleted cases), and if I export it to SPSS, it shows just 69.

Can you please explain thoroughly how to handle waves in CSPro?