exporting data to R

AriSilva
Posts: 591
Joined: July 22nd, 2016, 3:55 pm

exporting data to R

Post by AriSilva »

Hi folks,
We are having problems exporting to R: the data is exported as ANSI, and we get errors when reading it in R.
See the messages below, produced when sourcing the exported .R files:
> source(paste0(pasta, "D_SELECIONADO.R"))
Error in substring(x, first, last) :
invalid multibyte string at '<c9>RIE'
> source(paste0(pasta, "E_TRABALHO.R"))
Error in substring(x, first, last) :
invalid multibyte string at '<e7>o D'
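
The escaped bytes in these messages, `<c9>` and `<e7>`, are what "É" and "ç" look like in a single-byte ANSI code page, and neither is valid UTF-8 on its own. The sketch below reproduces that mismatch in Python using only the byte fragments quoted in the errors. It assumes the export uses Windows-1252 (`cp1252`); the thread only says "ANSI", and the helper name `decode_report` is hypothetical.

```python
# Byte fragments copied from the R error messages above.
FRAGMENTS = [b"\xc9RIE", b"\xe7o D"]


def decode_report(raw: bytes) -> tuple[str, bool]:
    """Return the Windows-1252 reading of `raw` and whether `raw` is valid UTF-8.

    Assumption: "ANSI" in the export means code page 1252.
    """
    as_ansi = raw.decode("cp1252")
    try:
        raw.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        # 0xC9 and 0xE7 are UTF-8 lead bytes; an ASCII letter cannot follow them.
        valid_utf8 = False
    return as_ansi, valid_utf8


for fragment in FRAGMENTS:
    print(fragment, decode_report(fragment))
```

If that assumption holds, a reader expecting UTF-8 rejects the file at exactly those bytes. R's `source()` also takes an `encoding` argument, so declaring the file's encoding there may be enough; that was not tried in this thread.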

I've tried generating a batch program for the export and changing the set behavior() statement, for example:
set behavior() export (R, SubitemOnly, UTF8);
or
set behavior() export (R, SubitemOnly, UNICODE);
but it accepts only ANSI.
Best
Ari
Gregory Martin
Posts: 1777
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Re: exporting data to R

Post by Gregory Martin »

If you're using CSPro 7.7, can you use the R writer directly?

https://www.csprousers.org/help/CSPro/d ... ces.html#R

The help documentation isn't quite accurate on this, but you can write directly to an R data file (.Rdata), which means that you can skip the import altogether. Simply run a batch application with the input as your CSPro data file and the output as a .Rdata file.

If this isn't a viable option, can you send the source string that is causing the problem? I can look at why the old R writer isn't converting it to a dummy ANSI character, which is what it should do for non-ANSI characters.
AriSilva
Posts: 591
Joined: July 22nd, 2016, 3:55 pm

Re: exporting data to R

Post by AriSilva »

Hi Greg,
Unfortunately I cannot send you the example anymore, since I decided to write a snippet of code in CSPro that replaces the offending special characters with their unaccented counterparts (á to a, etc.).
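
The CSPro snippet itself was not posted. As an illustration of the same idea only, here is a minimal Python sketch that strips accents by decomposing each character and dropping the combining marks. The function name `strip_accents` is hypothetical, and this approach covers accented Latin letters such as á, É and ç, not every special character.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Replace accented letters with their base letters (á -> a, ç -> c)."""
    # NFD splits "á" into "a" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("São João"))
```

The trade-off is the one implied in the post: the labels become plain ASCII and safe in any encoding, but the original spelling is lost.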
As for writing to R directly, I'll try it later, but I think it will help only if I do not need to drop anything, that is, if I write the output .Rdata with all the variables.
Is there a way to filter the variables I do not need?
Best
Ari
Gregory Martin
Posts: 1777
Joined: December 5th, 2011, 11:27 pm
Location: Washington, DC

Re: exporting data to R

Post by Gregory Martin »

The R writer will write out everything in your dictionary, though you can specify that you would like to write out only one record by modifying the connection string. For example:

my_data.RData|record=REC_NAME_TO_EXPORT

If there are items that you don't want to write out, then you can create a copy of your dictionary, delete those items, and then use that copied dictionary for the batch program. If your input is text and the dictionary is in relative positioning mode, then switch it to absolute positioning before deleting the items so that the gaps in the dictionary do not accidentally get associated with other items. If your input is CSPro DB, then this does not matter.
AriSilva
Posts: 591
Joined: July 22nd, 2016, 3:55 pm

Re: exporting data to R

Post by AriSilva »

Hi Greg,
I've tried two things:
1. Generate the export logic for R and change the set behavior command, for example
set behavior() export (R, SubitemOnly, UNICODE); or
set behavior() export (R, SubitemOnly, UTF8);
but it does not accept either one, only ANSI, and for whatever reason my R does not read the ANSI command file; it always has to be converted to UTF-8 first, which is a pain.
2. Export automatically using a batch program, setting the output format to Rdata.
It worked halfway: the variables were created, but neither the value set labels nor the category labels were saved in the output Rdata file, which is no good for us.
There is also a difference between exporting a) with the utility and b) with the batch program: the utility converts the variable names to lower case, while the batch program keeps the upper case used in the original dcf dictionary.
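
The manual conversion mentioned in item 1 can be scripted. The following is a minimal Python sketch, not part of CSPro, that re-encodes one exported file to UTF-8. It assumes the ANSI output is Windows-1252 (`cp1252`); the function name `ansi_to_utf8` and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path


def ansi_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Re-encode `src` from an ANSI code page to UTF-8 and write it to `dst`.

    Assumption: the export is Windows-1252; pass another codec if it is not.
    """
    # Work on bytes so that line endings are preserved exactly.
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "example_ansi.R"   # hypothetical stand-in for an exported .R file
        dst = Path(tmp) / "example_utf8.R"
        src.write_bytes(b'label <- "S\xc9RIE"\r\n')
        ansi_to_utf8(src, dst)
        print(dst.read_text(encoding="utf-8"))
```

Run over every exported .R file before calling `source()`, this removes the manual step, though it does not change what the export itself writes.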

Is it possible to export to R using the utility and write both files (data and command) in UTF-8?
I'm attaching a small part of my input file (just 100 questionnaires), the utility and the batch program. I'm sorry, but you'll have to adjust the relative paths to make them work. If you need any other files that I forgot to include, just tell me.

By the way, a very odd thing is happening with batch programs: the very first time they run, they do not write all the variables, only the ones that have a proc. On all later runs they work well. Have you ever seen that, or is it my imagination? Is there a place in the user data where CSPro stores hidden files that could be causing this?

Best
Ari
Attachments
Exporting_R.zip
(247.26 KiB)