New Functions: getusername, randomin, randomizevs


CSPro 4.1.002 comes with a few new functions that may come in handy for your data processing needs. The function getusername returns the name under which the user logged onto Windows. This can be useful if you want to restrict access to a data entry program to only a few users. Imagine a control system for a data entry operation in which only supervisors have access to parts of the system. By giving the supervisors logins such as super1, super2, etc., you could restrict access as follows:

OPERATOR_NAME = getusername();

if pos("super",tolower(OPERATOR_NAME)) <> 1 then
    errmsg("You must be logged in as a supervisor on this machine to access this program.");
    stop(1);
endif;

OPERATOR_NUMBER = tonumber(OPERATOR_NAME[6]); // starting at position 6 will skip past "super"
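
The same check can be modelled outside CSPro. The following is a minimal Python sketch, not CSPro logic; the function name supervisor_number and its prefix parameter are hypothetical, and returning None for a non-supervisor login is my own choice for illustration:

```python
def supervisor_number(username, prefix="super"):
    """Return the number after the prefix, or None if this is not a supervisor login."""
    name = username.lower()                 # mirrors tolower(OPERATOR_NAME)
    if not name.startswith(prefix):         # mirrors pos("super", ...) <> 1
        return None
    suffix = name[len(prefix):]             # mirrors OPERATOR_NAME[6]
    return int(suffix) if suffix.isdecimal() else None


print(supervisor_number("Super2"))
print(supervisor_number("clerk1"))
```

A login that matches the prefix but carries no number, such as "super", also yields None in this sketch.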

The function randomin accepts either an in list or a value set as its argument and returns a value chosen at random from the listed values. This makes it easy to get a random value from a non-continuous range:

numeric randomElectionYear = randomin(2000,2004,2008,2012);

You can weight certain values by repeating them in the in list. For example, the following call will return 1 about three times as often as 2:

randomin(1,1,1,2);
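
To see why repetition works as a weight, here is a small Python simulation (a sketch, not CSPro logic). The helper random_in is a hypothetical stand-in that picks one of its arguments uniformly, and the seed and the number of draws are arbitrary:

```python
import random
from collections import Counter


def random_in(*values, rng=random):
    """Pick one of the listed values uniformly; repeated values are more likely."""
    return rng.choice(values)


rng = random.Random(2012)  # fixed seed so the run is repeatable
counts = Counter(random_in(1, 1, 1, 2, rng=rng) for _ in range(10_000))
print(counts[1] / counts[2])  # should be close to 3
```

Because each position in the list is equally likely, a value listed three times is drawn about three times as often as a value listed once.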

The function is probably most useful when used with a value set. For example, if you have a data file for a survey and you are planning on adding a question to the survey about religion, you may want to test out your edits for this variable. Your old data does not have this variable, so you would write a batch edit program and then add random religion data to a test data file, which you could pass through your edit programs:

PROC RELIGION

    RELIGION = randomin(RELIGION_VS1);

The randomizevs function is useful for data entry applications that use the extended controls introduced in CSPro 4.1. Sometimes you may want to ask questions and randomize the order of the possible responses to avoid bias due to the item positioning. In the preproc of the relevant question, you could execute the randomizevs function to present a different order each time:

PROC POLITICAL_PARTY

preproc

    randomizevs(PARTY_LIST_VSET,exclude(98,99));

In the above example, 98 might be a "None of the above" code and 99 might be a "Do not know" code. You do not want these codes randomized with the rest; you always want them at the bottom of the value set list, so you exclude them from the randomization.
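
The idea of shuffling everything except a few fixed codes can be sketched in Python (this is an illustration, not CSPro logic). The function randomize_codes is hypothetical, and I am assuming that excluded codes simply stay in the positions they already occupy:

```python
import random


def randomize_codes(codes, exclude=(), rng=random):
    """Shuffle the codes, leaving excluded codes in their original positions."""
    excluded = set(exclude)
    movable = [code for code in codes if code not in excluded]
    rng.shuffle(movable)
    shuffled = iter(movable)
    # Walk the original order: keep excluded codes in place, fill the rest from the shuffle.
    return [code if code in excluded else next(shuffled) for code in codes]


party_codes = [1, 2, 3, 4, 98, 99]
print(randomize_codes(party_codes, exclude=(98, 99)))
```

Since 98 and 99 start at the bottom of the list, they remain there no matter how the other codes are ordered.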

Adding Multiple Logic Files to an Application


This code looks like it will not compile correctly:

PROC GLOBAL

PROC APPLICATION_FF

preproc

    errmsg("Welcome to CSPro, this is version %s",versionNumber());

The compiler may complain that versionNumber is not a function that exists. However, starting in CSPro 4.1.002, you can attach multiple logic files to an application. In this case, if versionNumber is defined in a different file, the code will compile successfully.

When compiling your code, CSPro will load any additional logic files as if the text of these files were inserted in PROC GLOBAL. This means that only things that can go in PROC GLOBAL can be declared in an external logic file, but that includes variables, arrays, and user-defined functions that you use across many applications. In the above example, I might create a file, myfunctions.app, that contains code that I use frequently. It would look like this:

function alpha versionNumber()

    versionNumber = "4.1.002";

end;

There is no need to write PROC GLOBAL in this external logic file. To add the file to the application, I select File -> Add Files, and then select my External Logic File:

[Screenshot: the Add Files dialog]

These files are listed in the tree in the Files window under the main logic file for your application. External logic files can be removed by selecting File -> Drop Files.

Download the above example as a batch application.

Small Language Additions: Count, Seek, Sort


In CSPro 4.1.002 there are three small language changes and additions that may make writing code easier. First, the change:

numeric numChildren = count(POP_REC where RELATIONSHIP = 3); // old
numeric numChildren = count(RELATIONSHIP = 3); // new

In the past, when using the count function it was necessary to specify what record or group you wanted to search through (in the above case, POP_REC). Now you can write the code without specifying this clause.

The seek function, new to CSPro 4.1, searches a record for the first occurrence for which a condition is true. Now, in CSPro 4.1.002, you can search for the nth occurrence that satisfies the condition. For example, this loop will continue until there are no longer two spouses in a household:

do while seek(RELATIONSHIP = 2,@2)

    // change the second spouse's relationship

enddo;
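
The loop above can be modelled in Python to show how it terminates (a sketch, not CSPro logic). The seek function here is a hypothetical stand-in that I assume returns the 1-based occurrence number of the nth match, or 0 when there is none; the replacement code 6 is also made up for the example:

```python
def seek(occurrences, predicate, nth=1):
    """Return the 1-based position of the nth matching occurrence, or 0 if there is none."""
    seen = 0
    for index, value in enumerate(occurrences, start=1):
        if predicate(value):
            seen += 1
            if seen == nth:
                return index
    return 0


relationships = [1, 2, 3, 2, 2]  # head, spouse, child, spouse, spouse

# Keep recoding the second spouse until only one spouse remains.
while (second := seek(relationships, lambda r: r == 2, nth=2)):
    relationships[second - 1] = 6  # hypothetical "other relative" code

print(relationships)
```

Each pass removes one extra spouse, so the loop ends as soon as a second spouse can no longer be found.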

Finally, the sort function has long been a useful feature of the CSPro language, but now you can use it with a where statement. For example, a common request is to have a roster sorted based on relationships. This example sorts a roster by relationship, and then sorts the children in order by descending age:

sort(POP_REC_EDT using RELATIONSHIP);
sort(POP_REC_EDT using -AGE where RELATIONSHIP = 3);
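
The two-step sort can be illustrated in Python (a sketch, not CSPro logic). The helper sort_where is hypothetical, and it reflects my reading of the where clause: only the matching occurrences are reordered, within the slots they already occupy. The household data is invented for the example:

```python
def sort_where(rows, key, where=lambda row: True):
    """Sort in place only the rows that satisfy `where`, keeping other rows where they are."""
    positions = [i for i, row in enumerate(rows) if where(row)]
    ordered = sorted((rows[i] for i in positions), key=key)
    for i, row in zip(positions, ordered):
        rows[i] = row


household = [
    {"relationship": 3, "age": 4},
    {"relationship": 1, "age": 40},
    {"relationship": 3, "age": 12},
    {"relationship": 2, "age": 38},
    {"relationship": 3, "age": 9},
]

# First pass: order everyone by relationship.
sort_where(household, key=lambda p: p["relationship"])
# Second pass: order only the children, oldest first.
sort_where(household, key=lambda p: -p["age"], where=lambda p: p["relationship"] == 3)

for person in household:
    print(person)
```

After the first pass the children sit together at the end of the roster, so the second pass only rearranges them among themselves.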

Using Constants to Improve Code Readability


CSPro does not have a constant modifier that can be used to indicate that a variable cannot be modified, but the principle of using constant values is still one that is useful for CSPro users. For example, if your task is to write an edit program that replaces invalid sex values with a random selection of the previous 10 valid sex values, here are two ways you could code it.

Version 1

PROC GLOBAL

array sexHD(10);
numeric sexCounter;

PROC APPLICATION_FF

preproc

    sexHD(1) = 1;
    sexHD(2) = 2;
    sexHD(3) = 1;
    sexHD(4) = 2;
    sexHD(5) = 1;
    sexHD(6) = 2;
    sexHD(7) = 1;
    sexHD(8) = 2;
    sexHD(9) = 1;
    sexHD(10) = 2;

    sexCounter = 1;
    seed(systime());

PROC SEX

    if SEX in 1:2 then { valid sex }
        sexHD(sexCounter) = SEX;

        sexCounter = sexCounter + 1;

        if sexCounter > 10 then
            sexCounter = 1;
        endif;

    else { invalid sex, use the hotdeck }
        impute(SEX,sexHD(random(1:10)));

    endif;

Version 2

PROC GLOBAL

numeric sizeSexHD = 10;
array sexHD(sizeSexHD) = 1 2 ...;
numeric sexCounter = 1;

PROC APPLICATION_FF

preproc

    seed(systime());

PROC SEX

    if SEX in 1:2 then // valid sex
        sexHD(sexCounter) = SEX;

        inc(sexCounter);

        if sexCounter > sizeSexHD then
            sexCounter = 1;
        endif;

    else // invalid sex, use the hotdeck
        impute(SEX,sexHD(random(1:sizeSexHD)));

    endif;

The first version is the way you might have coded this application using older versions of CSPro. The second version uses features in CSPro 4.1. The second version shows how you can initialize the values for temporary variables and arrays while declaring them in PROC GLOBAL. It also shows the use of the inc (increment) function, as well as the new CSPro 4.1.002 feature which allows you to specify the size of your array by using a declared variable.

The advantages of the second version are twofold. First, the amount of code is reduced. Second, the code is much more dynamic. If, for instance, you are told that the edit should now choose between the previous 20 valid sex values, you only need to change the value of sizeSexHD. Changing this one value affects the size of the array, the number of times that the array is initialized with the alternating values of 1 and 2, the reset of the sexCounter increment variable, and the selection of values from the array.
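
The effect of driving everything from one size constant can also be shown in Python (a sketch of the hot deck idea, not CSPro logic). The class HotDeck and its method names are hypothetical:

```python
import random


class HotDeck:
    """Circular buffer of the most recent valid values, used to impute invalid ones."""

    def __init__(self, size, seed_values=(1, 2), rng=None):
        self.size = size
        # Fill the deck with alternating starting values, like "= 1 2 ...".
        self.values = [seed_values[i % len(seed_values)] for i in range(size)]
        self.next_slot = 0
        self.rng = rng or random.Random()

    def edit(self, value, valid=(1, 2)):
        """Store a valid value in the deck, or return a random donor for an invalid one."""
        if value in valid:
            self.values[self.next_slot] = value
            self.next_slot = (self.next_slot + 1) % self.size  # wrap around at the end
            return value
        return self.rng.choice(self.values)


deck = HotDeck(size=20)  # going from 10 to 20 donors is a one-value change
print(deck.edit(2))      # valid, stored in the deck
print(deck.edit(9))      # invalid, replaced by a value drawn from the deck
```

The size appears once, in the constructor call, and the initialization, the wrap-around, and the random selection all follow from it.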

Using constants when possible is a great idea for edits as well. Using arrays to store constant values makes code much clearer and makes it easy to edit the parameters under which your program runs. Note in the following example the use of a value set to define the size of the educationAges array. These constant values can now be accessed using the getdeck function.

//                                      Male    Female
array minAgeForChildbearing(P16_VS1) =  15      12;
array maxAgeForChildbearing(P16_VS1) =  99      49;

//                                      Min     Max
array educationAges(P21_VS1,2) =        0       99      // no schooling
                                        5       12      // class 1
                                        6       13      // class 2
                                        7       14      // class 3
                                        8       15      // class 4
                                        9       16      // class 5
                                        10      17      // class 6
                                        11      18      // class 7
                                        12      19      // class 8
                                        13      20      // class 9
                                        14      99      // class 10
                                        16      99      // bachelor's
                                        19      99      // master's
                                        20      99      // PhD
                                        21      99      // postdoc
                                        ;
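
An edit that consults such tables might look like the following Python sketch (not CSPro logic, and not the getdeck function). The values are copied from the arrays above, but the keys are assumptions: I use the labels "male" and "female" and education codes 0 through 14, since the actual codes in P16_VS1 and P21_VS1 are not shown. The helper names are hypothetical:

```python
# Limits copied from the arrays above; the keys are assumed for illustration.
MIN_AGE_FOR_CHILDBEARING = {"male": 15, "female": 12}
MAX_AGE_FOR_CHILDBEARING = {"male": 99, "female": 49}

# (min age, max age) per education level, indexed by an assumed code 0..14.
EDUCATION_AGES = [
    (0, 99),   # no schooling
    (5, 12),   # class 1
    (6, 13),   # class 2
    (7, 14),   # class 3
    (8, 15),   # class 4
    (9, 16),   # class 5
    (10, 17),  # class 6
    (11, 18),  # class 7
    (12, 19),  # class 8
    (13, 20),  # class 9
    (14, 99),  # class 10
    (16, 99),  # bachelor's
    (19, 99),  # master's
    (20, 99),  # PhD
    (21, 99),  # postdoc
]


def childbearing_age_ok(sex, age):
    """Check an age against the childbearing limits for the given sex."""
    return MIN_AGE_FOR_CHILDBEARING[sex] <= age <= MAX_AGE_FOR_CHILDBEARING[sex]


def education_age_ok(level, age):
    """Check an age against the limits for the given education level."""
    lowest, highest = EDUCATION_AGES[level]
    return lowest <= age <= highest


print(childbearing_age_ok("female", 50))
print(education_age_ok(1, 5))
```

Keeping the limits in one table means that a change to a threshold touches only the table, not the edit logic.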

CSPro 4.1.002 to be Released on Monday


CSPro 4.1.002 will be officially released on Monday. Over the next few days I will blog about some new features that exist in this minor upgrade. Any applications written for CSPro 4.1.001 will work on this new version, though some people who have defined an absolute value function named abs will have to remove it from their programs, as it is now a built-in function.

What are the next steps for CSPro? Unicode support, support for handheld devices (including Android devices), and RDBMS support are among the most requested features. If you have a feature that you would like the developers to consider, post about it in the comments.

Enjoy CSPro 4.1.002.