Validating Text Fields with Regular Expressions


There are many ways to format the text data you collect in a CSPro application. For example, in the United States a telephone number is commonly written as xxx-xxx-xxxx or (xxx) xxx-xxxx. If the field is a plain text field with no validation, the interviewer could enter either format. Mixed formats create extra cleanup work after data collection, so as the application developer you will want to accept a single format.

You can enforce a single format with the regexmatch function, which was introduced in CSPro 7.2. The function takes two strings, the target and a regular expression, and returns 1 if the target matches the regular expression and 0 if it does not. In this example, the target string is the telephone number and the regular expression describes the valid form of the telephone number.

Regular expressions have their own syntax, separate from CSPro logic. To help write your regular expression, you can use any regular expression editor that supports the ECMAScript (JavaScript) engine, also called a flavor.
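
As a rough illustration of what an ECMAScript engine does with a target string and a pattern, the sketch below wraps JavaScript's built-in RegExp object in a small helper. It is written in TypeScript rather than CSPro logic, and the helper name regexMatchSketch is hypothetical: it only approximates the two-string, 1-or-0 shape of regexmatch described above and is not the CSPro implementation.

```typescript
// Hypothetical stand-in for experimenting outside CSPro; not the CSPro function.
// Returns 1 when the pattern is found in the target and 0 otherwise.
function regexMatchSketch(target: string, pattern: string): number {
    // RegExp is the ECMAScript regular expression engine built into JavaScript.
    const regex = new RegExp(pattern);
    return regex.test(target) ? 1 : 0;
}

// A deliberately simple pattern: one or more digits and nothing else.
const digitsOnly = "^[0-9]+$";

console.log(regexMatchSketch("12345", digitsOnly)); // 1
console.log(regexMatchSketch("12a45", digitsOnly)); // 0
```

Because RegExp.test looks for the pattern anywhere in the target, the ^ and $ anchors are what make this sketch check the whole string.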

Writing a Regular Expression

Let us write a regular expression that describes a telephone number in the following format: xxx-xxx-xxxx. We will use the online regular expression editor regex101; make sure to select ECMAScript as the flavor. Start by typing the phone number 123-456-7890 into the test string field. As you write the regular expression, the part of the test string that it matches so far is highlighted.

Step 1

Begin your regular expression with a caret (^), which asserts the position at the start of the line. This keeps your phone number from matching something like otherData123-456-7890. The expression so far is ^.

Step 2

The first character is any digit from 0 to 9, written as the character class [0-9]. The expression so far is ^[0-9].

Step 3

The following two characters are also digits from 0 to 9. Rather than repeating the character class, add the quantifier {3} to signal that it repeats three times. The expression so far is ^[0-9]{3}.

Step 4

The next character is a hyphen and nothing else, so enter the literal hyphen character. The expression so far is ^[0-9]{3}-.

Step 5

Notice that the pattern of the next four characters is the same as the previous four. Wrap everything but the caret in parentheses to create a capture group, and add the quantifier {2} to signal that the group repeats two times. The expression so far is ^([0-9]{3}-){2}.
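
To see that the group with its quantifier behaves the same as writing the four-character pattern out twice, the following TypeScript sketch compares the two spellings on a few sample strings. The variable names and sample values are illustrative only.

```typescript
// Two spellings of "three digits and a hyphen, twice", anchored at the start.
const grouped: RegExp = /^([0-9]{3}-){2}/;
const spelledOut: RegExp = /^[0-9]{3}-[0-9]{3}-/;

const samples: string[] = ["123-456-7890", "123-4567890", "12-345-6789"];

// Record, for each sample, whether each spelling matched.
const comparison = samples.map((sample) => ({
    sample,
    groupedMatches: grouped.test(sample),
    spelledOutMatches: spelledOut.test(sample),
}));

for (const row of comparison) {
    console.log(row.sample, row.groupedMatches, row.spelledOutMatches);
}
```

Both spellings accept only the first sample; the other two are missing a hyphen or a digit in the first eight characters.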

Step 6

The last four characters are digits from 0 to 9. Add the character class followed by the quantifier {4} to signal that it repeats four times. The expression so far is ^([0-9]{3}-){2}[0-9]{4}.

Step 7

Finally, end your regular expression with a dollar sign ($), which asserts the position at the end of the line. This keeps your phone number from matching something like 123-456-7890otherData. The finished expression is ^([0-9]{3}-){2}[0-9]{4}$.
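
The effect of the two anchors can also be checked outside regex101. This TypeScript sketch runs the anchored pattern, and an unanchored variant added here purely for contrast, against the example strings from steps 1 and 7. The variable names are illustrative.

```typescript
// The pattern with both anchors, plus a variant without ^ and $ for contrast.
const anchored: RegExp = /^([0-9]{3}-){2}[0-9]{4}$/;
const unanchored: RegExp = /([0-9]{3}-){2}[0-9]{4}/;

const candidates: string[] = [
    "123-456-7890",
    "otherData123-456-7890",
    "123-456-7890otherData",
];

for (const candidate of candidates) {
    // Without anchors, test() succeeds if the pattern occurs anywhere in the string.
    console.log(candidate, anchored.test(candidate), unanchored.test(candidate));
}
```

Only the first string passes the anchored pattern, while all three pass the unanchored one, which is why both anchors are needed.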

Validating a Text Field

With your regular expression in hand, you are ready to validate the telephone number in CSPro. Call regexmatch, passing in the telephone number and the regular expression. If it returns 0, display an error message and reenter the field so that the interviewer can correct the telephone number. If it returns 1, do nothing and let the interview continue.

PROC TELEPHONE_NUMBER

postproc

    if regexmatch(TELEPHONE_NUMBER, "^([0-9]{3}-){2}[0-9]{4}$") = 0 then
        errmsg("Invalid format! Use the following format: xxx-xxx-xxxx.");
        reenter;
    endif;
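
The same check-and-reject flow can be prototyped outside CSPro before wiring it into an application. The TypeScript sketch below is an analogue under stated assumptions, not CSPro logic: validateTelephoneNumber is a hypothetical helper that returns the error message when the value should be re-entered and null when the interview can continue.

```typescript
const TELEPHONE_PATTERN: RegExp = /^([0-9]{3}-){2}[0-9]{4}$/;

// Hypothetical helper: returns an error message for a bad value, or null when valid.
function validateTelephoneNumber(value: string): string | null {
    if (!TELEPHONE_PATTERN.test(value)) {
        return "Invalid format! Use the following format: xxx-xxx-xxxx.";
    }
    return null;
}

console.log(validateTelephoneNumber("123-456-7890"));   // null
console.log(validateTelephoneNumber("(123) 456-7890")); // the error message
```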

To see a working example, download the regexmatch application.

