Impute Function

Format
d = impute(item_name, new_value)
    ʃtitle(frequency_table_title)ʅ
    ʃvalueset(valueset_name)ʅ
    ʃspecificʅ
    ʃstat(ʃitem_name1, ..., item_nameNʅ)ʅ;
Description
The impute function assigns a new value to an item. The item_name is a dictionary item, either numeric or alphanumeric, and new_value is an expression that matches the type of the item. The function is similar to using the assignment operator:
item_name = new_value;
However, unlike the assignment operator, the impute function keeps track of these assignments and generates a report on the frequency of values used in the imputations. These imputation statistics are useful when cleaning data in a batch application. If your program contains any impute statements, the results of this function will be written to a frequencies file. The default file extension is .impute_freq.lst, but you can use whatever extension you prefer.
The function has several optional arguments:
Specify a frequency title (title): If you supply a string expression as frequency_table_title, that title will be used when the frequency writer creates the imputation frequencies. If no title is specified, a default title such as "Imputed Item SEX: Sex" will be used.
Specify a frequency value set (valueset): By default, when the frequency writer creates the imputation frequencies, it shows each value imputed and it looks at the item's primary value set, if one exists, to find a label that matches the value. If you would prefer to use a different value set when creating the imputation frequencies, you can specify a valueset_name that belongs to the item.
Create a specific frequency table (specific): Typically, if you have multiple impute statements for one item (with the same valueset setting), only one frequency table will be written, with the frequencies for all imputations combined. Even if differing titles are specified, one table will be written, with the title coming from the last executed imputation. If you would like a frequencies table for a particular imputation statement, you can use the specific command to indicate that a frequency table should be created for that imputation.
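As a rough sketch of how the title and valueset arguments might be combined, the following assumes a dictionary with an AGE item that has a secondary value set named AGE_VS2. The value set name, the imputed value, and the title text are illustrative assumptions rather than part of any particular application:

```cspro
PROC AGE

    if not invalueset(AGE) then

        // AGE_VS2 is a hypothetical value set belonging to AGE; its labels,
        // rather than those of the primary value set, would be used in the
        // imputation frequencies
        // (imputing a constant is not a good imputation but keeps the example simple)
        impute(AGE, 30)
            title("Age imputed for out-of-range values")
            valueset(AGE_VS2);

    endif;
```

Because neither impute statement option changes the assignment itself, AGE receives the value 30 exactly as it would with a plain assignment; only the labeling of the frequency table differs.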
Impute Stat Data
Create a data file with frequency details (stat): If you want more details about imputations beyond the frequencies showing the imputed values, you can use the stat command to generate a data file that will contain information about each imputation. For each case in the input data that results in any imputations, the stat data file will contain an entry with the case IDs, the original value of the imputed item, the new value used in the imputation, and the line number of the impute statement that resulted in the imputation. For example:
impute(SEX, 2)
    stat();
This would result in a data file with a record, IMPUTE_SEX_REC, with three items: IMPUTE_SEX_INITIAL (the initial value of SEX), IMPUTE_SEX_IMPUTED (the imputed value; in this case 2), and IMPUTE_SEX_LINE_NUMBER (the line number of the imputation).
If you would like to see the value of other items that might be useful during analysis, you can specify item_name1, item_name2, and so on. The values of these items will be included in the stat data file. For example:
impute(EDUCATION, getdeck(educationHotdeckBySexAge))
    stat(SEX, AGE);
By default, the only entries written to the stat data file are imputations where stat is included as part of the impute statement. Alternatively, you can specify an override:
set impute(stat, on ‖ off ‖ default);
If an override is coded, it affects every impute statement that follows, depending on the setting:
  • on: each statement is automatically included in the stat data file as if stat() were coded.
  • off: any stat commands are ignored.
  • default: the default behavior applies, where the stat data file only includes entries for imputations with stat commands.
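The override settings described above could be used along these lines. This is a minimal sketch that assumes the set impute statement can be coded inside a procedure ahead of the impute statements it should affect; the placement shown is an assumption, and the imputed value is purely illustrative:

```cspro
PROC SEX

    // turn the override on: the impute statement that follows is included
    // in the stat data file as if stat() were coded
    set impute(stat, on);

    if not invalueset(SEX) then
        impute(SEX, 2);
    endif;

    // return to the default behavior for any impute statements that follow
    set impute(stat, default);
```

Using the override this way avoids adding stat() to each impute statement individually when details are wanted for every imputation.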
Imputation Files
Applications using the impute function can generate up to three files:
  • Imputation frequencies (with the default extension .impute_freq.lst)
  • Imputation stat dictionary (if using stat, with the default extension .impute_stat.dcf)
  • Imputation stat data (if using stat, with the default extension .impute_stat.csdb)
You can specify the names of these data files in the File Associations dialog or in your application's PFF file.
The frequencies report contains five columns:
                                  Imputed Item SEX: Sex
                                _____________________________ _____________
  Categories                         Frequency        CumFreq      %  Cum %
_______________________________ _____________________________ _____________
  1 Male                                   271            271   52.9   52.9
  2 Female                                 241            512   47.1  100.0
_______________________________ _____________________________ _____________
  Total                                    512            512  100.0  100.0
  • Categories: Lists the values that were assigned during the imputations and a value set label for the value (if applicable). For example: "2 Female."
  • Frequency: Shows the frequency (that is, the total number of times) each value was assigned. For example: 241 (code 2 assigned 241 times).
  • CumFreq: Displays the cumulative totals of the Frequency column.
  • %: Shows each value's share of the total number of imputations made. For example: 47.1 (code 2 accounts for 241 of the 512 imputations of SEX, or 47.1%).
  • Cum %: Displays the cumulative totals of the % column.
Return Value
When imputing a numeric item, the function returns the numeric expression new_value. When imputing an alphanumeric item, the function returns 1 (true).
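To illustrate the numeric return value, the following sketch stores it in a variable and reports it. The variable name imputedAge, the imputed value, and the message text are assumptions made for this example:

```cspro
PROC GLOBAL

numeric imputedAge; // hypothetical variable that holds the function's return value

PROC AGE

    if not invalueset(AGE) then

        // AGE is numeric, so impute returns the value that was assigned (30)
        imputedAge = impute(AGE, 30);

        errmsg("AGE was imputed to %d", imputedAge);

    endif;
```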
Example
PROC SEX

    if not invalueset(SEX) then

        // set all heads to men and everyone else to the opposite of the head's sex
        // (note that this is not a good imputation but is just a simple example)
        if curocc() = 1 then
            impute(SEX, 1)
                title("Head's Sex")
                specific;

        else
            impute(SEX, 3 - SEX(1));

        endif;

    endif;
See also: Imputation