d = impute(item_name, new_value)
ʃtitle(frequency_table_title)ʅ
ʃvalueset(valueset_name)ʅ
ʃspecificʅ
ʃstat(ʃitem_name1, ..., item_nameNʅ)ʅ;
The impute function assigns a new value to an item. The item_name is a dictionary item, either numeric or alphanumeric, and new_value is an expression that matches the type of the item. The function is similar to using the assignment operator:
item_name = new_value;
However, unlike the assignment operator, the impute function keeps track of these assignments and generates a report on the frequency of values used in the imputations. These imputation statistics are useful when cleaning data in a batch application. If your program contains any impute statements, the results are written to a frequencies file. The default file extension is .impute_freq.lst, but you can use whatever extension you prefer.
The function has several optional arguments:
Specify a frequency title (title): If you supply a string expression as frequency_table_title, the frequency writer uses it as the title of the imputation frequencies. If no title is specified, a default title such as "Imputed Item SEX: Sex" is used.
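As an illustration, a title might be attached as follows. This is a minimal sketch: the range check and the title text are assumptions, not taken from an actual application.
PROC SEX
if not SEX in 1:2 then
    // the title replaces the default "Imputed Item SEX: Sex"
    impute(SEX, 2)
        title("Sex imputed for out-of-range values");
endif;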
Specify a frequency value set (valueset): By default, when the frequency writer creates the imputation frequencies, it shows each value imputed and it looks at the item's primary value set, if one exists, to find a label that matches the value. If you would prefer to use a different value set when creating the imputation frequencies, you can specify a valueset_name that belongs to the item.
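For instance, assuming AGE has a second value set named AGE_5YEAR_VS (a hypothetical name used only for illustration), the labels in the frequencies could be drawn from it like this:
PROC AGE
if AGE = notappl then
    // AGE_5YEAR_VS is a hypothetical value set; it must belong to AGE
    impute(AGE, 30)
        valueset(AGE_5YEAR_VS);
endif;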
Create a specific frequency table (specific): Typically, if you have multiple impute statements for one item (with the same valueset setting), only one frequency table will be written, with the frequencies for all imputations combined. Even if differing titles are specified, one table will be written, with the title coming from the last executed imputation. If you would like a frequencies table for a particular imputation statement, you can use the specific command to indicate that a frequency table should be created for that imputation.
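A sketch of two imputations of the same item, where only the first is given its own table. The conditions and titles are illustrative assumptions.
PROC SEX
if SEX = notappl then
    // specific: this imputation is tabulated in its own frequency table
    impute(SEX, 1)
        title("Sex imputed for blank values")
        specific;
elseif not SEX in 1:2 then
    // no specific: combined with any other non-specific imputations of SEX
    impute(SEX, 2)
        title("Sex imputed for out-of-range values");
endif;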
Create a data file with frequency details (stat): If you want more details about imputations beyond the frequencies showing the imputed values, you can use the stat command to generate a data file that will contain information about each imputation. For each case in the input data that results in any imputations, the stat data file will contain an entry with the case IDs, the original value of the imputed item, the new value used in the imputation, and the line number of the impute statement that resulted in the imputation. For example:
impute(SEX, 2)
    stat();
This would result in a data file with a record, IMPUTE_SEX_REC, with three items: IMPUTE_SEX_INITIAL (the initial value of SEX), IMPUTE_SEX_IMPUTED (the imputed value; in this case 2), and IMPUTE_SEX_LINE_NUMBER (the line number of the imputation).
If you would like to see the value of other items that might be useful during analysis, you can specify item_name1, item_name2, and so on. The values of these items will be included in the stat data file. For example:
impute(EDUCATION, getdeck(educationHotdeckBySexAge))
    stat(SEX, AGE);
By default, the only entries written to the stat data file are imputations where stat is included as part of the impute statement. Alternatively, you can specify an override with a setting of on, off, or default. If an override is coded, any impute statement that follows behaves according to the setting:
- on: the imputation is automatically included in the stat data file, as if stat() were coded.
- off: any stat commands are ignored.
- default: the default behavior applies, where the stat data file only includes entries for imputations with stat commands.
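As a rough sketch of how an override could be used: the statement form set impute(stat, on) is an assumption here (the exact syntax is not given in the text above), so check it against the documentation for your CSPro version before relying on it.
PROC SEX
// assumed syntax: switch the override on for the impute statements that follow
set impute(stat, on);
if not SEX in 1:2 then
    impute(SEX, 2);    // expected to be recorded in the stat data file as if stat() were coded
endif;
// assumed syntax: return to the default behavior
set impute(stat, default);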
Applications using the impute function can generate up to three files:
- Imputation frequencies (with the default extension .impute_freq.lst)
- Imputation stat dictionary (if using stat, with the default extension .impute_stat.dcf)
- Imputation stat data (if using stat, with the default extension .impute_stat.csdb)
The frequencies report contains five columns:
Imputed Item SEX: Sex
                                _____________________________ _____________
Categories                           Frequency        CumFreq      %  Cum %
_______________________________ _____________________________ _____________
1 Male                                     271            271   52.9   52.9
2 Female                                   241            512   47.1  100.0
_______________________________ _____________________________ _____________
Total                                      512            512  100.0  100.0
- Categories: Lists the values that were assigned during the imputations and a value set label for the value (if applicable). For example: "2 Female."
- Frequency: Shows the frequency (that is, the total number of times) each value was assigned. For example: 241 (code 2 assigned 241 times).
- CumFreq: Displays the cumulative totals of the Frequency column.
- %: Indicates the share of all imputations of the item that each value represents. For example: 47.1 (code 2 accounts for 241 of the 512 imputations of SEX, or 47.1%).
- Cum %: Displays the cumulative totals of the % column.
When imputing a numeric item, the function returns the numeric expression new_value. When imputing an alphanumeric item, the function returns 1 (true).
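The numeric return value can be captured like any other function result. A minimal sketch, in which the variable name imputedAge and the message text are illustrative assumptions:
PROC GLOBAL
numeric imputedAge;

PROC AGE
if AGE = notappl then
    // impute returns new_value (30 here) because AGE is numeric
    imputedAge = impute(AGE, 30);
    errmsg("AGE was blank and was imputed to %d", imputedAge);
endif;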