Static
imputation means providing a value from a predetermined set of values. Suppose a person's sex is missing or invalid (out of range). If we decide to change the response using static imputation, there are two basic methods to use: hard coded or from a "cold deck."
Using our example above, we would programmatically set SEX to the value we think it should be. For example:
What we've done above is a very primitive form of imputation. Essentially, when we encounter a bad value for sex, 50% of the time the variable SEX will be assigned the value "male," and 50% of the time the value "female." Note that no accommodation was made for other responses; for example, if fertility data were present, you might not wish to make this person "male." Or if this were an enumeration of a prison where the entire population is male, you would probably not want to be adding females to this group! So while this method can be used, you need to take into account other responses. We attempt to do this in our next method of static imputation, where we use a cold deck.
With the cold deck approach, known information about individuals with similar characteristics (sex, age, relationship to household head, economic status, etc.) is used to determine the "most appropriate" response to be used when some piece (or pieces) of related information is invalid.
For example, suppose a person's age is missing or invalid. We might have a table as follows, where the row indices represent the person's sex (1 = male, 2 = female), and the column indices refers to the person's relationship codes (1-5) (this table assumes that the relationship and sex codes have already been corrected):
| Head (1) | Spouse (2) | Child (3) | Other Relative (4) | Non-Relative (5) |
Male (1) | 35 | 50 | 10 | 41 | 65 |
Female (2) | 32 | 48 | 10 | 37 | 68 |
In the event that a female child was found to have a missing age, she would be given the age of 10. If a female head of the household had a missing age, then her age would be given as 32. This method is acceptable if you do not need to use it often; that is, if very little of your data is missing or invalid. Also, if your population is fairly homogeneous (for example if you were correcting for religion and 90% of the population is Muslim), then this will not result in an unrealistic portrayal of your country.
However, if you find yourself referring to this table often, or you have a very diverse population where a few static values do not give an accurate portrayal, then your data will end up skewed. For these reasons (and others),
dynamic imputation is the preferred method.