Freq
include(variables_to_tabulate)
ʃexclude(variables_not_to_tabulate)ʅ
ʃdisjointʅ
ʃbreakdown(length)ʅ
ʃuniverse(condition)ʅ
ʃweight(weight_value)ʅ
ʃformatting_optionsʅ
;
The Freq statement is used to define a frequency table that will be written to the frequencies file. Unlike named frequencies, which allow for control over when frequencies are tallied, frequencies generated using the Freq statement are tallied at the location where they are defined. Due to this, the statement cannot be located in PROC GLOBAL, in user-defined functions, or in application procedures.
The Tabulate Frequencies tool generates Freq statements automatically, so one way to learn how to use the statement is to select items to tabulate in that tool and then use the View -> Batch Logic option to see the commands used to generate the specified frequencies.
The include command is used to specify what variables should be tabulated. At least one frequency table will be created for every variable specified in the include list. The variables_to_tabulate is a list of variables, separated by commas, that can include:
- Dictionary names: dictionaries, records, and items
- Form names: forms, groups, and blocks
- Logic names: numeric and string variables
The optional exclude command is used to specify variables that should be removed from the inclusion list. The variables_not_to_tabulate is a list of variables as defined above. The exclude command is particularly useful when including records. For example, if you want to tabulate most items on a record with a couple of exceptions, you might code:
include(PERSON_REC)
exclude(P25_RELIGION, P26_TRIBE)
When including or excluding a name that may contain more than one variable—dictionaries, records, forms, groups, and blocks—CSPro uses a rule to determine whether items contained in that grouping should be included in the list of variables to tabulate:
Default rule: Include the item if it has a value set; if no value set is defined, then include the item if it has length 1 - 4.
You can override the default selection by adding one or more of these flags to the include/exclude list:
Flag | Include/Exclude All... |
numeric | numeric items |
float | numeric items with decimals |
integer | numeric items without decimals |
long | numeric items without decimals and length 3 - 15 |
short | numeric items without decimals and length 1 - 2 |
alpha | alphanumeric items |
These six flags will never include items that have subitems. You can use a combination of flags; for example, combining the numeric and alpha flags with PERSON_REC would include all items from PERSON_REC with the exception of items that have subitems (though the subitems would be included).
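A combination of flags might be coded as in the sketch below. The statement is a reconstruction rather than text from the original topic; it assumes that flags and names can be mixed in one list, as in the include(numeric, HOUSING_REC) example later in this topic.

```
// Sketch (reconstructed): numeric and alpha together cover every item type,
// so all items in PERSON_REC are tabulated except items that have subitems
Freq include(numeric, alpha, PERSON_REC);
```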
Leaving the include list empty is equivalent to coding the primary dictionary name. For example:
include()
// may be translated to:
include(CENSUS_DICT)
If you do not need to exclude any variables, the include command is optional: the variables can be listed in parentheses directly after Freq.
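As a sketch of two equivalent unnamed frequency statements, using an item named elsewhere in this topic (the pairing is illustrative, not taken from the original text):

```
// Sketch: both statements request the same table for P03_SEX
Freq include(P03_SEX);
Freq(P03_SEX);
```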
By default, when frequencies are tallied for a multiply-occurring item, all of the occurrences of the item are tallied. For example, assuming that P03_SEX is on a record with 50 occurrences, coding this will tally all defined occurrences of P03_SEX:
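A minimal statement that tallies every defined occurrence of P03_SEX might look like this (a sketch based on the syntax shown elsewhere in this topic, assuming P03_SEX sits on the multiply-occurring person record):

```
// Sketch: no occurrence is given, so all defined occurrences are tallied
Freq include(P03_SEX);
```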
If the first house is vacant, then no sex values are tallied; if the second house has three people, then three sex values are tallied; and so on.
If you would like to tally a specific occurrence, you can specify the occurrence in the include/exclude list. For example, this would create two tables, one for all sex occurrences, and one for the head's sex (assuming that the head is the first occurrence):
Freq include(P03_SEX, P03_SEX(1));
You can specify occurrence values when using items or records. PERSON_REC(1), for example, would create tables for the first occurrence of the items in PERSON_REC. If you specify an occurrence, the value will be tallied regardless of whether the occurrence exists. For example, while P03_SEX would not tally vacant households, P03_SEX(1) will include tallies of blank values for vacant households.
An optional command, disjoint, is a shortcut for indicating that a frequency table should be created for every occurrence of a variable. For example, applying disjoint to P03_SEX would create a table for each of its occurrences, resulting in 50 tables (for occurrence 1, occurrence 2, and so on until occurrence 50).
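A statement of this kind might be written as follows. This is a sketch that follows the command order used in the other examples in this topic, and it assumes the 50-occurrence record described above.

```
// Sketch: one table per occurrence of P03_SEX (50 tables under the
// assumption that the record has 50 occurrences)
Freq include(P03_SEX)
     disjoint;
```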
When using disjoint, you can use (*) as an occurrence to specify that you would like to ignore the disjoint setting. For example, this code would create 49 tables (for the combined occurrences, for occurrence 3, occurrence 4, and so on until occurrence 50):
Freq include(P03_SEX(*), P03_SEX)
     exclude(P03_SEX(1), P03_SEX(2))
     disjoint;
The optional command breakdown allows you to control how alphanumeric items and string variables are tallied. A positive numeric constant, length, specifies a number used to split these values before tallying. This can be useful when creating frequencies for data collected using checkboxes. For example, assuming CHECKBOX_FIELD occurs twice, first as "AB" and then as "BC":
Freq include(CHECKBOX_FIELD);   // results in: "AB" (1)
                                //             "BC" (1)

Freq include(CHECKBOX_FIELD)    // results in: "A" (1)
     breakdown(1);              //             "B" (2)
                                //             "C" (1)
The optional command universe allows you to specify a condition under which the frequency should be tallied. The values will be tallied when the condition evaluates to true.
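As an illustration, a universe could restrict housing tables to occupied households. The sketch below assumes, hypothetically, that H13_PERSONS holds the number of persons in the household; the original topic does not define that item's meaning.

```
// Sketch: tally HOUSING_REC items only when the household has people
// (assumption: H13_PERSONS is a count of household members)
Freq include(HOUSING_REC)
     universe(H13_PERSONS > 0);
```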
The optional command weight allows you to weight each tally. The weight_value can be a constant number (like 10), a dictionary item (like HH_WEIGHT), or any other numeric expression. If no weight is provided, a weight of 1 is used during the tallying.
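A weight taken from a dictionary item might be coded as in this sketch, which assumes that HH_WEIGHT is a numeric item available wherever the statement is placed:

```
// Sketch: each tally counts as the value of HH_WEIGHT instead of 1
// (assumption: HH_WEIGHT is a numeric item in the dictionary)
Freq include(HOUSING_REC)
     weight(HH_WEIGHT);
```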
Optional formatting options allow you to control how the frequency tables are generated. The formatting options include the following commands: valueset, distinct, vset, heading, stat, percentiles, nofreq, decimals, sort, nonetpercents, and pagelength.
PROC QUEST
// create frequency tables for all numeric items in the
// HOUSING_REC record with the exception of H13_PERSONS
Freq
    include(numeric, HOUSING_REC)
    exclude(H13_PERSONS);
PROC QUEST
// create frequency tables for the default selection of items in
// the PERSON_REC record, tallying only the first record occurrence
// universe is used to make sure that we do not tally empty households
// weight is 20 because we are creating frequencies on a 5% sample file
// distinct means that the frequency tables will show all values,
// not combining values if the value sets have ranges (such as age 0-4)
Freq
    include(PERSON_REC(1))
    universe(totocc(PERSON_REC) > 0)
    weight(20)
    distinct;
PROC QUEST
// create a frequency table for the item H15_ASSETS, which
// was collected using a checkbox with each value of length 2
Freq(H15_ASSETS)
    breakdown(2);