Data Requirements

Data files appear in many different formats and structures. CSPro can work with several types of data sources, but most people work with CSPro DB or Text files. If you are using data files created by another software package, you must save the data in a separate text file before you can use it with CSPro. Data files are limited to 2 gigabytes in overall size; the maximum length of any record in the file is 32,000 characters. CSPro encodes data files using UTF-8. Read more about Unicode text files at the Unicode Primer.
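The limits above can be checked before a text file is handed to CSPro. The following is a minimal sketch in Python, not part of CSPro itself; the function name `check_text_file` is hypothetical, and reading "2 gigabytes" as 2 × 1024³ bytes is an assumption.

```python
import os

MAX_FILE_BYTES = 2 * 1024**3   # assumption: "2 gigabytes" interpreted as 2 GiB
MAX_RECORD_CHARS = 32_000      # maximum record length stated above

def check_text_file(path):
    """Return a list of problems found in a text data file (hypothetical helper)."""
    problems = []
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file is larger than 2 gigabytes")
    try:
        # "utf-8-sig" decodes UTF-8 with or without a leading byte order mark
        with open(path, encoding="utf-8-sig", newline="") as f:
            for number, line in enumerate(f, start=1):
                record = line.rstrip("\r\n")
                if len(record) > MAX_RECORD_CHARS:
                    problems.append(f"record {number} has {len(record)} characters")
    except UnicodeDecodeError:
        problems.append("file is not valid UTF-8")
    return problems
```

An empty list means the file passed these three checks; it says nothing about whether the file matches a data dictionary.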
CSPro processes one case at a time. Every record must contain a questionnaire identification code, located in the same position in each record, that identifies the case the record belongs to. This code must be the same for all records of the same case and must differ from the code of every other case. If the file is a "flat" file, meaning that each questionnaire contains only one record, the questionnaire identification does not need to link records together. In this case, any data item can be used as the questionnaire identification, but it should be unique for each record. CSPro uses the case identification values to determine where one case ends and the next one begins. Records belonging to the same case must be contiguous within the data file, but the data file does not need to be sorted by case identifier.
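To illustrate how a case boundary follows from the identification values, here is a minimal Python sketch. It is not CSPro's implementation; the record layout (a four-character case ID starting at the second character) and the names `case_id` and `split_into_cases` are assumptions made for the example.

```python
from itertools import groupby

# Hypothetical layout: the case ID is 4 characters long and starts at
# the second character of every record (0-based offset 1).
ID_START, ID_LEN = 1, 4

def case_id(record):
    """Extract the case ID from its fixed position."""
    return record[ID_START:ID_START + ID_LEN]

def split_into_cases(records):
    """Group consecutive records that share a case ID."""
    return [(key, list(group)) for key, group in groupby(records, key=case_id)]

records = [
    "10002A",   # case 0002
    "20002B",   # case 0002
    "10001C",   # case 0001 (the file is not sorted, which is acceptable)
    "20001D",   # case 0001
    "20001E",   # case 0001
]
cases = split_into_cases(records)
# cases holds two groups: "0002" with 2 records, then "0001" with 3 records
```

Because the grouping only looks at neighboring records, two records of the same case that are separated by a record from another case would be treated as two different cases. That is why records of one case must be contiguous, while the order of the cases themselves does not matter.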
CSPro can handle a data file with multiple record types—for example, housing and population—but a record type code must identify the type of record. This code must be in the same position in each record. Within the same record type, each data field must be in the same position.
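As a rough illustration of fixed positions within record types, the Python sketch below reads a record type code from the first character and then slices the remaining fields according to that type. The layout, field names, and the function `parse_record` are hypothetical and do not come from any real data dictionary.

```python
# Hypothetical layout: character 1 is the record type, characters 2-5 are
# the case ID, and the remaining fields depend on the record type.
# Each field is given as (0-based start, length).
LAYOUTS = {
    "1": {"walls": (5, 1), "rooms": (6, 2)},   # housing record
    "2": {"sex": (5, 1), "age": (6, 3)},       # population record
}

def parse_record(record):
    """Return (record type, case ID, fields) for one fixed-format record."""
    record_type = record[0]
    layout = LAYOUTS[record_type]   # raises KeyError for an unknown type code
    fields = {
        name: record[start:start + length]
        for name, (start, length) in layout.items()
    }
    return record_type, record[1:5], fields

housing = parse_record("10001204")
person = parse_record("200011035")
```

The same character positions mean different things in the two record types, which is why the type code itself has to sit at one fixed position in every record.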
CSPro can process one input data file at a time, but it can access one or more external files. These files must also be described by a data dictionary.
In some survey data, especially where the total number of data items is large but only a few responses are expected, the user may choose a format in which each data field is preceded by a "source code" relating it back to the original document. With this scheme, non-response fields (empty responses) need not be entered. As a result, data fields in this type of format are not in pre-defined locations on the record. Before CSPro can process such a file, the file must be reformatted so that the data fields are in fixed positions. Items in data files must be in fixed format; that is, each item must have the same starting position and length in every record where it occurs.
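The reformatting step might look like the following Python sketch. The source codes (`Q01` to `Q03`), the target layout, the zero-padding rule, and the function `to_fixed` are all assumptions chosen for illustration; a real conversion would follow the positions defined in the data dictionary.

```python
# Hypothetical target layout: source code -> (0-based start, length).
# Characters 1-4 of the output record hold the case ID.
LAYOUT = {"Q01": (4, 1), "Q02": (5, 3), "Q03": (8, 2)}
ID_LEN = 4
RECORD_LENGTH = 10

def to_fixed(case_id, coded_fields):
    """Place each (source code, value) pair at its fixed position.

    Fields that were not entered stay blank in the output record.
    """
    if len(case_id) > ID_LEN:
        raise ValueError("case ID is too long")
    record = [" "] * RECORD_LENGTH
    record[0:ID_LEN] = case_id.rjust(ID_LEN, "0")
    for code, value in coded_fields:
        start, length = LAYOUT[code]
        if len(value) > length:
            raise ValueError(f"value for {code} is too long")
        # Assumption: values are numeric and padded with leading zeros
        record[start:start + length] = value.rjust(length, "0")
    return "".join(record)

fixed = to_fixed("7", [("Q02", "35"), ("Q03", "4")])
# fixed == "0007 03504"  (Q01 was not entered, so its position is blank)
```

Every output record has the same length, and each item starts at the same position whether or not a response was entered.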
See also: Data File Type Structure, Data Sources