Data Sources

Overview
In CSPro, a data source is a file used to store case data. Traditionally, CSPro data files were stored as text files, but starting with CSPro 7.0, the suggested data file format is CSPro DB, a proprietary format that stores all data about cases and supports synchronization.
The format of the cases in a data source is described by a dictionary containing levels, records, and items. The way that these entities are serialized depends on the data source used, and is described on the help pages for each data source.
CSPro Data Sources
The following are full-fledged data sources that work with CSPro data to facilitate reading and writing cases:

| Data Source | Default Extension | Description |
| --- | --- | --- |
| CSPro DB | .csdb | Cases are stored in a SQLite database in a relational format. This data source has the most functionality and you are encouraged to use it when possible. |
| Encrypted CSPro DB | .csdbe | A version of the CSPro DB data source that supports AES-256 encryption. |
| Text | .dat | Cases are represented as text lines, with one record per line, one case following another in the file. |
| JSON | .json | Cases are represented in JSON as an array of case objects. |
| None | (none) | A data source, not associated with any file, that does not contain any data. |
| In-Memory | (none) | A data source, not associated with any file, that stores cases in memory for the duration of the running application. |
Export Data Sources
The following export data sources only support writing cases:
| Data Source | Default Extension | Description |
| --- | --- | --- |
| Comma Delimited (CSV) | .csv | Cases from a single record are written to a comma-separated values file. |
| Semicolon Delimited | .skv | Cases from a single record are written to a semicolon-separated values file. |
| Tab Delimited | .tsv | Cases from a single record are written to a tab-separated values file. |
| Excel | .xlsx | Cases are written to a Microsoft Excel file. |
| R | .RData / .rda | Cases are written to an R Data file that can be read in R. |
| SAS | .xpt | Cases are written to a SAS Transport file that can be read in SAS. |
| SPSS | .sav | Cases from a single record are written to an SPSS Statistics Data file that can be read in SPSS. |
| Stata | .dta | Cases from a single record are written to a Stata Data file that can be read in Stata. |
| CSPro Export | (none) | A data source that wraps another data source, allowing you to restrict what records are written. |
Functionality
The functionality of each data source is summarized in the following table:
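| Feature | CSPro DB | Encrypted CSPro DB | Text | JSON | None | In-Memory | Comma Delimited | Semicolon Delimited | Tab Delimited | Excel | R | SAS | SPSS | Stata |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Reading cases | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Writing cases | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Notes, case labels, and case statuses | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Storage of more than one kind of record | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✔ | ✔ | ✔ | ✘ | ✘ |
| Binary data items | ✔ | ✔ | ✘ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Deleting cases | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Undeleting cases | ✔ | ✔ | ✘ | ✔ | ✘ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Syncing data | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Cases with duplicate keys | ✔ | ✔ | ✘ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Case identification via UUID | ✔ | ✔ | ✘ | ✔ | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Contains an embedded dictionary | ✔ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Allows record sorts | ✘ | ✘ | ✔ | ✘ | ✔ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |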
Data sources that support notes, case labels, and case statuses store notes entered by the operator or set in logic, case labels set in logic, and case status information such as whether a case has been partially saved or verified.
Data sources that contain an embedded dictionary can be opened in Data Viewer and some tools without the need to specify a dictionary.
Determining What Data Source Is Used
All data sources have behavior that can be customized by specifying properties in the connection string. When CSPro analyzes a connection string to determine what data source to use, the default behavior is to match the data source using the file extension. If the extension matches any of the extensions listed in the data source tables above, that data source is used. If it matches none, the Text data source is used. In other words, most extensions will map to a Text data source.
The "type" property can be used to override this behavior for data sources that do not use a proprietary file format. For example, the .tsv extension is associated with the Tab Delimited data source, but if you instead wanted to use that extension for a Text data source, you could use the connection string:
filename.tsv|type=Text
These are the "type" values that can be used to override the default behavior associated with the file extension: "Text", "JSON", "CSV", "Semicolon", and "Tab".
The data sources that are not associated with a file can only be selected by setting the "type" property, using the values "None" and "Memory".
Finally, because CSPro Export wraps other data sources, to use it you must specify the "type": "CSProExport".
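The selection rules described in this section can be sketched in a few lines of Python. This is only an illustration of the rules as stated on this page, not CSPro's actual implementation. The function name resolve_data_source and both lookup tables are hypothetical. The sketch also assumes that multiple properties are separated by "&" and that extensions and "type" values are matched without regard to case, neither of which is stated here. It does not enforce the restriction that "type" can only override extensions of non-proprietary formats.

```python
import os

# Default extensions, copied from the data source tables on this page.
EXTENSION_TO_SOURCE = {
    ".csdb": "CSPro DB",
    ".csdbe": "Encrypted CSPro DB",
    ".dat": "Text",
    ".json": "JSON",
    ".csv": "Comma Delimited (CSV)",
    ".skv": "Semicolon Delimited",
    ".tsv": "Tab Delimited",
    ".xlsx": "Excel",
    ".rdata": "R",
    ".rda": "R",
    ".xpt": "SAS",
    ".sav": "SPSS",
    ".dta": "Stata",
}

# "type" values named on this page and the data source each one selects.
TYPE_TO_SOURCE = {
    "text": "Text",
    "json": "JSON",
    "csv": "Comma Delimited (CSV)",
    "semicolon": "Semicolon Delimited",
    "tab": "Tab Delimited",
    "none": "None",
    "memory": "In-Memory",
    "csproexport": "CSPro Export",
}


def resolve_data_source(connection_string):
    """Return the name of the data source a connection string would select."""
    # Everything before the first "|" is the file name; the rest is properties.
    filename, _, property_text = connection_string.partition("|")

    # Assumption: properties are name=value pairs separated by "&".
    properties = {}
    for pair in property_text.split("&"):
        name, separator, value = pair.partition("=")
        if separator:
            properties[name.strip().lower()] = value.strip()

    # An explicit "type" property overrides the file extension.
    type_value = properties.get("type")
    if type_value is not None:
        key = type_value.lower()  # assumption: case-insensitive match
        if key not in TYPE_TO_SOURCE:
            raise ValueError(f"unknown type value: {type_value!r}")
        return TYPE_TO_SOURCE[key]

    # Otherwise match on the extension; anything unrecognized is Text.
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TO_SOURCE.get(extension, "Text")
```

With this sketch, resolve_data_source("filename.tsv") selects the Tab Delimited data source, while resolve_data_source("filename.tsv|type=Text") selects Text, mirroring the example above. An unlisted extension such as ".xyz" falls through to Text.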
See also: Case Read Optimization