JSON Data Source

Overview

The JSON data source reads and writes data in a text file containing a JSON array of case objects. Because JSON is widely used and human-readable, this data source can be a good way to archive your data or to work with it in other applications. However, unless your data must be in JSON format, you are encouraged to use the CSPro DB data source during data collection, as it has the most functionality of all data sources.

The JSON data source reads and writes UTF-8 text, and treats all files as UTF-8, even those without a BOM (byte order mark).
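As a rough illustration of working with the file in another application, the following Python sketch loads the JSON array while tolerating an optional BOM. The function name load_cases is hypothetical, and the sketch assumes nothing about the fields inside each case object.

```python
import json
from pathlib import Path


def load_cases(path):
    """Read a JSON array of case objects from a UTF-8 file, with or without a BOM."""
    # "utf-8-sig" drops a leading BOM if one is present and otherwise
    # decodes the file as plain UTF-8.
    text = Path(path).read_text(encoding="utf-8-sig")
    cases = json.loads(text)
    if not isinstance(cases, list):
        raise ValueError("expected a JSON array of case objects")
    return cases
```

Calling load_cases("memory.json") would return a list with one entry per case.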

The JSON data source is used when a file has the extension .json.

Functionality

The JSON data source supports the following features:

Feature	Supported
Reading cases	✔
Writing cases	✔
Notes, case labels, and case statuses	✔
Storage of more than one kind of record	✔
Binary data items	✔
Deleting cases	✔
Undeleting cases	✔
Syncing data	✘
Cases with duplicate keys	✔
Case identification via UUID	✔
Contains an embedded dictionary	✘
Allows record sorts	✘

Associated Files

The JSON data source stores all case-related information in the file, but also has one associated file:

Index: The index stores information about where cases are located in the data file, allowing CSPro to quickly look up cases. Because the index is a SQLite database, it is possible to query the index using SQL statements.
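Since the index is a SQLite database, it can be inspected with any SQLite client. The Python sketch below lists the tables in an index file using only the standard sqlite_master catalog. The function name list_index_tables is hypothetical, the caller supplies the path to the index file, and no assumption is made about the index's schema, which is not described here.

```python
import sqlite3
from pathlib import Path


def list_index_tables(index_path):
    """Return the names of the tables in a SQLite database file."""
    # Open the file read-only so that inspecting the index cannot modify it.
    uri = Path(index_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

The table names returned can then be used in further SELECT statements to explore the index.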

Binary Data

The JSON data source stores binary data in one of two ways:

Saved to the disk: By default, binary data files are saved in a subdirectory located in the same directory as the JSON file and named after the JSON file with the suffix " (files)". Each file is named using the MD5 hash of the binary data, followed by the file extension, if known. For example, binary data carn03.jpg, collected as part of memory.json, might be saved as: memory.json (files)/244358f4725ac956bb74b3a17a588eb9.jpg. When reading case data, binary data on the disk is loaded lazily: it is read from the disk only when the application needs it.
Embedded: Alternatively, the binary data can be embedded in the JSON file, encoded as a data URL. This option produces larger data files but keeps all case data in a single file.
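The two storage options can be sketched in Python as follows. The helper names disk_path_for and to_data_url are hypothetical. The disk naming follows the description above, while the use of base64 and the default media type in the data URL are assumptions, since the exact encoding that CSPro writes is not specified here.

```python
import base64
import hashlib
from pathlib import Path


def disk_path_for(json_path, data, extension=""):
    """Build the default on-disk location for binary data, as described above."""
    json_path = Path(json_path)
    # The subdirectory sits next to the JSON file: "<file name> (files)".
    directory = json_path.with_name(json_path.name + " (files)")
    # The file name is the MD5 hash of the content plus the extension, if known.
    return directory / (hashlib.md5(data).hexdigest() + extension)


def to_data_url(data, media_type="application/octet-stream"):
    """Encode binary data as a base64 data URL (assumed encoding)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"
```

For instance, disk_path_for("memory.json", image_bytes, ".jpg") yields a path inside "memory.json (files)", and to_data_url(image_bytes, "image/jpeg") yields a string that could be embedded directly in a JSON value.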

Customizable Behavior

The following behavior can be customized by specifying properties in the connection string. The default behavior is marked with ⁺⁺⁺.

Property Name and Values	Description

"binaryDataDirectory"	By default, binary data saved to the disk is stored in a subdirectory named after the data file with the suffix " (files)". This property allows you to specify a different directory for reading and writing binary data. The property is evaluated relative to the path of the data file.

"binaryDataFormat"	Determines how binary data is stored (as described above).
"dataUrl"	Binary data is embedded in the file as a data URL.
"disk" ⁺⁺⁺	Binary data is saved to the disk.

"cache"	Determines if cases are cached in memory. This may be useful to advanced users who want to optimize programs that do a lot of case lookups.
true	Cases are cached, meaning that a case is only read from the data source once.
false ⁺⁺⁺	Cases are not cached.

"jsonFormat"	Determines the amount of whitespace used when writing cases.
"compact"	Cases are written with no extra spacing.
"pretty" ⁺⁺⁺	Cases are written in a more readable format with spaces and newlines.

"verbose"	Determines if cases are written in verbose mode, which outputs all case details rather than only those with defined, non-default values.
true	Cases are written in verbose mode.
false ⁺⁺⁺	Cases are not written in verbose mode.

"writeBlankValues"	Determines if items without a value (notappl numbers, blank strings, etc.) are written.
true	Blank values are written as objects without content.
false ⁺⁺⁺	Blank values are not written.

"writeLabels"	Determines if the value set label associated with the item is written along with the code.
true	Both codes and labels are written.
false ⁺⁺⁺	Only codes are written.

For example, the following connection string, specified in a data entry PFF, attaches a JSON file to an external dictionary, writes labels along with codes, and overrides the binary data directory with a subdirectory named "Images":

CENSUS_2024_DICT=.\Census.json|writeLabels=true&binaryDataDirectory=Images
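To make the structure of this line explicit, the Python sketch below splits it into the dictionary name, the file path, and the properties. The function name parse_pff_data_line is hypothetical, and the sketch assumes only the layout visible in the example (name, "=", path, "|", then "&"-separated name=value pairs). It does not handle escaping or any other syntax that CSPro may accept.

```python
def parse_pff_data_line(line):
    """Split 'NAME=path|key=value&key=value' into its three parts."""
    # Everything before the first "=" is the dictionary name.
    name, _, connection = line.partition("=")
    # The path and the properties are separated by the first "|".
    path, _, query = connection.partition("|")
    properties = {}
    for pair in filter(None, query.split("&")):
        key, _, value = pair.partition("=")
        properties[key] = value
    return name, path, properties


name, path, properties = parse_pff_data_line(
    r"CENSUS_2024_DICT=.\Census.json|writeLabels=true&binaryDataDirectory=Images"
)
```

Here name is the external dictionary, path is the JSON file, and properties maps writeLabels to "true" and binaryDataDirectory to "Images". Any property not listed keeps its default behavior.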

See also: Data Sources