import seedcase_soil as so
from pathlib import Path
properties = so.read_properties(so.Example.simple.address)
so.write_properties(properties, Path('datapackage.json'))

Configuration files
Configure check-datapackage settings using a configuration file.
check-datapackage can be configured through command-line options and a configuration file. This guide covers configuration files for the CLI. For command-line usage, see the CLI guide. For the Python Config class used with check(), see Configuring the checks.
A configuration file lets you set preferences once, so you don’t have to repeat them every time you run check-datapackage check. check-datapackage supports two configuration files: .cdp.toml and pyproject.toml. These files should be placed in the root of your project—check-datapackage will find them automatically.
Use .cdp.toml if you want a file dedicated to check-datapackage settings, or pyproject.toml if your Data Package is a Python project that already uses pyproject.toml and you prefer to keep all config settings in one place.
Settings provided via the CLI or Python interface always take priority over the configuration file.
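The priority rule can be pictured with a small sketch. This is not check-datapackage's actual implementation: `resolve_settings` and the two dictionaries are hypothetical names used only to illustrate that a value passed explicitly (for example `--strict` on the command line) overrides the value read from the configuration file, while settings that were not passed keep their file value.

```python
from typing import Any


def resolve_settings(
    file_settings: dict[str, Any], overrides: dict[str, Any]
) -> dict[str, Any]:
    """Merge settings so that explicitly passed overrides win over file values."""
    resolved = dict(file_settings)
    for key, value in overrides.items():
        # In this sketch, None stands for "not passed on the command line".
        if value is not None:
            resolved[key] = value
    return resolved


# Hypothetical values: the file leaves strict mode off, the command line enables it.
from_file = {"strict": False, "exclusions": []}
from_cli = {"strict": True, "exclusions": None}

resolved = resolve_settings(from_file, from_cli)
print(resolved)  # {'strict': True, 'exclusions': []}
```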
Configurable settings
Before diving into editing your configuration, let’s see which settings can be configured via the configuration file:
strict: If true, include checks for Data Package properties marked as “SHOULD” in addition to those marked as “MUST”. Defaults to false.
exclusions: Exclude issues by JSONPath, issue type, or both.
extensions: Add extra checks that supplement the Data Package standard. Configuration files support RequiredCheck extensions.
Note that the positional source argument cannot be set via the configuration file; it must be passed via the command line or Python interface.
Configuration file syntax
Both configuration files use TOML syntax. The examples below start with .cdp.toml, then show the pyproject.toml variant. All the examples below assume that a datapackage.json file exists in the current directory. Here, we use the simple Data Package example from our package Soil. You can write the same content to datapackage.json in your current directory by running the Python snippet at the top of this page.
Strict checking
Before creating a configuration file, we can see that the simple example data package passes all checks:
Terminal
check-datapackage check
All checks passed!
Now, create a .cdp.toml file and enable strict checking:
.cdp.toml
strict = true

When you run check-datapackage check again, strict mode is enabled and the example package does not pass the stricter checks because the id property is missing:
Terminal
check-datapackage check
╭─ DataPackageError ───────────────────────────────────────╮
│ 1 issue was found in your datapackage.json:              │
│                                                          │
│ At top level:                                            │
│ |                                                        │
│ | id: <MISSING>                                          │
│ |     ^^^^^^^^^                                          │
│ 'id' is a required property                              │
│                                                          │
╰──────────────────────────────────────────────────────────╯
This is the same as if we had run check-datapackage check --strict before creating the configuration file.
If you prefer to configure check-datapackage via pyproject.toml, the syntax is almost the same as above; the only difference is that you need to include a heading to indicate where in the file check-datapackage can find the settings. Open up your pyproject.toml (or create one in the root of your project) and add a new table header [tool.check-datapackage] with the same configuration settings as you used above:
pyproject.toml
[tool.check-datapackage]
strict = true

Exclusions
Above, we saw that the simple example data package fails strict checking because the required id property is missing. To exclude that issue from the checks, we can add an exclusion to the .cdp.toml file that matches both the JSONPath and issue type:
.cdp.toml
strict = true
[[exclusions]]
jsonpath = "$.id"
type = "required"

Running the same command now passes:
Terminal
check-datapackage check
All checks passed!
If you want to add different exclusions, you can keep adding more [[exclusions]] below the current one.
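The sketch below illustrates how several exclusions could combine, assuming an exclusion applies when every field it specifies matches the issue. It is a simplified model, not check-datapackage's own logic: `is_excluded` is a hypothetical helper, exact string comparison stands in for real JSONPath matching, and the `format` issue type and the extra issues are made up for the example.

```python
def is_excluded(issue: dict[str, str], exclusions: list[dict[str, str]]) -> bool:
    """Return True if any exclusion matches every field it specifies."""
    return any(
        all(issue.get(field) == value for field, value in exclusion.items())
        for exclusion in exclusions
    )


exclusions = [
    {"jsonpath": "$.id", "type": "required"},  # the exclusion used in this guide
    {"type": "format"},  # hypothetical: exclude by issue type only
]

issues = [
    {"jsonpath": "$.id", "type": "required"},
    {"jsonpath": "$.name", "type": "required"},  # hypothetical issue
    {"jsonpath": "$.created", "type": "format"},  # hypothetical issue
]

# Only issues that no exclusion matches are reported.
remaining = [issue for issue in issues if not is_excluded(issue, exclusions)]
print(remaining)  # [{'jsonpath': '$.name', 'type': 'required'}]
```

In this model, an exclusion that sets both jsonpath and type is narrower than one that sets only one of them, which matches the "JSONPath, issue type, or both" description above.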
If you prefer to configure check-datapackage via pyproject.toml, the syntax is almost the same as above; the only difference is that you need to include a heading to indicate where in the file check-datapackage can find the settings:
pyproject.toml
[tool.check-datapackage]
strict = true
[[tool.check-datapackage.exclusions]]
jsonpath = "$.id"
type = "required"

Extensions
Required checks
You can also add required checks in your configuration file. To require the contributors property, use the TOML header [[extensions.required_checks]]:
.cdp.toml
[[extensions.required_checks]]
jsonpath="$.contributors"
message="The 'contributors' field is required in the Data Package properties."

Run the check:
Terminal
check-datapackage check
╭─ DataPackageError ───────────────────────────────────────╮
│ 1 issue was found in your datapackage.json:              │
│                                                          │
│ At top level:                                            │
│ |                                                        │
│ | contributors: <MISSING>                                │
│ |               ^^^^^^^^^                                │
│ The 'contributors' field is required in the Data Package │
│ properties.                                              │
│                                                          │
╰──────────────────────────────────────────────────────────╯
If you use pyproject.toml, put the same nested table under [tool.check-datapackage]:
pyproject.toml
[tool.check-datapackage]
[[tool.check-datapackage.extensions.required_checks]]
jsonpath="$.contributors"
message="The 'contributors' field is required in the Data Package properties."

You can add more required checks by adding more [[extensions.required_checks]] tables.
Custom checks
Custom checks require a Python function for their check argument, so define CustomCheck extensions in Python rather than in a TOML configuration file.
# Clean up example data from tmp dir
Path('datapackage.json').unlink(missing_ok=True)