
Configuration section: data_validations

Description

Data validation rules ensure the quality of records delivered by the DataBridge Engine Pipeline. Records that fail validation are written to the Rejected Records Data Frame and delivered to the Results folder under the Drop Zone folder.

Specification

data_frame (string)

The Data Frame to evaluate.

field (string | null)

Name of the field to validate, or null for dataset-level tests.

test_type ("unique" | "not_null" | "accepted_values" | "num_rows")

The type of validation to run.

test_configuration (object | null)

Shape depends on test_type:

unique

A per-record validation.

No dedicated configuration is required; test_configuration may be null.

The first record with a given value in the field passes; every later record with the same value is rejected. "First" is determined by the input record order.
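
The first-occurrence rule can be illustrated with a minimal Python sketch. This is not the DataBridge Engine implementation; the function name split_unique and the sample records are hypothetical, and records are assumed to be plain dictionaries.

```python
from typing import Any


def split_unique(records: list[dict[str, Any]], field: str):
    """Split records into (passed, rejected) using a first-occurrence rule."""
    seen = set()
    passed, rejected = [], []
    for record in records:  # input order decides which record counts as "first"
        value = record[field]
        if value in seen:
            rejected.append(record)  # a later record repeating a seen value
        else:
            seen.add(value)
            passed.append(record)  # first record carrying this value
    return passed, rejected


# Hypothetical sample data: the third record repeats id 1.
employees = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben"},
    {"id": 1, "name": "Cy"},
]
passed, rejected = split_unique(employees, "id")
```

Here "Ana" and "Ben" pass and "Cy" is rejected, because "Cy" is the second record with id 1. Reordering the input would change which of the two duplicates is rejected.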

not_null

A per-record validation.

No dedicated configuration is required; test_configuration may be null.

Rejects all records where the value in the specified field is null.
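
As a rough illustration of the rule above, the sketch below uses Python's None to stand in for null. The function name split_not_null is hypothetical, and treating a missing key as null is an assumption made for the sketch, not documented behavior.

```python
from typing import Any


def split_not_null(records: list[dict[str, Any]], field: str):
    """Split records into (passed, rejected) based on whether `field` is null."""
    passed, rejected = [], []
    for record in records:
        # Assumption: a record that lacks the key is treated like a null value.
        if record.get(field) is None:
            rejected.append(record)
        else:
            passed.append(record)
    return passed, rejected


# Hypothetical sample data.
rows = [
    {"id": 1, "status": "Active"},
    {"id": 2, "status": None},
    {"id": 3, "status": ""},
]
not_null_passed, not_null_rejected = split_not_null(rows, "status")
```

In this sketch an empty string is not null, so the record with id 3 passes; only the record with id 2 is rejected.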

accepted_values

A per-record validation.

Accepts records whose value in the specified field is one of the configured values; rejects all other records.

Configuration

{ "accepted_values": { "values": ["Active", "Inactive"] } }
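
A minimal Python sketch of how a configuration of this shape could be applied. The function name split_accepted_values is hypothetical, and two behaviors are assumptions rather than documented facts: matching is exact and case-sensitive, and a null value is rejected because it is not in the set.

```python
from typing import Any


def split_accepted_values(
    records: list[dict[str, Any]],
    field: str,
    test_configuration: dict[str, Any],
):
    """Split records into (passed, rejected) by membership in the configured values."""
    # Read the list from the documented shape: {"accepted_values": {"values": [...]}}
    allowed = set(test_configuration["accepted_values"]["values"])
    passed, rejected = [], []
    for record in records:
        # Assumption: exact, case-sensitive comparison; None is never in the set.
        if record.get(field) in allowed:
            passed.append(record)
        else:
            rejected.append(record)
    return passed, rejected


configuration = {"accepted_values": {"values": ["Active", "Inactive"]}}
# Hypothetical sample data.
staff = [
    {"id": 1, "status": "Active"},
    {"id": 2, "status": "Retired"},
    {"id": 3, "status": "inactive"},
]
accepted, not_accepted = split_accepted_values(staff, "status", configuration)
```

Under these assumptions only the record with id 1 passes: "Retired" is not in the set, and "inactive" differs in case from "Inactive".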

Example

json
{
  "version": 1,
  "data_validations": [
    {
      "data_frame": "employees",
      "field": "id",
      "test_type": "unique",
      "test_configuration": null
    },
    {
      "data_frame": "employees",
      "field": "status",
      "test_type": "accepted_values",
      "test_configuration": {
        "accepted_values": {"values": ["Active", "Inactive"]}
      }
    }
  ]
}
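
Before handing a file like this to the pipeline, it can help to check each rule against the Specification. The following Python sketch does so under stated assumptions: check_rule is a hypothetical helper, not part of DataBridge, and it checks only the shapes described on this page (it makes no claim about what num_rows expects in test_configuration).

```python
import json

# The test types listed in the Specification above.
TEST_TYPES = ("unique", "not_null", "accepted_values", "num_rows")


def check_rule(rule: dict) -> list[str]:
    """Return a list of problems found in one data_validations entry."""
    problems = []
    if not isinstance(rule.get("data_frame"), str):
        problems.append("data_frame must be a string")
    field = rule.get("field")
    if field is not None and not isinstance(field, str):
        problems.append("field must be a string or null")
    test_type = rule.get("test_type")
    if test_type not in TEST_TYPES:
        problems.append(f"unknown test_type: {test_type!r}")
    if test_type == "accepted_values":
        config = rule.get("test_configuration")
        inner = config.get("accepted_values") if isinstance(config, dict) else None
        values = inner.get("values") if isinstance(inner, dict) else None
        if not isinstance(values, list):
            problems.append("accepted_values needs accepted_values.values as a list")
    return problems


document = json.loads("""
{
  "version": 1,
  "data_validations": [
    {"data_frame": "employees", "field": "id",
     "test_type": "unique", "test_configuration": null},
    {"data_frame": "employees", "field": "status",
     "test_type": "accepted_values", "test_configuration": null}
  ]
}
""")
report = [check_rule(rule) for rule in document["data_validations"]]
```

In this deliberately broken input, the first rule yields no problems and the second yields one, because an accepted_values rule with a null test_configuration has no list of values to compare against.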