Completeness Data Quality Rules

Completeness rules validate the presence of values in your data. These checks ensure that required fields contain data and identify missing or null values that could impact data quality.

POST /integration/v1/native_data_quality/monitor

Missing Values Configuration

Defines custom values to be treated as missing.

PropertyDescription
missing valuesList of specific values to treat as missing (e.g., ["", "N/A", "NULL", -999])
missing regexRegular expression pattern to identify missing values (e.g., "^\\s*$" for empty/whitespace-only values)

Numeric Column Metrics

  • missing_count: Counts the number of rows containing NULL values and any user-defined values that qualify as missing.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

Basic:

{
       "metric_name": "missing_count",
       "operator": "=",
       "threshold": "0",
       "category": "completeness"
   }

With missing custom values:

{
       "metric_name": "missing_count",
       "operator": "<=",
       "threshold": "5",
       "configuration_keys": {
           "missing values": [-1, -999, 0]
       },
       "category": "completeness"
   }
  • missing_percent: Calculates the percentage of rows containing NULL values and user-defined missing values relative to the total row count.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

Basic:

{
       "metric_name": "missing_percent",
       "operator": "<",
       "threshold": "5.0",
       "category": "completeness"
   }

With custom missing values:

{
       "metric_name": "missing_percent",
       "operator": "between",
       "threshold": ["0", "2"],
       "configuration_keys": {
           "missing values": [-1, -999]
       },
       "category": "completeness"
   }

Text Column Metrics

  • missing_count: Counts the number of rows containing NULL values and any user-defined values that qualify as missing.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

Basic:

{
       "metric_name": "missing_count",
       "operator": "=",
       "threshold": "0",
       "category": "completeness"
   }

With custom missing values:

{
       "metric_name": "missing_count",
       "operator": "<=",
       "threshold": "10",
       "configuration_keys": {
           "missing values": ["", "N/A", "NULL", "unknown"]
       },
       "category": "completeness"
   }

With missing regex pattern:

{
       "metric_name": "missing_count",
       "operator": "=",
       "threshold": "0",
       "configuration_keys": {
           "missing regex": "^\\s*$"
       },
       "category": "completeness"
   }
  • missing_percent: Calculates the percentage of rows containing NULL values and user-defined missing values relative to the total row count.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

Basic:

 {
       "metric_name": "missing_percent",
       "operator": "<=",
       "threshold": "1.0",
       "category": "completeness"
   }

With custom missing values:

{
       "metric_name": "missing_percent",
       "operator": "<",
       "threshold": "3.0",
       "configuration_keys": {
           "missing values": ["", "N/A", "TBD"]
       },
       "category": "completeness"
   }

With missing regex pattern:

{
       "metric_name": "missing_percent",
       "operator": "<=",
       "threshold": "2.0",
       "configuration_keys": {
           "missing regex": "^(na|n/a|none)$"
       },
       "category": "completeness"
   }

Time Column Metrics

  • missing_count: Counts the number of rows containing NULL values and any user-defined values that qualify as missing.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

Basic:

 {
       "metric_name": "missing_count",
       "operator": "=",
       "threshold": "0",
       "category": "completeness"
   }

With custom missing values:

{
       "metric_name": "missing_count",
       "operator": "<=",
       "threshold": "5",
       "configuration_keys": {
           "missing values": ["1900-01-01"]
       },
       "category": "completeness"
   }
  • missing_percent: Calculates the percentage of rows containing NULL values and user-defined missing values relative to the total row count.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

Basic:

{
       "metric_name": "missing_percent",
       "operator": "<=",
       "threshold": "0.5",
       "category": "completeness"
   }

With custom missing values:

{
       "metric_name": "missing_percent",
       "operator": "between",
       "threshold": ["0", "1"],
       "configuration_keys": {
           "missing values": ["1900-01-01", "9999-12-31"]
       },
       "category": "completeness"
   }

Custom Metrics (All Column Types)

  • sql (Custom SQL Query): Defines custom completeness metrics using SQL queries.

Supported Operators: =, <, >, <=, >=, !=, <>, between, not between

{
       "metric_name": "sql",
       "operator": "=",
       "threshold": "0",
       "configuration_keys": {
           "custom_metric": "data_completeness",
           "query value": "SELECT COUNT(*) FROM schema.table WHERE required_field IS NULL"
       },
       "category": "completeness"
   }