Validity Data Quality Rules
Validity rules validate that data values conform to expected formats, patterns, and business rules. These checks ensure data meets specific format requirements, falls within acceptable value ranges, and follows defined patterns.
POST /integration/v1/native_data_quality/monitor
Numeric Column Metrics
invalid_count: Counts the number of rows containing values that are not valid according to specified criteria.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
Basic Usage:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"category": "validity"
}
With Valid Min:
{
"metric_name": "invalid_count",
"operator": "<=",
"threshold": "5",
"configuration_keys": {
"valid min": 0
},
"category": "validity"
}
With Valid Max:
{
"metric_name": "invalid_count",
"operator": "<=",
"threshold": "5",
"configuration_keys": {
"valid max": 100
},
"category": "validity"
}
With Valid Values:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"valid values": [1, 2, 3, 4, 5]
},
"category": "validity"
}
With Invalid Values:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"invalid values": [-1, -999, 0]
},
"category": "validity"
}
invalid_percent: Calculates the percentage of rows containing invalid values relative to the total row count.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
Basic Usage:
{
"metric_name": "invalid_percent",
"operator": "<=",
"threshold": "2.0",
"category": "validity"
}
With Valid Min:
{
"metric_name": "invalid_percent",
"operator": "<",
"threshold": "1.0",
"configuration_keys": {
"valid min": 18
},
"category": "validity"
}
With Valid Max:
{
"metric_name": "invalid_percent",
"operator": "<",
"threshold": "1.0",
"configuration_keys": {
"valid max": 120
},
"category": "validity"
}
With Valid Values:
{
"metric_name": "invalid_percent",
"operator": "<",
"threshold": "1.0",
"configuration_keys": {
"valid values": [18, 21, 25, 30]
},
"category": "validity"
}
With Invalid Values:
{
"metric_name": "invalid_percent",
"operator": "<",
"threshold": "1.0",
"configuration_keys": {
"invalid values": [0, -1, 999]
},
"category": "validity"
}
Text Column Metrics
invalid_count: Counts the number of rows containing values that don't conform to specified formats or patterns.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
With Format Validation:
Optional predefined formats can be used for validation.
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"valid format": "email"
},
"category": "validity"
}
With Invalid Format:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"invalid format": "email"
},
"category": "validity"
}
With Valid Length:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"valid length": 10
},
"category": "validity"
}
With Valid Min Length:
{
"metric_name": "invalid_count",
"operator": "<=",
"threshold": "5",
"configuration_keys": {
"valid min length": 3
},
"category": "validity"
}
With Valid Max Length:
{
"metric_name": "invalid_count",
"operator": "<=",
"threshold": "5",
"configuration_keys": {
"valid max length": 50
},
"category": "validity"
}
With Regex Validation:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"valid regex": "^[A-Z]{2}[0-9]{4}$"
},
"category": "validity"
}
With Invalid Regex:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"invalid regex": "^[A-Z]{2}[0-9]{4}$"
},
"category": "validity"
}
With Valid Values:
{
"metric_name": "invalid_count",
"operator": "<=",
"threshold": "5",
"configuration_keys": {
"valid values": ["Active", "Inactive", "Pending"]
},
"category": "validity"
}
With Invalid Values:
{
"metric_name": "invalid_count",
"operator": "<=",
"threshold": "5",
"configuration_keys": {
"invalid values": ["Unknown", "Error", ""]
},
"category": "validity"
}
invalid_percent: Calculates the percentage of rows containing invalid text values.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
Phone Number Validation:
{
"metric_name": "invalid_percent",
"operator": "<=",
"threshold": "1.0",
"configuration_keys": {
"valid format": "phone number"
},
"category": "validity"
}
Credit Card Validation:
{
"metric_name": "invalid_percent",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"valid format": "credit card number"
},
"category": "validity"
}
Time Column Metrics
invalid_count: Counts the number of rows containing invalid time/date values.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
Basic Usage:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"valid values": ["2023-01-01", "2023-12-31"]
},
"category": "validity"
}
With Invalid Values:
{
"metric_name": "invalid_count",
"operator": "=",
"threshold": "0",
"configuration_keys": {
"invalid values": ["1900-01-01", "9999-12-31"]
},
"category": "validity"
}
invalid_percent: Calculates the percentage of rows containing invalid time/date values.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
{
"metric_name": "invalid_percent",
"operator": "<=",
"threshold": "0.5",
"configuration_keys": {
"invalid values": ["0000-00-00", "1900-01-01"]
},
"category": "validity"
}
Custom Metrics (All Column Types)
sql(Custom SQL Query): Define custom validity metrics using SQL queries.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
Required Configuration:
{
"metric_name": "sql",
"operator": "<=",
"threshold": "10",
"configuration_keys": {
"custom_metric": "business_rule_validation",
"query value": "SELECT COUNT(*) FROM schema.table WHERE price < 0 OR quantity < 0"
},
"category": "validity"
}
Supported Formats for Valid Format and Invalid Format
Financial Formats
| Format Name | Example | Description |
|---|---|---|
credit card number | 1234 5678 9012 3456 | Various credit card number formats |
money | $123.45 USD | Currency with symbol and abbreviation |
money comma | $123,45 USD | Currency with comma decimal separator |
money point | $123.45 USD | Currency with point decimal separator |
Date Formats
| Format Name | Example | Description |
|---|---|---|
date eu | 31/12/2023 | European date format (dd/mm/yyyy) |
date us | 12/31/2023 | US date format (mm/dd/yyyy) |
date inverse | 2023/12/31 | Inverse date format (yyyy/mm/dd) |
date iso 8601 | 2021-04-28T09:00:00+02:00 | ISO 8601 standard format |
Number Formats
| Format Name | Example | Description |
|---|---|---|
integer | 123 | Whole numbers |
decimal | 123.45 | Decimal numbers (. or ,) |
percentage | 50% | Percentage values |
positive integer | 123 | Positive whole numbers only |
negative integer | -123 | Negative whole numbers only |
Contact Formats
| Format Name | Example | Description |
|---|---|---|
email | [email protected] | Valid email addresses |
phone number | +1 123-123-1234 | Various phone number formats |
Technical Formats
| Format Name | Example | Description |
|---|---|---|
ip address | 192.168.1.1 | IPv4 addresses |
ipv4 address | 192.168.1.1 | IPv4 addresses (explicit) |
ipv6 address | 2001:0db8:85a3::8a2e | IPv6 addresses |
uuid | 550e8400-e29b-41d4-a716 | Universally unique identifiers |
Time Formats
| Format Name | Example | Description |
|---|---|---|
time 12h | 02:30:45 PM | 12-hour format with seconds |
time 24h | 14:30:45 | 24-hour format with seconds |
time 12h nosec | 02:30 PM | 12-hour format without seconds |
time 24h nosec | 14:30 | 24-hour format without seconds |
Configuration Options
Text Validation Options
| Property | Description |
|---|---|
valid format | Predefined format from supported formats list |
invalid format | Predefined format that should be considered invalid |
valid regex | Custom regex pattern for valid values |
invalid regex | Custom regex pattern for invalid values |
valid values | List of acceptable values |
invalid values | List of unacceptable values |
valid length | Exact valid string length |
valid min length | Minimum valid string length |
valid max length | Maximum valid string length |
Numeric Validation Options
| Property | Description |
|---|---|
valid values | List of acceptable numeric values |
invalid values | List of unacceptable numeric values |
valid min | Minimum acceptable numeric value |
valid max | Maximum acceptable numeric value |
Updated about 1 month ago