Timeliness Data Quality Rules
Timeliness rules validate the freshness and temporal aspects of data. These checks ensure data is current, updated within expected timeframes, and meets business requirements for data latency and availability.
POST /integration/v1/native_data_quality/monitor
Time Column Metrics
freshness: Measures the age of the youngest (most recently timestamped) row in a table, based on a timestamp column. The check validates freshness by comparing that age against the threshold.
Supported Operators: < (only)
Threshold Formats:
#d - Number of days (e.g., "3d" = 3 days)
#h - Number of hours (e.g., "1h" = 1 hour)
#m - Number of minutes (e.g., "30m" = 30 minutes)
#d#h - Days and hours (e.g., "1d6h" = 1 day and 6 hours)
#h#m - Hours and minutes (e.g., "1h30m" = 1 hour and 30 minutes)
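To make the threshold formats concrete, here is a minimal Python sketch that converts a threshold string into a `timedelta`. The function name `parse_threshold` is hypothetical and not part of the API. The sketch accepts only the five shapes listed above; whether the service accepts other combinations (such as days with minutes) is not stated here, so the sketch rejects them.

```python
import re
from datetime import timedelta

# Optional day, hour, and minute parts, in that order (e.g. "1d6h", "1h30m").
_THRESHOLD_RE = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?$")


def parse_threshold(text: str) -> timedelta:
    """Convert a threshold string such as "1d6h" into a timedelta.

    Hypothetical helper: only the documented shapes (#d, #h, #m, #d#h,
    #h#m) are accepted; anything else raises ValueError.
    """
    match = _THRESHOLD_RE.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized threshold: {text!r}")
    days, hours, minutes = match.groups()
    if days and minutes:
        # "#d#m" and "#d#h#m" are not among the documented formats.
        raise ValueError(f"undocumented threshold shape: {text!r}")
    return timedelta(
        days=int(days or 0),
        hours=int(hours or 0),
        minutes=int(minutes or 0),
    )


print(parse_threshold("1d6h"))   # 1 day, 6:00:00
print(parse_threshold("30m"))    # 0:30:00
```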
Basic Freshness Check:
{
  "metric_name": "freshness",
  "operator": "<",
  "threshold": "1d",
  "category": "timeliness",
  "check_description": "Data should be updated within last 24 hours"
}
Hourly Data Freshness:
{
  "metric_name": "freshness",
  "operator": "<",
  "threshold": "2h",
  "category": "timeliness",
  "check_description": "Real-time data should be no more than 2 hours old"
}
Business Hours Freshness:
{
  "metric_name": "freshness",
  "operator": "<",
  "threshold": "1d6h",
  "category": "timeliness",
  "check_description": "Business data should be updated within 1 day 6 hours"
}
High-Frequency Data:
{
  "metric_name": "freshness",
  "operator": "<",
  "threshold": "15m",
  "category": "timeliness",
  "check_description": "Streaming data should be no more than 15 minutes old"
}
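The freshness checks above all compare the age of the most recent row against a threshold. The following Python sketch illustrates that comparison locally. It is only an illustration of the idea: the function name `is_fresh` and the sample timestamps are assumptions, and how the service computes freshness internally is not documented on this page.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(timestamps, threshold, now=None):
    """Return True when the newest timestamp is younger than `threshold`.

    Hypothetical helper. `timestamps` must be non-empty; max() raises
    ValueError on an empty sequence.
    """
    now = now or datetime.now(timezone.utc)
    age = now - max(timestamps)   # age of the youngest (most recent) row
    return age < threshold        # mirrors the "<" operator


# Sample data (made up for illustration): newest row is 10 minutes old.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
rows = [
    datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 11, 50, tzinfo=timezone.utc),
]

print(is_fresh(rows, timedelta(minutes=15), now))  # True
print(is_fresh(rows, timedelta(minutes=5), now))   # False
```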
Custom Metrics
sql (Custom SQL Query): Define custom timeliness metrics using SQL queries for comprehensive temporal analysis.
Supported Operators: =, <, >, <=, >=, !=, <>, between, not between
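Because the operator lists on this page differ by metric (`freshness` accepts only `<`, while `sql` accepts the full set), it can help to check a rule before submitting it. The sketch below is a hypothetical client-side check; `SUPPORTED_OPERATORS` and `validate_operator` are illustrative names, not part of the API, and the table covers only the two metrics described on this page.

```python
# Operator lists copied from this page; other metrics are not covered.
SUPPORTED_OPERATORS = {
    "freshness": {"<"},
    "sql": {"=", "<", ">", "<=", ">=", "!=", "<>", "between", "not between"},
}


def validate_operator(rule: dict) -> None:
    """Raise ValueError if the rule's operator is not listed for its metric."""
    metric = rule["metric_name"]
    allowed = SUPPORTED_OPERATORS.get(metric)
    if allowed is None:
        raise ValueError(f"no operator list known for metric {metric!r}")
    if rule["operator"] not in allowed:
        raise ValueError(
            f"operator {rule['operator']!r} is not supported for {metric!r}"
        )


validate_operator({"metric_name": "sql", "operator": "between"})   # passes
validate_operator({"metric_name": "freshness", "operator": "<"})   # passes
```

Passing `{"metric_name": "freshness", "operator": ">="}` to this helper raises `ValueError`.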
Event Processing Timeliness:
{
  "metric_name": "sql",
  "operator": "<=",
  "threshold": "5",
  "configuration_keys": {
    "custom_metric": "event_processing_delays",
    "query value": "SELECT COUNT(*) FROM events WHERE processed_at - event_timestamp > INTERVAL '1 hour'"
  },
  "category": "timeliness",
  "check_description": "No more than 5 events should have processing delays over 1 hour"
}
Customer Response Time:
{
  "metric_name": "sql",
  "operator": ">=",
  "threshold": "90",
  "configuration_keys": {
    "custom_metric": "response_time_sla",
    "query value": "SELECT COUNT(*) * 100.0 / (SELECT COUNT(*) FROM support_tickets WHERE created_date >= CURRENT_DATE - INTERVAL '30 days') FROM support_tickets WHERE first_response_at - created_at <= INTERVAL '24 hours' AND created_date >= CURRENT_DATE - INTERVAL '30 days'"
  },
  "category": "timeliness",
  "check_description": "90% of support tickets should receive first response within 24 hours"
}
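The query above divides the number of tickets answered within 24 hours by the total number of tickets in the 30-day window. The Python sketch below reproduces that arithmetic on made-up sample data so the resulting value can be compared with the threshold of 90. The function name `response_sla_percent` and the ticket tuples are assumptions for illustration. Tickets without a first response count toward the total but not toward the on-time count, which is how a NULL `first_response_at` would behave in the SQL comparison.

```python
from datetime import datetime, timedelta


def response_sla_percent(tickets, window=timedelta(hours=24)):
    """Percentage of tickets whose first response arrived within `window`.

    `tickets` is a list of (created_at, first_response_at) pairs;
    first_response_at may be None. Returns None for an empty list,
    where the SQL version would divide by zero.
    """
    if not tickets:
        return None
    on_time = sum(
        1
        for created, responded in tickets
        if responded is not None and responded - created <= window
    )
    return on_time * 100.0 / len(tickets)


# Made-up sample: three tickets answered in time, one never answered.
t0 = datetime(2024, 1, 1, 9, 0)
tickets = [
    (t0, t0 + timedelta(hours=2)),
    (t0, t0 + timedelta(hours=23)),
    (t0, t0 + timedelta(hours=24)),
    (t0, None),
]

percent = response_sla_percent(tickets)
print(percent)         # 75.0
print(percent >= 90)   # False, so this sample would fail the check
```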
Batch Job Completion:
{
  "metric_name": "sql",
  "operator": "=",
  "threshold": "0",
  "configuration_keys": {
    "custom_metric": "batch_job_overruns",
    "query value": "SELECT COUNT(*) FROM batch_jobs WHERE job_date = CURRENT_DATE AND (completed_at IS NULL OR completed_at - started_at > expected_duration)"
  },
  "category": "timeliness",
  "check_description": "All daily batch jobs should complete within expected duration"
}