Skip to main content

Max Check

The max check validates the maximum value of a numeric column. This check helps ensure data quality by validating that the largest values in your dataset meet expectations and don't exceed reasonable bounds.

Configuration

ParameterRequiredDescription
nameYesUnique name for the check
datasetYesTable name or SQL query
typeYesMust be max
measureYesColumn name to find maximum
conditionYesComparison operator
thresholdYesValue to compare against
dimensionsNoGroup by columns
filterNoWHERE clause conditions

Examples

Basic Max Check

- name: maximum_order_amount
dataset: orders
type: max
measure: order_amount
condition: le
threshold: 10000

This check ensures no order exceeds $10,000.

Max with Filter

- name: max_discount_percentage
dataset: promotions
type: max
measure: discount_percentage
condition: le
threshold: 0.5
filter: status = 'active'

This check validates that active promotions don't exceed 50% discount.

Max by Dimensions

- name: max_salary_by_level
dataset: employees
type: max
measure: salary
dimensions: [job_level]
condition: le
threshold: 200000

This check ensures maximum salary per job level doesn't exceed $200K.

Max with Time Dimension

- name: daily_max_temperature
dataset: sensor_readings
type: max
measure: temperature
condition: le
threshold: 45
time_dimension:
name: reading_date
granularity: day

This check validates daily maximum temperature stays below 45°.

Use Cases

  • Data Quality: Ensure no unreasonably high values
  • Business Rules: Validate maximum acceptable limits
  • Range Validation: Check upper bounds of data ranges
  • Outlier Detection: Identify potentially problematic maximum values
  • Security: Prevent values that might indicate data issues or attacks
  • Compliance: Ensure maximum limits are not exceeded

Generated SQL

The max check generates SQL like:

SELECT MAX(order_amount)
FROM orders
WHERE status = 'completed'

With dimensions:

SELECT job_level, MAX(salary)
FROM employees
GROUP BY job_level

Example Results

✓ maximum_order_amount: 8500.00 (≤ 10000)
✗ max_discount_percentage: 0.75 (≤ 0.5)
✓ max_salary_by_level_senior: 180000 (≤ 200000)
✗ max_salary_by_level_executive: 250000 (≤ 200000)

Data Types

The measure column should be numeric or comparable. Common types include:

  • INTEGER
  • DECIMAL / NUMERIC
  • FLOAT / DOUBLE
  • DATE / TIMESTAMP
  • TIME

NULL Handling

  • MAX() ignores NULL values automatically
  • If all values are NULL, MAX() returns NULL
  • The check will fail if MAX() returns NULL (unless specifically handled)

Common Patterns

Fraud Detection

- name: suspicious_transaction_amount
dataset: transactions
type: max
measure: amount
condition: le
threshold: 100000
dimensions: [user_id]

Performance Monitoring

- name: max_response_time
dataset: api_logs
type: max
measure: response_time_ms
condition: le
threshold: 30000 # 30 seconds max

Data Validation

- name: age_upper_bound
dataset: users
type: max
measure: age
condition: le
threshold: 120 # Reasonable max age

System Limits

- name: max_file_size
dataset: uploads
type: max
measure: file_size_bytes
condition: le
threshold: 104857600 # 100MB max

Performance Tips

  1. Indexes: Ensure the measure column has appropriate indexes
  2. Partitioning: Use partitioning for time-based queries
  3. Filters: Apply filters to reduce data scanning
  4. Statistics: Keep table statistics updated for query optimization

Alerting Scenarios

Max checks are particularly useful for alerting on:

  • Anomalous spikes in transaction amounts
  • Performance degradation (response times, queue lengths)
  • Resource exhaustion (disk usage, memory consumption)
  • Data quality issues (unrealistic values)
  • Min - Minimum value validation
  • Sum - Sum aggregation validation
  • Numeric - Custom numeric expressions