where
Definition
Filter the resource being tested (model, source, seed, or snapshot).
The where
condition is templated into the test query by replacing the resource reference with a subqueryA subquery is a query within another query. Subqueries are often used when you need to process data in multiple steps.. For instance, a not_null
test may look like:
select *
from my_model
where my_column is null
If the where
config is set to where date_column = current_date
, then the test query will be updated to:
select *
from (select * from my_model where date_column = current_date) dbt_subquery
where my_column is null
Examples
- Specific test
- One-off test
- Generic test block
- Project level
Configure a specific instance of a generic (schema) test:
version: 2
models:
- name: large_table
columns:
- name: my_column
tests:
- accepted_values:
values: ["a", "b", "c"]
config:
where: "date_column = current_date"
- name: other_column
tests:
- not_null:
where: "date_column < current_date"
Configure a one-off (data) test:
{{ config(where = "date_column = current_date") }}
select ...
Set the default for all instances of a generic (schema) test, by setting the config inside its test block (definition):
{% test <testname>(model, column_name) %}
{{ config(where = "date_column = current_date") }}
select ...
{% endtest %}
Set the default for all tests in a package or project:
tests:
+where: "date_column = current_date"
<package_name>:
+where: >
date_column = current_date
and another_column is not null
Custom logic
The rendering context for the where
config is the same as for all configurations defined in .yml
files. You have access to {{ var() }}
and {{ env_var() }}
, but you do not have access to custom macros for setting this config. If you do want to use custom macros to template out the where
filter for certain tests, there is a workaround.
As of v0.21, dbt defines a get_where_subquery
macro.
dbt replaces {{ model }}
in generic test definitions with {{ get_where_subquery(relation) }}
, where relation
is a ref()
or source()
for the resource being tested. The default implementation of this macro returns:
{{ relation }}
when thewhere
config is not defined (ref()
orsource()
)(select * from {{ relation }} where {{ where }}) dbt_subquery
when thewhere
config is defined
You can override this behavior by:
- Defining a custom
get_where_subquery
in your root project - Defining a custom
<adapter>__get_where_subquery
dispatch candidate in your package or adapter plugin
Within this macro definition, you can reference whatever custom macros you want, based on static inputs from the configuration. At simplest, this enables you to DRY up code that you'd otherwise need to repeat across many different .yml
files. Because the get_where_subquery
macro is resolved at runtime, your custom macros can also include fetching the results of introspective database queries.
Example: Filter your test to the past three days of data, using dbt's cross-platform dateadd()
utility macro.
version: 2
models:
- name: my_model
columns:
- name: id
tests:
- unique:
config:
where: "date_column > __three_days_ago__" # placeholder string for static config
{% macro get_where_subquery(relation) -%}
{% set where = config.get('where') %}
{% if where %}
{% if "__three_days_ago__" in where %}
{# replace placeholder string with result of custom macro #}
{% set three_days_ago = dbt.dateadd('day', -3, current_timestamp()) %}
{% set where = where | replace("__three_days_ago__", three_days_ago) %}
{% endif %}
{%- set filtered -%}
(select * from {{ relation }} where {{ where }}) dbt_subquery
{%- endset -%}
{% do return(filtered) %}
{%- else -%}
{% do return(relation) %}
{%- endif -%}
{%- endmacro %}