Unit testing in dbt – from an experiment to an open-source framework

In the following article, I want to share our journey to introduce the concept of unit testing in the framework dbt. There were a couple of existing efforts in the community but none as we envisioned – writing unit tests in SQL with a fast-feedback loop so that we can even use it for tdd.

Fellow colleague Pedro Sousa and I have published a couple of articles about our journey – we described our first experiment and shared our second and polished approach. After the blogs, a couple of teams at Equal Experts started to use our strategy and give us feedback.

As mentioned in one of the articles, we always thought dbt should have support for unit tests. We asked dbt’s team about the roadmap to support unit tests and we found it unlikely that it was going to happen. Also, they think it makes more sense as an external framework. Personally, it doesn’t make much sense to me, one can argue that if we look at programming languages, we are used to having the testing capabilities as external libraries, but dbt is not a programming language and it already supports other types of tests.

After a couple of conversations with the other teams, we were encouraged to use our work to create the dbt-unit-testing framework under the Equal Experts Github.

We released the framework three months ago, and since then we’ve started to have traction on our Github. Currently, we have 47 stars and 45 closed pull requests, and we have approximately 120 unique visitors per 14 days. The best outcome is having people collaborating with us, giving feedback, creating issues and developing pull requests. We already have four community contributors and we are proud to say that we appreciate all the work and the effort – @halvorlu, @darist, @charleslr and @gnilrets.

Community collaboration and feedback are crucial to improve the framework and prioritise what should be done. We have a couple of ideas in the backlog, such as adding support for more data sources, but we don’t yet have a clear roadmap. We prefer to listen to the feedback and work based on that. Continuous improvement through continuous user feedback perfectly describes our mindset.

This post shares our journey, mindset and appreciation for the open-source community engagement in such small projects.

You can check the framework here: https://github.com/EqualExperts/dbt-unit-testing

Contributing to tech communities is very much part of our mission at Equal Experts.

 

In a previous blog post, we shared our journey to create a framework to support unit testing in dbt. We’ve started to use the approach in different projects, so we needed a strategy to share, evolve and maintain the code composed of a set of dbt macros. The appropriate way to do it is by creating a dbt package. In this article, we want to share the process of creating a dbt package, which is reasonably straight-forward after knowing a couple of details.

To create a dbt package, you need to work on three main tasks:

  • Package the code you want to share
  • Create tests
  • Create a CI pipeline

Package the code

To package the code, you need to create a root folder for the package, containing a folder for macros and a dbt_project.yml to declare the package configurations. The dbt_project.yml could have a configuration similar to this:

name: 'sample-package'

version: '0.1.0'

config-version: 2



require-dbt-version: [">=0.20.2"]



target-path: "target"

macro-paths: ["macros"]

Name for the package, version, require-dbt-version and the path for the macros. If your package also contains models, you will need to have the model’s folder configured.

With this setup, the package is installable via dbt deps.

Tests

Currently, dbt doesn’t provide a tool to unit test macros, so we need to rely on integration tests. Integration tests in dbt mean the package code will run within a test dbt project. You need to think about the macros’ usage and how to test them by creating models/tests to support the scenarios that you want to cover. The integration tests project is usually at the root of the package code, and it includes a packages.yml referring to the package code as a dbt package:

packages:

 - local: ../

You can run the tests in multiple different environments (BigQuery, Snowflake, etc.) to ensure the interoperability of your package. Also, you could and should have the integration tests running in your CI pipeline.

CI pipeline

The goal of the CI pipeline is to introduce monitoring and automation to improve your package development process, particularly running the integration tests after code changes. Also, as mentioned before, the CI pipeline should run the tests against the different environments you want to support.

We’ve chosen GitHub actions to create a CI pipeline for our package because it checked all our requirements, and it’s free for public repositories. In Github Actions, the concept of workflow represents a pipeline and is configurable by a YAML file. Each workflow has one or more jobs that run in separated runners. Each job has one or more steps to define the pipeline, and there are a set of predefined steps available to use that are known by actions.

We wanted to have one pipeline to run the integration tests, and inside of the pipeline, different jobs per environment (BigQuery, Snowflake, etc.). Let’s assume we want to create just one workflow to run the integration tests with Snowflake to keep the following example simple.

Snowflake authentication requires credentials (account, user, password, role and warehouse) – the way to store credentials on Github is by using Action Secrets

Setting up the workflow is relatively simple with the credentials in place – we need to set up a runner/container with the repository code. We also need to install dbt inside the container to run and test dbt against Snowflake using the credentials. That being said, let’s take a look at the workflow:

name: Dbt Unit Testing


on:

 push:

   branches: [master]


jobs:

 integration-tests-snowflake:

   runs-on: ubuntu-latest

   environment: test

   env:

     SNOWFLAKE_TEST_ACCOUNT: ${{ secrets.SNOWFLAKE_TEST_ACCOUNT }}

     SNOWFLAKE_TEST_USER: ${{ secrets.SNOWFLAKE_TEST_USER }}

     SNOWFLAKE_TEST_PASSWORD: ${{ secrets.SNOWFLAKE_TEST_PASSWORD }}

     SNOWFLAKE_TEST_ROLE: ${{ secrets.SNOWFLAKE_TEST_ROLE }}

     SNOWFLAKE_TEST_WAREHOUSE: ${{ secrets.SNOWFLAKE_TEST_WAREHOUSE }}


   steps:

     - uses: "actions/checkout@v2"




     - name: test snowflake

       run: ./run_test.sh snowflake

 

This sample workflow is triggered when a push to master happens (lines 3-7) and contains one job that runs on the ubuntu-latest runner. The job starts with an environment configuration reading secrets from the action secrets into environment variables inside the container. Then there are two steps – the first step uses a predefined action that checks out the repository into the container, and the second step (test snowflake), runs a bash command run_test. The run_test script hides the details of a shell script that installs dbt on the runner using pip, runs ‘dbt deps’ to install the package code, tests and runs dbt itself. 

The workflow covered what was needed to keep the example simple. However, suppose you want to accept external contributions as we do. In that case, the workflow needs a slightly complex triggering strategy on pull requests that we will share in a follow-up article.

Final notes

Our goal was to give you a high-level overview of the work involved in creating a dbt package from a set of macros or models. We hope you understand the complexities of the package creation process after reading the article. If you want to explore it further, you can look at our unit testing package https://github.com/EqualExperts/dbt-unit-testing where we are testing our dbt package against multiple environments.

 

Inspired by our previous post on unit testing dbt and discussion in the dbt forums, I looked into how to get dbt to help me test and deploy UDFs. Although dbt doesn’t directly support UDFs, it turns out it’s actually fairly straightforward to define, test and reference them in your models. I’ll be using BigQuery in this post, but I think the techniques will apply to other warehouses that support UDFs like Snowflake, Redshift and Spark, too.

What’s a UDF and what does it have to do with dbt?

A UDF is a User-Defined Function. Like any other type of function, they’re really helpful to capture and name chunks of logic that you can apply in queries, and I talked about them before, starting from Tip 4 in our Data Warehouse Tips post. dbt takes care of deploying the tables and views that you might use your UDFs in, so robustly referencing a function from your models is important. When I say ‘robustly’, I mean ensuring that the function exists and that you’re calling the function from the appropriate location – for example, avoiding hard-coding your production functions everywhere. There’s no native support for referencing a UDF like there is with `source` or `ref`, nor any native way to deploy a function in a project. This post explains how you can solve those problems.

The example

I have a BigQuery dataset that I use for my personal development work. This is referenced in my own dbt profile. When I run dbt commands on my machine, they take effect here.

There are test and production datasets that my CI system has profiles for. I cannot run ad-hoc dbt commands in these datasets. When I push my changes, the CI system will run any dbt commands necessary to deploy my updates and run tests. I’ll not be talking about that in this post, but it’s important that I’m referencing the UDF that is deployed in the same environment that I’m working in – otherwise I can’t iterate safely, relying on my tests.

The example UDF I’ll use is a simple predicate function, located in `my_schema`, from string to bool that returns true if and only if the string value can be interpreted as a positive integer:

CREATE OR REPLACE FUNCTION my_schema.is_positive_int(a_string STRING)

RETURNS BOOLEAN

AS (

  REGEXP_CONTAINS(a_string, r'^[0-9]+$')

);

Defining a UDF in dbt

We need to write a dbt “macro” – a Jinja template – that parameterises where the UDF should go. In `macros/is_positive_int_udf.sql`:

{% macro is_positive_int_udf() %}

CREATE OR REPLACE FUNCTION {{ target.schema }}.is_positive_int(a_string STRING)

RETURNS BOOLEAN

AS (

  REGEXP_CONTAINS(a_string, r'^[0-9]+$')

);

{% endmacro %}

Note the macro name matches the file and that `target.schema` is templated out.

To have dbt actually deploy this for us, we add a hook in our `dbt_project.yml`:

on-run-start:

- '{{ is_positive_int_udf() }}'

Now, when we `dbt run`, the UDF will be deployed.

11:11:52  Running 1 on-run-start hooks

11:11:52  1 of 1 START hook: my_project.on-run-start.0.................... [RUN]

11:11:52  1 of 1 OK hook: my_project.on-run-start.0....................... [OK in 0.49s]

Kind of yucky in the console output with just list indices to identify what’s what –  but not the end of the world.

Testing a UDF in dbt

Now our UDF is schema-aware and is being deployed by dbt, we can write tests for it with minimal ceremony. In `tests/is_positive_int.sql`:

WITH examples AS (

  SELECT 'foo' the_string, FALSE expected

  UNION ALL SELECT '1111', TRUE

  UNION ALL SELECT '5 not this', FALSE

  UNION ALL SELECT 'still not this 5', FALSE

  UNION ALL SELECT NULL, NULL

)


, test AS (

  SELECT

 *,

  {{ target.schema }}.is_positive_int(the_string) actual

  FROM examples

)



SELECT * FROM test WHERE TO_JSON_STRING(actual) != TO_JSON_STRING(expected)

Running `dbt test` runs the pre-hooks and then the tests, perfect.

...

11:25:22  Running 1 on-run-start hooks

11:25:22  1 of 1 START hook: my_project.on-run-start.0.................... [RUN]

11:25:23  1 of 1 OK hook: my_project.on-run-start.0....................... [OK in 0.58s]

...

11:25:24  1 of 1 START test is_positive_int............................... [RUN]

11:25:25  1 of 1 PASS is_positive_int..................................... [PASS in 1.02s]



11:25:26  Finished running 1 tests, 1 hooks in 1.46s.

Using the UDF in a Model

We’ve actually already seen how to do this, in the test.

In `models/my_model.sql`:

SELECT {{ target.schema }}.is_positive_int(maybe_positive_int_column)

FROM {{ ref(...) }}

Summary

There you have it. In summary, to incorporate UDFs neatly into your dbt models:

  • Write a macro for your UDF, templating the target schema
  • Add a call to macro to `on-run-start` in your `dbt_project.yaml`
  • Write some tests, of course
  • Use the UDF in your models, templating the target schema

This blog describes the exploration that Cláudio Diniz and I have been doing around the dbt tool, trying to find a way to write unit tests using TDD. You can check this post to understand how it all started.

This is not a tutorial on dbt, and it’s not a detailed technical document on our framework. You can check our demo dbt project here if you want to see the details.

Note: What we are calling a framework is a set of Jinja macros. Throughout this document, we will call it a framework for simplicity.

Why do we need unit tests in dbt?

One of the software engineering practices that we follow is TDD. We strongly believe that it helps us create better code and achieve a good level of confidence in its correctness. So, why not use this in our dbt projects?

Dbt is code

A dbt project, like a “regular” software project, consists of several files of SQL code. And we want to be sure that the logic of this code is correct before it jumps to production.

We think that using TDD can also help us to write better SQL code. Using TDD, we write the test, make it pass with the simplest SQL code we can think of, and then refactor it, knowing we have a test to assure it still works. It would be awesome if we could do this on our dbt projects!

But wait, dbt already has tests

That’s right, but dbt tests were mainly designed for data tests. They are used to check the correctness of the data produced by the models. We want to write unit tests on our SQL code. We want to be sure that the logic of the code is correct.

A word on unit tests

The line between unit and integration tests is sometimes a bit blurred. This is also true with these tests in dbt.

We can think of a dbt model as a function, where the inputs are other models, and the output is the result of its SQL. A unit test in dbt would be a test where we provide fake inputs to a model, and then we check the results against some expectations.

However, a model in dbt can belong to a long chain of models (see image below), each transforming the data in its own rules. We could test a model by providing fake inputs to the first models in that chain and asserting the results on the final model. We would be checking all the intermediate transformations along the way. This, on the other end, could be called an integration test.

A complex dbt model hierarchy

These integration tests are harder to write because we have to think of how the data is transformed throughout all those models until it reaches the model we want to test. However, they provide an extra level of confidence. As usual, we need to keep a good balance between these two types of tests.

As we will see, using this definition, our framework will allow us to create both unit and integration tests.

What do you want to achieve?

We came up with a list of requirements that we would like to fulfill when writing tests in dbt:

  • Use fake inputs on any model
  • Focus the test on what’s important
  • Provide fast and valuable feedback
  • Write more than one test on a dbt test file

We want to write dbt tests using TDD and receive fast feedback on the results. Running one or more dbt models each time we change them, as we were doing on the previous approach, was not the best way to do it, and we wanted to remove this step.

The goal is to write the test, write the model, and then run the test (with “dbt test”).

Use fake inputs on any model.

In a typical dbt model hierarchy, a model is dependent on other models.

We can test our models by providing fake inputs to its “parent” models. However, sometimes we want to go a bit higher in the hierarchy (up to the sources, if we wish to) to have a more integrated test.

Focus the test on what’s important

We want to test our models by providing fake input only to the relevant models for the scenario being tested. It should be possible to keep other models with no data as long as they don’t interfere with the test (for instance, when using a LEFT JOIN on some model).

It is also essential that we shouldn’t have to specify all the columns when we provide fake input data. Only the columns that are important for the test should be set.

Provide fast and valuable feedback

We want the test feedback to be fast and helpful. We need to understand which test is failing and why. It should present the name of the model that failed along with its description.

Write more than one test on a dbt test file

A dbt test file usually contains just one test. It would be nice to have more than one test on the same file if they are somehow related.

How did you do it?

In a word, with some pain (two words, I know…). Many experiments and iterations were developed and tested until we reached the state where we are now. Fortunately, the powerful Jinja macros and the dbt engine helped us achieve what we were looking for.  

Before we run into the details of how it is done, let’s have a look at how a test looks like:

In a nutshell:

First, we define the model to be tested with the “test” macro.

Then we define the fake inputs using the “mock_ref” or “mock_source” macros.

Finally, we describe our expectations for the model with the “expect” macro.

That’s it! 

If we now run this test with the “dbt test” command (assuming we have a “covid19_cases_per_day” model), we will see this report:

 

Now imagine we made a mistake and typed 24 instead of 25 on the expectations. We rerun the test, and now we see the test failing:

The test output displays the name of the model being tested and the description of the test. This is helpful when we have several tests running with the “dbt test” command.

How does it work?

Our strategy is supported by these four items:

  1. The dbt “render” function.
  2. The dbt “graph.nodes” internal structure.
  3. SQL Common Table Expressions (CTE).
  4. The ability to override the dbt “ref” macros functionality.

Let’s check one by one:

1. The dbt “render” function

This function returns the compiled SQL for a model. It allows us to fetch the SQL of the model being tested and the SQL for all the other models.

2. The dbt “graph.nodes” internal structure

This structure keeps track of a lot of information related to each of the dbt models in runtime. It contains the raw SQL for the model (the SQL before compilation, with the ref and other dbt macros) and the list of models from which this one depends (it contains a lot of different things, but they are not essential for us right now).

3. SQL Common Table Expressions (CTE)

We use CTEs to build a fully compiled SQL of the model being tested. The dependencies of the model are transverse recursively to generate a group of nested CTEs that builds up the entire model.

4. The ability to override the dbt ref macro functionality

We can override the ref macro on dbt. This allows us to change the default behavior of emitting a fully qualified table name, like “dbt_test_demo.models.countries”, to a simplified one that will match the name of the CTEs.

Let’s see an example. Imagine we have the following (oversimplified and dumb) 3 models:

If we are testing “model3”, our framework builds the following SQL:

Notice how the SQL for “model3” now depends only on the CTEs defined before the final SELECT. This will be the SQL that dbt will execute when running the test and later compare its results with the expectations.

What about fake inputs?

Good question. We need to be able to inject our fake inputs into this SQL. How? Again, using CTEs.

Suppose we want to insert fake inputs on “model2”. And assume we don’t care about “model1” because it’s on a LEFT JOIN, and our test does not depend on “model1” columns. So we don’t need to define fake inputs for this model (focus the test on what’s important).

The SQL for “model3” defined above is wrapped with a CTE to define the fake inputs for “model2”, like this:

However, this wouldn’t work. There are two CTEs named “model2” in this SQL, and the model being tested will use the innermost one and not the one that defines the fake inputs. An extra step is needed to remove from the model SQL the CTEs used as faked inputs.

The final SQL will then be:

Now the fake inputs for “model2” will be used in the test.

(this is a simplified version of the actual SQL used for testing. For instance, a few extra steps are necessary to prevent the need to specify all the columns in the fake inputs. We simplified it here for explanation purposes) 

More than one test on the same file

It’s possible to specify more than one test on a single dbt test file. We just need to join all the tests with a UNION ALL statement:

The test output will show which tests have failed, along with their name and description. 

What about the cons?

Nothing comes for free. Using this framework has its downsides.

The first one would be a slightly more complicated syntax to describe our tests. We did our best to keep it as simple and straightforward as possible, but the level of boilerplate is still a bit high.

We are relying on some dbt internal details. The dbt render function is not documented. If it changes in the future, it could break our framework.

Conclusion

We are always trying to improve, and there is still some space for improvements here. Currently, our macros are running against a BigQuery database. We know we need to change a few things to make them compatible with Postgres, for instance.

All in all, we can create our dbt tests comfortably, using TDD with fast feedback, and we are gaining confidence in the correctness of our models.

If you want to learn more about the technical details of how this works, you can check our dbt test demo project on Github

We hope we are helping other teams to write unit tests for dbt projects.