Test-Driven Database Development

January 21st, 2025

How can we safely and efficiently develop databases by applying principles such as test-driven development and continuous integration?

Motivation

In my previous post, Business Logic Versus Data Logic, I talked about why we might want to enrich our database with triggers, stored procedures and functions, and when that is (or isn’t) a good idea. I alluded to not letting concerns about risk or inefficiency deter us too much because there are tools and techniques at our disposal. This post explains those tools and techniques.

If you’re unsure why we’d want business logic in our database in the first place, read this first:

Business Logic versus Data Logic

You will need…

The broad principles I’m going to walk through apply to any of the major SQL databases. I’ve chosen to use PostgreSQL, but similar principles apply to SQL Server, Oracle, MySQL, DB2 and more.

You will also need a suitable version control system — I’m using Git — and CI/CD tooling, for which I’m using Azure DevOps, formerly known as Visual Studio Online, formerly known as Visual Studio Team Services, formerly known as Team Foundation Service.

Finally, you’ll need one (or more) programming environments capable of running database migrations and unit tests. I’m using FluentMigrator and MSTest in .NET Core, but could just as easily have used PHP / Laravel. Many other platforms have similar capabilities.

The Goal

Suppose we are writing a small accounting system with business rules, such as calculating tax on sales. Initially our customers may be UK-based (with flat rates of 0%, 5% and 20% based on the nature of the goods), but as we push into the US it becomes increasingly complex, with city, county and state taxes levied at different rates or fixed amounts depending on the goods, their price, and the seller’s and purchaser’s locations.

We’ve decided to put the tax logic in the database — possibly because we read my previous post asserting it’s sensible, safe and not (too) inefficient for rules which change infrequently.

This article seeks to give us the tools to make the process safe(r) and (more) efficient.

We’re going to start off by looking at versioning our database changes. We’ll then take a look at testing those changes. Next, we’ll look at using CI/CD to deploy DB updates and add some quality gates and feedback to the process. Finally, we’ll touch on making this a team sport.

There are great tools out there which will do all this for us, but you don’t have to spend a fortune: we’re going to do all of this with open source or “free tier” tools and resources.

Versioning the Database

Our starting point is that we want some form of version control for the database.

One approach would be to version control the database itself — the collection of SQL describing tables, SPs, views, functions and more. I did trial that, but it’s not very effective, not least because artifacts can’t be created in an arbitrary order (a foreign key needs its target table to exist first, for example), so we need some form of sequencing.

So, we’re going to represent our database in a series of migrations — steps which modify our database from a blank shell to finished (for now) product. Our first step might be to create the sale table, our second to create the tax rate table, our third to create the calculate tax amount procedure etc.

I’m using .NET Core with FluentMigrator and the Npgsql connector for the purposes of this example. This stack allows me to write C# code which describes many database artifacts, or embed SQL as needed. Many other platforms have similar capabilities — for a web only product I might well use Laravel.

Whilst this isn’t a full tutorial on using these tools, a minimal console app to run migrations looks like this:

using Npgsql;

if (args.Length == 0)
{
    Console.WriteLine("Usage: dotnet run -- <connection-string>");
    return;
}

var connectionString = args[0];

// Fail fast if the database is unreachable
using (var connection = new NpgsqlConnection(connectionString))
{
    connection.Open();
}

// MigrationService wraps FluentMigrator's migration runner
var migrationService = new MigrationService(connectionString);
migrationService.MigrateUp();

And an example migration:

using FluentMigrator;

[Migration(1, description: "Add Country Table")]
public class AddCountryTable : AutoReversingMigration
{
    public override void Up()
    {
        // AutoReversingMigration generates the corresponding Down() for us
        Create.Table("country")
            .WithColumn("code").AsString(2).PrimaryKey()
            .WithColumn("name").AsString(255)
            .WithColumn("full_name").AsString(255)
            .WithColumn("iso3").AsString(3)
            .WithColumn("numeric_code").AsString(3)
            .WithColumn("continent_code").AsString(2)
                .ForeignKey("continent", "code");
    }
}

Running our migrations to update the database is as simple as calling

dotnet run --project MigrationRunner -- "<connection-string>"

Of course, because our migrations are (or are embedded in) application code, they can now be stored in the version control system of your choice. For me, that’s Git.

Test-Driving Development

The next technique we need to apply is test-driven development. I’m assuming you already understand TDD and the red-green-refactor cycle, so I won’t go into detail on how and why we use it beyond saying: write a failing test, write code to make the test pass, refactor the code to improve its design, and repeat until done. My focus here is on applying this to the database.

There are dedicated libraries for unit testing database artifacts — tSQLt for SQL Server, for example — but not all database platforms have such libraries, and I wasn’t able to find one for PostgreSQL that suited this workflow.

As we’ve written our migrations in C#, it’s natural to write our unit tests in C# too. Our unit test suite will need to:

  • Create a new database
  • Run the migrations against the db
  • Run tests against the db
  • Drop the db again

In our .NET example, we can use Dapper — a lightweight micro-ORM — within our tests to query the database, and MSTest or NUnit to assert against the output.
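A minimal sketch of a single test, assuming a hypothetical TestDatabase helper that creates a scratch database and runs the migrations against it, plus the calculate tax amount function mentioned earlier:

using Dapper;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Npgsql;

[TestClass]
public class TaxCalculationTests
{
    // Hypothetical helper: creates a scratch DB, runs the migrations
    // against it and returns the connection string
    private static readonly string ConnectionString = TestDatabase.CreateAndMigrate();

    // Drop the scratch DB once every test in the class has run
    [ClassCleanup]
    public static void DropDatabase() => TestDatabase.Drop(ConnectionString);

    [TestMethod]
    public void CalculateTaxAmount_UkStandardRate_Is20Percent()
    {
        using var connection = new NpgsqlConnection(ConnectionString);

        // Dapper turns calling the function under test into a one-liner
        var tax = connection.ExecuteScalar<decimal>(
            "SELECT calculate_tax_amount(@net, @country, @goodsType)",
            new { net = 100.00m, country = "GB", goodsType = "standard" });

        Assert.AreEqual(20.00m, tax);
    }
}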

Running database tests is undoubtedly slower than running in-memory unit tests, but there are a few things we can do to mitigate that:

  • Create base test data in our migrations, using tags to run test-specific migrations (sketched below)
  • Modularise our tests, so that whilst actively working on new sales tax calculation logic we can focus on the 20 or 30 tax tests rather than running the full suite every time
  • Use config flags to conditionally replace the full DB build and teardown between test runs with a partial build and teardown of just the artifacts of interest
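The tagging approach is directly supported by FluentMigrator: migrations marked with [Tags] only run when the runner is configured with a matching tag. A sketch, assuming a tax_rate table created by an earlier migration:

using FluentMigrator;

// Runs only when the migration runner is configured with the "TestData"
// tag, so this seed data never reaches production
[Tags("TestData")]
[Migration(100, description: "Seed base tax rates for tests")]
public class SeedTaxRateTestData : Migration
{
    public override void Up()
    {
        Insert.IntoTable("tax_rate")
            .Row(new { country_code = "GB", goods_type = "standard", rate = 0.20m });
    }

    public override void Down()
    {
        Delete.FromTable("tax_rate")
            .Row(new { country_code = "GB", goods_type = "standard" });
    }
}

Our normal deployment runner simply omits the tag, so the seed data is never applied outside test environments.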

Whilst this will never be as quick as test-driving application logic, we can make it quick enough to be usable, and the extra time spent is justified by the criticality of the database across multiple calling applications.

Programming with Forward and Backward Compatibility In Mind

When we code in the database, we typically need to be mindful of multiple calling applications with differing release cycles, priorities and development teams. So, as with any API, it’s important to be aware of release cycles.

Each time we release a DB update, it should be internally consistent. So, if I’m changing how tax is calculated in 3 areas of the DB, I’ll package these 3 changes into a single pull request so they can be merged together.

Similarly, when I need to change a method signature in a breaking way (for example, by adding required parameters), I’ll do so by creating a new method rather than updating the old one, and deprecate the old one for deletion later. Typically my deprecation period spans 3+ release cycles of the slowest calling application, to give application developers time to move across whilst recognising they may have other high-priority issues.
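As a sketch of that pattern, suppose our calculate_tax_amount function needs a new goods_type parameter (the names here are illustrative). A migration can add a v2 function and turn the old signature into a thin wrapper:

using FluentMigrator;

[Migration(200, description: "Add calculate_tax_amount_v2; deprecate v1")]
public class AddCalculateTaxAmountV2 : Migration
{
    public override void Up()
    {
        Execute.Sql(@"
            CREATE FUNCTION calculate_tax_amount_v2(
                net numeric, country_code char(2), goods_type text)
            RETURNS numeric AS $$
                SELECT $1 * rate FROM tax_rate
                WHERE tax_rate.country_code = $2
                  AND tax_rate.goods_type = $3;
            $$ LANGUAGE sql STABLE;

            -- Old signature becomes a wrapper with a sensible default;
            -- drop it once every calling application has moved to v2
            CREATE OR REPLACE FUNCTION calculate_tax_amount(
                net numeric, country_code char(2))
            RETURNS numeric AS $$
                SELECT calculate_tax_amount_v2($1, $2, 'standard');
            $$ LANGUAGE sql STABLE;
        ");
    }

    public override void Down()
    {
        Execute.Sql("DROP FUNCTION calculate_tax_amount_v2(numeric, char(2), text);");
    }
}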

Finally, I might relax the principle of YAGNI (“you aren’t gonna need it”) here. If I have a strong idea that I’ll need 10 parameters in my stored procedure, I’ll put a bit more effort into getting them right up front — it’s a bit self-indulgent to add the 10 parameters one by one in successive breaking-change updates, given the downstream impact on app developers. So with the database, a bit more design up front is necessary.

Continuous Integration / Continuous Deployment

I’m a huge fan of Azure DevOps for CI/CD. I’ve been using their pipelines feature since long before Azure DevOps was Azure DevOps, and have a fairly rich collection of pipelines.

From a CI perspective, every pull request I or my team opens triggers a pipeline of tests and other quality checks. Depending on the project, this includes unit testing, code coverage, static analysis, style checking and more. So it is logical to fit database unit testing into this process.

My database PR quality gate pipeline:

  • Downloads and runs a PostgreSQL Docker image
  • Creates a blank DB
  • Builds the migration project and runs it, from the bash shell, against the new DB
  • Builds the database test project and runs it against the DB
  • Blocks merging the PR if any of these steps fail
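A minimal sketch of that pipeline in Azure DevOps YAML (the image tag, project names and connection-string variable are illustrative, and with Azure Repos the PR trigger itself is usually configured via branch policy rather than in the YAML):

pool:
  vmImage: 'ubuntu-latest'

steps:
  # Spin up PostgreSQL in Docker and create a blank database
  - script: |
      docker run -d --name pg -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:16
      sleep 10
      docker exec pg psql -U postgres -c "CREATE DATABASE ci_db;"
    displayName: Start PostgreSQL and create a blank DB

  # Build the migration project and run it against the new DB
  - script: dotnet run --project MigrationRunner -- "$(CiConnectionString)"
    displayName: Run migrations

  # Build and run the database tests; a failure here blocks the merge
  - script: dotnet test DatabaseTests
    displayName: Run database tests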

This quality gate ensures “bad” updates don’t make it onto user test, let alone production.

At the CD stage, I rerun the same tests in case interaction between features on test causes a regression, then — once approved — automatically deploy DB updates across the staging or production estate.

The “how” for deployment is a little more complex. For smaller projects with a few database instances, I might script updates directly from the Azure DevOps pipeline. For more complex deployments, I might use a separate DB deployment server or an AWS Lambda function to run the migrations on multiple hosts.

The guiding principle, though, is that whether I’m deploying to one database or 1,000, it’s a “two click” process (once to dry run, once to confirm and run for real).
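FluentMigrator supports the dry run through its preview mode. A sketch of how our runner might honour a hypothetical --dry-run flag, using the standard FluentMigrator in-process runner registration (connectionString is parsed from the arguments as before):

using FluentMigrator.Runner;
using FluentMigrator.Runner.Initialization;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical --dry-run flag: log the SQL without executing it
var dryRun = args.Contains("--dry-run");

var services = new ServiceCollection()
    .AddFluentMigratorCore()
    .ConfigureRunner(rb => rb
        .AddPostgres()
        .WithGlobalConnectionString(connectionString)
        .ScanIn(typeof(AddCountryTable).Assembly).For.Migrations())
    // PreviewOnly makes the processor print the SQL it would run
    .Configure<ProcessorOptions>(opt => opt.PreviewOnly = dryRun)
    .AddLogging(lb => lb.AddFluentMigratorConsole())
    .BuildServiceProvider();

using var scope = services.CreateScope();
scope.ServiceProvider.GetRequiredService<IMigrationRunner>().MigrateUp();

Run once with --dry-run to review the SQL, then again without it to apply the changes: two clicks.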

Conclusion

All of the tools I’ve used here are indicative rather than “the only way to do this”. I can and do use a similar toolchain with PHP in Laravel, and most other platforms have similar capabilities.

The main thing I’ve tried to illustrate here is process — version controlling our database schema, test-driving our database development, coding for forward and backward compatibility, and having effective CI quality gates and CD for seamless deployment.

Using this or a similar process, we can make database development less risky and more efficient in line with our application development practices.

