Getting started with Testcontainers for Python
Testcontainers is an open-source framework for provisioning throwaway, on-demand containers for development and testing use cases. Testcontainers makes it easy to work with databases, message brokers, web browsers, or just about anything that can run in a Docker container.
Using Testcontainers, you can write tests talking to the same type of services you use in production without mocks or in-memory services.
If you are new to Testcontainers, please read "What is Testcontainers, and why should you use it?" to learn more about it.
Let us create a simple Python application that uses a PostgreSQL database to store customer information. Then we will learn how to use Testcontainers to test it against a real Postgres database.
Create a Python application
Let’s create a Python project and use the venv module to create a virtual environment for our project. By using a virtual environment, we can avoid installing dependencies globally, and also we can use different versions of the same package in different projects.
$ mkdir tc-python-demo
$ cd tc-python-demo
$ python3 -m venv venv
$ source venv/bin/activate
We are going to use psycopg3 for talking to the Postgres database, pytest for testing, and testcontainers-python for running a PostgreSQL database in a container.
Once the virtual environment is activated, we can install the required dependencies using pip as follows:
$ pip install psycopg pytest "testcontainers[postgres]"
$ pip freeze > requirements.txt
Once the dependencies are installed, we use the pip freeze command to generate the requirements.txt file, so that others can install the same package versions with pip install -r requirements.txt.
Implement Database Helper
Let’s create a db/connection.py file with a function that returns a database connection:
import os

import psycopg


def get_connection():
    # Read each connection parameter from the environment, with local defaults
    host = os.getenv("DB_HOST", "localhost")
    port = os.getenv("DB_PORT", "5432")
    username = os.getenv("DB_USERNAME", "postgres")
    password = os.getenv("DB_PASSWORD", "postgres")
    database = os.getenv("DB_NAME", "postgres")
    return psycopg.connect(f"host={host} dbname={database} user={username} password={password} port={port}")
Instead of hard-coding the database connection parameters, we read them from environment variables. This lets us run the application in different environments without changing the code.
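To make the effect of the environment variables concrete, here is a minimal sketch that builds the same connection string without opening a connection. The build_conninfo() helper is hypothetical and not part of the application; it only mirrors the defaults used in get_connection().

```python
import os


def build_conninfo(env=os.environ):
    """Build the connection string from DB_* variables (hypothetical helper)."""
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "5432")
    username = env.get("DB_USERNAME", "postgres")
    password = env.get("DB_PASSWORD", "postgres")
    database = env.get("DB_NAME", "postgres")
    return f"host={host} dbname={database} user={username} password={password} port={port}"


# With nothing set, the defaults apply.
print(build_conninfo({}))
# Setting a variable overrides only that parameter.
print(build_conninfo({"DB_HOST": "db.internal", "DB_PORT": "6543"}))
```

Passing a plain dict instead of os.environ keeps the sketch easy to try without touching the real environment.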
Implement business logic
Let’s create a customers/customers.py file and define a Customer class as follows:
class Customer:
    def __init__(self, cust_id, name, email):
        self.id = cust_id
        self.name = name
        self.email = email

    def __str__(self):
        return f"Customer({self.id}, {self.name}, {self.email})"
Now, let’s add a create_table() function to the customers/customers.py file to create the customers table as follows:
from db.connection import get_connection


def create_table():
    with get_connection() as conn:
        with conn.cursor() as cur:
            cur.execute("""
                CREATE TABLE customers (
                    id serial PRIMARY KEY,
                    name varchar not null,
                    email varchar not null unique)
                """)
            conn.commit()
We have obtained a new database connection using the get_connection() function and created the customers table. We have used Python’s with statement (a context manager) to close the database connection automatically.
Let’s implement the create_customer(), get_all_customers(), get_customer_by_email(), and delete_all_customers() functions in the customers/customers.py file as follows:
def create_customer(name, email):
    with get_connection() as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO customers (name, email) VALUES (%s, %s)", (name, email))
            conn.commit()


def get_all_customers() -> list[Customer]:
    with get_connection() as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT * FROM customers")
            return [Customer(cid, name, email) for cid, name, email in cur]


def get_customer_by_email(email) -> Customer:
    with get_connection() as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT id, name, email FROM customers WHERE email = %s", (email,))
            (cid, name, email) = cur.fetchone()
            return Customer(cid, name, email)


def delete_all_customers():
    with get_connection() as conn:
        with conn.cursor() as cur:
            cur.execute("DELETE FROM customers")
            conn.commit()
We have implemented various functions to insert, fetch, and delete customer records from the database using Python’s DB-API.
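One detail worth noting: get_customer_by_email() unpacks the result of cur.fetchone() directly, and fetchone() returns None when no row matches, so looking up an unknown email raises a TypeError. A possible way to handle this, sketched here with a hypothetical row_to_customer() helper that is not part of the guide’s code, is to map the row separately and return None for a missing customer:

```python
from typing import Optional, Sequence


class Customer:
    def __init__(self, cust_id, name, email):
        self.id = cust_id
        self.name = name
        self.email = email


def row_to_customer(row: Optional[Sequence]) -> Optional[Customer]:
    """Map an (id, name, email) row to a Customer, or None if there is no row."""
    if row is None:
        return None
    cid, name, email = row
    return Customer(cid, name, email)


# A matching row becomes a Customer; a missing row becomes None.
found = row_to_customer((1, "John", "john@gmail.com"))
missing = row_to_customer(None)
print(found.name, missing)
```

Inside get_customer_by_email(), this would be used as return row_to_customer(cur.fetchone()). Whether to return None or raise a dedicated exception is a design choice for the application.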
To keep it simple for the purpose of this guide, we are creating a new connection for every database operation. In a real-world application, it is recommended to use a connection pool to reuse connections.
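To illustrate what reusing connections means, here is a minimal standard-library sketch of a fixed-size pool. SimplePool and fake_connect() are hypothetical names used only for illustration; in a real psycopg 3 application, the separate psycopg_pool package provides a ready-made ConnectionPool and would be the more sensible choice.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Hypothetical fixed-size pool: hands out connections and takes them back."""

    def __init__(self, connect, size=2):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # wait until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


# Stand-in for get_connection(): records every "connection" that gets opened.
opened = []


def fake_connect():
    opened.append(object())
    return opened[-1]


pool = SimplePool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # run queries with conn here

print(len(opened))  # 2: ten operations shared two connections
```

With the per-call approach used in this guide, the same ten operations would open ten connections.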
Write tests using Testcontainers
We will create a PostgreSQL database container using Testcontainers and use the same database for all the tests. Also, we will delete all the customer records before every test so that each test runs against a clean database.
We are going to use pytest fixtures to implement the setup and teardown logic. A commonly recommended approach is to use yield fixtures:
@pytest.fixture
def setup():
    # setup code
    yield some_value
    # teardown code
However, with this approach, if an exception occurs in the setup code, the teardown code after the yield is not executed. A more robust approach is to register a finalizer, because a finalizer that has already been registered runs even if a later setup step fails:
@pytest.fixture
def setup(request):
    # setup code
    def cleanup():
        # teardown code
        ...
    request.addfinalizer(cleanup)
    return some_value
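The difference between the two styles can be reproduced without pytest. The following sketch is only a simulation under simplifying assumptions: a plain generator stands in for a yield fixture, and a list of callables stands in for the finalizers registered with request.addfinalizer(). All function names are hypothetical.

```python
events = []


def yield_style_setup():
    """Generator-based setup, like a yield fixture that fails before its yield."""
    events.append("yield: start resource")
    raise RuntimeError("setup failed")
    yield "value"  # never reached
    events.append("yield: stop resource")  # teardown is skipped


def finalizer_style_setup(finalizers):
    """Setup that registers its cleanup before the failing step."""
    events.append("finalizer: start resource")
    finalizers.append(lambda: events.append("finalizer: stop resource"))
    raise RuntimeError("setup failed")


def run_yield_style():
    try:
        next(yield_style_setup())
    except RuntimeError:
        pass  # nothing after the yield has run


def run_finalizer_style():
    finalizers = []
    try:
        finalizer_style_setup(finalizers)
    except RuntimeError:
        pass
    finally:
        while finalizers:  # run registered cleanups, last registered first
            finalizers.pop()()


run_yield_style()
run_finalizer_style()
print(events)
```

In the yield-style run the resource is started but never stopped, while in the finalizer-style run the cleanup still executes because it was registered before the failure.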
Let’s create a tests/test_customers.py file and implement the fixtures as follows:
import os

import pytest
from testcontainers.postgres import PostgresContainer

from customers import customers

postgres = PostgresContainer("postgres:16-alpine")


@pytest.fixture(scope="module", autouse=True)
def setup(request):
    postgres.start()

    def remove_container():
        postgres.stop()

    # Register the cleanup right after the container starts
    request.addfinalizer(remove_container)
    os.environ["DB_CONN"] = postgres.get_connection_url()
    os.environ["DB_HOST"] = postgres.get_container_host_ip()
    # Environment values must be strings, so convert the mapped port explicitly
    os.environ["DB_PORT"] = str(postgres.get_exposed_port(5432))
    os.environ["DB_USERNAME"] = postgres.POSTGRES_USER
    os.environ["DB_PASSWORD"] = postgres.POSTGRES_PASSWORD
    os.environ["DB_NAME"] = postgres.POSTGRES_DB
    customers.create_table()


@pytest.fixture(scope="function", autouse=True)
def setup_data():
    customers.delete_all_customers()
We have used a module-scoped fixture to start a PostgreSQL container using Testcontainers, so that it runs only once for all the tests in the module. In the setup() fixture function, we start the PostgreSQL container and create the customers table. We have added a finalizer to remove the container after all the tests have finished.
In the setup_data() fixture function, we delete all the records in the customers table. This is a function-scoped fixture, which is executed before every test.
As of now, Testcontainers for Python does not implement automatic resource cleanup using Ryuk, so we explicitly remove the created container using a finalizer.
Now let’s implement the tests as follows:
def test_get_all_customers():
    customers.create_customer("Siva", "siva@gmail.com")
    customers.create_customer("James", "james@gmail.com")
    customers_list = customers.get_all_customers()
    assert len(customers_list) == 2


def test_get_customer_by_email():
    customers.create_customer("John", "john@gmail.com")
    customer = customers.get_customer_by_email("john@gmail.com")
    assert customer.name == "John"
    assert customer.email == "john@gmail.com"
In the test_get_all_customers() test, we insert two customer records into the database, fetch all the existing customers, and assert the number of customers.
In the test_get_customer_by_email() test, we insert a customer record into the database, fetch the customer by email, and assert the customer details.
As we are deleting all the customer records before every test, the tests can be run in any order.
Run tests
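To enable the pytest auto-discovery mechanism, create an empty __init__.py file in the tests directory. With this file in place, pytest (in its default import mode) adds the project root to sys.path, so the customers and db packages can be imported from the tests.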
Now let’s run the tests using pytest as follows:
$ pytest
You should see the following output:
pytest
=============== test session starts ==============
platform darwin -- Python 3.12.0, pytest-7.4.3, pluggy-1.3.0
rootdir: /Users/siva/dev/tc-python-demo
collected 2 items
tests/test_customers.py .. [100%]
============== 2 passed in 3.02s =================
The tests are executed against a real PostgreSQL database instead of mocks, which gives us more confidence in our implementation.
Conclusion
We have explored how to use the testcontainers-python library to test a Python application that uses a PostgreSQL database. In addition to PostgreSQL, testcontainers-python provides dedicated modules for many commonly used SQL databases, NoSQL databases, message queues, and more. You can use Testcontainers to run any containerized dependency for your tests!
You can explore more about Testcontainers at https://www.testcontainers.com/.