Pytest Tutorial: Unit Testing in Python Using the Pytest Framework

Two summers ago I shipped a billing service that sent payment reminders on the wrong day for a handful of customers. The bug was tiny: a timezone conversion used local midnight instead of UTC. Manual checks passed because I tested with my own time zone. A six‑line unit test would have caught it before release. That moment pushed me to treat tests as part of the design, not cleanup.

When I reach for pytest, I get fast feedback, readable assertions, and a workflow that fits how I build Python apps in 2026: src layouts, pyproject.toml, and CI that runs on every push. You can start with a single file, then grow into fixtures, plugins, and parallel runs without rewriting tests. If you already know Python, pytest reads like plain English, which keeps tests approachable for the whole team. This guide follows the path I use with clients and internal teams so you can ship with confidence and fewer late‑night surprises.

Why I Reach for pytest in 2026

pytest keeps the ceremony low. I write a function named test_* and use Python’s assert statement. When an assertion fails, pytest shows the values on each side, the surrounding code, and the failing line. That detail cuts debugging time because I can see what went wrong without adding print statements. The tests read like behavior statements, which makes them good documentation for future me.

I also like how pytest fits the 2026 toolchain. It works well with pyproject.toml, isolated environments from uv or pipx, and editors that understand type hints. The plugin system gives me coverage reports, async support, and parallel execution without changing test code. I can keep a small suite for a tiny library and grow into a larger suite for a service, all while using the same patterns.

Failure output is another reason I stick with pytest. Instead of a bare AssertionError, I see exactly which field or list item mismatched. When I’m reviewing a teammate’s change, that diff view is often all I need to reason about the behavior without rerunning tests locally.

Install it once and you are ready to go:

pip install pytest

Here is the practical difference I see between older unittest‑style projects and pytest‑style projects:

  • Test structure: classes with setup/teardown in unittest; plain functions with fixtures in pytest.
  • Assertions: self.assertEqual and friends; a plain assert with helpful failure output.
  • Configuration: runner flags and custom scripts; pyproject.toml options and markers.
  • Extensibility: limited built-ins; plugins for coverage, async, and parallel runs.
  • Readability: boilerplate-heavy classes; behavior-focused, short tests.

Project Setup and Test Discovery Rules That Save Time

pytest discovers tests by naming. It looks in files named test_*.py or *_test.py, and it collects functions that start with test_ (plus test methods on classes named Test*). That convention keeps discovery fast and removes configuration. I tell teams to adopt a consistent layout so the test tree mirrors the source tree, because that cuts search time and makes it obvious where a new test should live.

A layout I reach for on most apps looks like this:

my_app/
    pyproject.toml
    src/
        my_app/
            __init__.py
            billing.py
            scheduling.py
    tests/
        test_billing.py
        test_scheduling.py

With that structure, pytest can be run from the repo root with pytest and it will automatically collect tests under tests/. I also configure a few defaults in pyproject.toml so my local runs and CI runs behave the same:

[tool.pytest.ini_options]
addopts = ["-ra", "--strict-markers"]
testpaths = ["tests"]
pythonpath = ["src"]

-ra prints a summary of skipped or xfailed tests, which helps catch silent gaps. --strict-markers keeps marker typos from silently creating new categories, which I’ve seen cause entire integration suites to be skipped.

I also like to define a tests/conftest.py at the root for shared fixtures. That file is auto‑discovered by pytest and makes fixtures available to the entire test suite without imports.

A Small Example App We’ll Test

I find it easier to learn pytest by testing something small but realistic. Imagine a tiny billing module that calculates invoice totals, applies discounts, and schedules reminders. It’s intentionally simple but still includes time zones, money, and external dependencies (a notification sender). Those are the areas where bugs love to hide.

# src/my_app/billing.py
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal

@dataclass(frozen=True)
class Invoice:
    subtotal: Decimal
    tax_rate: Decimal
    discount_rate: Decimal = Decimal("0")

    def total(self) -> Decimal:
        taxable = self.subtotal * (Decimal("1") - self.discount_rate)
        return (taxable * (Decimal("1") + self.tax_rate)).quantize(Decimal("0.01"))

def next_reminder_utc(now: datetime) -> datetime:
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    tomorrow = (now + timedelta(days=1)).date()
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, 0, 0, tzinfo=timezone.utc)

# src/my_app/notifications.py
class Notifier:
    def send(self, user_id: str, message: str) -> None:
        # Imagine this posts to an external service.
        raise NotImplementedError

# src/my_app/scheduling.py
from .billing import next_reminder_utc

def schedule_reminder(notifier, user_id: str, now):
    reminder = next_reminder_utc(now)
    notifier.send(user_id, f"Your payment is due on {reminder:%Y-%m-%d}")
    return reminder

This is enough for a rich pytest tutorial: we have deterministic logic, exact money arithmetic with Decimal, and time zones. I can test it without relying on a real notification system, and I can show mocking, fixtures, and parametrization in a practical way.

Writing Your First Tests

I start with straight‑line tests, which also act as runnable documentation. The first tests should be so simple that they tell you if the testing infrastructure is working.

# tests/test_billing.py
from decimal import Decimal

from my_app.billing import Invoice

def test_invoice_total_with_tax_only():
    invoice = Invoice(subtotal=Decimal("100"), tax_rate=Decimal("0.10"))
    assert invoice.total() == Decimal("110.00")

def test_invoice_total_with_discount_and_tax():
    invoice = Invoice(
        subtotal=Decimal("200"),
        tax_rate=Decimal("0.05"),
        discount_rate=Decimal("0.10"),
    )
    assert invoice.total() == Decimal("189.00")

If I run pytest and both tests pass, I know discovery and import paths are working. That’s my baseline. From there, I expand to cover behavior and edge cases.

Assertions That Actually Help You Debug

The reason I prefer pytest assertions is the introspection. When a test fails, pytest tells me the left and right values and often shows a diff. If I compare lists, dicts, or dataclasses, I get structured context instead of a raw AssertionError.

I also embrace descriptive asserts over complicated expressions. A single assert per behavior is usually best. For example, if I want to test rounding, I’ll write one test for rounding rather than stacking it into the discount test. It makes failure output clean and avoids “mystery failures” where multiple behaviors are bundled into one assert.

Here’s a test that checks rounding explicitly:

def test_invoice_total_rounds_to_cents():
    # quantize defaults to banker's rounding (ROUND_HALF_EVEN),
    # so the exact half 10.005 rounds down to the even digit.
    invoice = Invoice(subtotal=Decimal("10.005"), tax_rate=Decimal("0"))
    assert invoice.total() == Decimal("10.00")

That test catches currency issues early and documents the rounding rule for anyone new to the codebase.
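If the business rule calls for classic half-up rounding instead, Decimal lets you request it explicitly. This small sketch (the to_cents helper is mine, not from the billing module) contrasts the two modes:

```python
from decimal import ROUND_HALF_EVEN, ROUND_HALF_UP, Decimal

def to_cents(amount: Decimal, rounding: str) -> Decimal:
    # quantize snaps the value to two decimal places with the given mode.
    return amount.quantize(Decimal("0.01"), rounding=rounding)

# Banker's rounding (the Decimal default) breaks ties toward the even digit;
# half-up always rounds ties away from zero.
half_even = to_cents(Decimal("10.005"), ROUND_HALF_EVEN)  # Decimal("10.00")
half_up = to_cents(Decimal("10.005"), ROUND_HALF_UP)      # Decimal("10.01")
```

Whichever mode you pick, one explicit rounding test pins it down for the whole team.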

Parametrization: One Behavior, Many Inputs

Parametrization is a big part of my pytest style because it lets me keep tests short without hiding logic. If a behavior is the same across a set of inputs, I group them in a parameter list. That keeps the test readable and reduces copy‑paste errors.

from decimal import Decimal

import pytest

from my_app.billing import Invoice

@pytest.mark.parametrize(
    "subtotal,tax,discount,expected",
    [
        ("100", "0.10", "0", "110.00"),
        ("50", "0", "0.20", "40.00"),
        ("19.99", "0.07", "0", "21.39"),
    ],
)
def test_invoice_total_parametrized(subtotal, tax, discount, expected):
    invoice = Invoice(
        subtotal=Decimal(subtotal),
        tax_rate=Decimal(tax),
        discount_rate=Decimal(discount),
    )
    assert invoice.total() == Decimal(expected)

When this fails, pytest shows which parameter set failed, which is exactly what I want for a data‑driven check. Parametrization also works with fixtures, which becomes powerful for larger tests.
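For example, a fixture can carry the parameters itself, so every test that depends on it runs once per value. The total helper here is a self-contained stand-in for Invoice.total so the sketch runs on its own:

```python
from decimal import Decimal

import pytest

def total(subtotal: Decimal, tax_rate: Decimal) -> Decimal:
    # Inline stand-in for Invoice.total, restated for a runnable sketch.
    return (subtotal * (Decimal("1") + tax_rate)).quantize(Decimal("0.01"))

@pytest.fixture(params=[Decimal("0"), Decimal("0.05"), Decimal("0.10")])
def tax_rate(request):
    # request.param hands each listed value to the dependent tests in turn.
    return request.param

def test_total_never_below_subtotal(tax_rate):
    # This one test collects as three tests, one per tax rate.
    assert total(Decimal("100"), tax_rate) >= Decimal("100")
```

Fixture params are a good fit when many tests should all run against the same set of variants.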

Fixtures: The Backbone of Maintainable Tests

Fixtures are what allow me to scale pytest beyond a handful of files. I treat fixtures like a dependency injection system for tests. If I need a standard invoice, a fake notifier, or a temporary directory, I define it once and reuse it.

Here’s a fixture that provides a sample invoice:

# tests/conftest.py
from decimal import Decimal

import pytest

from my_app.billing import Invoice

@pytest.fixture
def sample_invoice():
    return Invoice(subtotal=Decimal("100"), tax_rate=Decimal("0.08"))

And a test that uses it:

def test_invoice_total_using_fixture(sample_invoice):
    assert sample_invoice.total() == Decimal("108.00")

Fixtures can also create test doubles. Instead of importing a mocking library for every simple use case, I often create a tiny fake object that records calls:

@pytest.fixture
def fake_notifier():
    class FakeNotifier:
        def __init__(self):
            self.calls = []

        def send(self, user_id, message):
            self.calls.append((user_id, message))

    return FakeNotifier()

That fake is enough for 80% of notification tests. It’s clearer than a generic mock and easier to inspect when debugging.

Fixture Scopes and Autouse Behavior

I keep fixture scope narrow by default. Most fixtures are function‑scoped (the default), which gives me isolation and keeps tests deterministic. When I need expensive setup, I’ll expand scope to module or session, but I document why so future me doesn’t accidentally make tests dependent on order.

Examples of when I expand scope:

  • session scope for database containers or external services started once.
  • module scope for expensive test data generation.
  • function scope for anything that should be isolated or mutated during a test.

I also use autouse=True sparingly. It can be great for setting up global environment flags, but it hides dependencies and makes tests harder to reason about. When I use autouse, I add a comment in conftest.py explaining why.
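A minimal sketch of those scope choices, with a hypothetical make_connection standing in for expensive setup such as a database container:

```python
import pytest

def make_connection():
    # Stand-in for expensive setup; imagine starting a container here.
    return {"connected": True}

@pytest.fixture(scope="session")
def db_connection():
    conn = make_connection()  # runs once for the whole test session
    yield conn
    conn["connected"] = False  # teardown after the final test

@pytest.fixture  # function scope (the default): a fresh object per test
def workspace(db_connection):
    return {"conn": db_connection, "items": []}
```

The yield form keeps setup and teardown in one place, which is why I prefer it over separate finalizers.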

Testing Exceptions and Error Paths

Successful flows are the easy part. The most expensive bugs I’ve seen come from error paths that no one tested. pytest makes exception testing simple with pytest.raises.

from datetime import datetime

import pytest

from my_app.billing import next_reminder_utc

def test_next_reminder_requires_timezone():
    with pytest.raises(ValueError):
        next_reminder_utc(datetime(2026, 1, 1, 10, 0))

This test also documents expectations: the function should reject naive datetimes. That’s the kind of behavior I want to lock in.
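When the message matters too, pytest.raises can check it: match is applied as a regular expression against the exception text, and the context manager exposes the exception for further asserts. The divide helper below is a made-up example, not part of the billing module:

```python
import pytest

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ValueError("b must be nonzero")
    return a / b

def test_divide_rejects_zero():
    # match guards against the right error raised for the wrong reason;
    # excinfo.value is the actual exception instance.
    with pytest.raises(ValueError, match="nonzero") as excinfo:
        divide(1, 0)
    assert "nonzero" in str(excinfo.value)
```

I reach for match whenever a function can raise the same exception type for different reasons.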

Time, Time Zones, and Determinism

Time is a trap for test suites. I’ve watched teams lose hours chasing “flaky” tests that only failed around midnight or when daylight saving time changed. I avoid that by passing time as a parameter instead of calling datetime.now() inside functions. When I can’t, I use monkeypatching to control time.

For example, because next_reminder_utc takes now as an argument, I can pin the moment directly in the test with no patching at all:

from datetime import datetime, timezone

from my_app.billing import next_reminder_utc

def test_next_reminder_utc_is_midnight():
    now = datetime(2026, 2, 21, 18, 30, tzinfo=timezone.utc)
    reminder = next_reminder_utc(now)
    assert reminder.hour == 0
    assert reminder.tzinfo == timezone.utc

The pattern is simple: inject time as input whenever possible. If I can’t, I isolate the time logic in one place so I only need to patch a single function.

Mocking and monkeypatch: Choosing the Right Level

I prefer simple fakes and fixtures over mocks, but mocking is still important when a function calls external services. pytest’s monkeypatch fixture is my go‑to tool. It lets me swap functions and objects at runtime without adding new dependencies.

Here’s how I keep a real network call out of the test by injecting the fake notifier fixture (monkeypatch works the same way when the dependency can’t be passed in):

def test_schedule_reminder_sends_message(fake_notifier):
    from datetime import datetime, timezone

    from my_app import scheduling

    now = datetime(2026, 2, 22, 9, 0, tzinfo=timezone.utc)
    scheduling.schedule_reminder(fake_notifier, "user_123", now)

    assert fake_notifier.calls
    user_id, message = fake_notifier.calls[0]
    assert user_id == "user_123"
    assert "2026-02-23" in message

If I need to patch a function, I use monkeypatch.setattr and then assert behavior. I keep patches as small as possible, because over‑mocking can make tests useless. The goal is to isolate the unit, not to rewrite the entire app in fake objects.
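A sketch of that pattern: time lives behind one module-level function (a hypothetical clock seam, not part of the example app), and monkeypatch.setattr swaps it out for the duration of a single test before restoring the original automatically:

```python
import sys
from datetime import datetime, timezone

def utcnow() -> datetime:
    # The single seam: production code calls this instead of datetime.now().
    return datetime.now(timezone.utc)

def test_reminder_uses_patched_time(monkeypatch):
    fixed = datetime(2026, 2, 21, 18, 30, tzinfo=timezone.utc)
    # Patch the one seam; monkeypatch undoes this after the test finishes.
    monkeypatch.setattr(sys.modules[__name__], "utcnow", lambda: fixed)
    assert utcnow() == fixed
```

Because every caller goes through utcnow, one setattr controls time for the whole module.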

Working With Temporary Files and Directories

Many apps read or write files. pytest’s tmp_path fixture gives me a temporary directory that’s automatically cleaned up. It keeps tests from polluting the project tree and makes failures reproducible.

def test_export_invoice(tmp_path, sample_invoice):
    path = tmp_path / "invoice.txt"
    path.write_text(f"total={sample_invoice.total()}")
    assert path.read_text() == "total=108.00"

This pattern is simple but powerful. It encourages tests that model file behavior without requiring test data committed to the repo.

Testing HTTP Clients Without Making Real Requests

In service‑heavy systems, network calls are the largest source of flakiness. I prefer to keep unit tests network‑free, and I use fakes or request‑mocking libraries for integration tests. For unit tests, I often pass a client dependency into the function so I can stub it easily.

A pattern I use is to pass a http_client argument to a function, then use a fake in tests. It keeps the production code decoupled and makes test intent clear.
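A self-contained sketch of that pattern, with a hypothetical FakeHTTPClient and fetch_invoice_status helper (the endpoint path is illustrative):

```python
class FakeHTTPClient:
    """Hypothetical stand-in: records requests and returns canned data."""

    def __init__(self, canned_response):
        self.requests = []
        self.canned_response = canned_response

    def get(self, url):
        self.requests.append(url)
        return self.canned_response

def fetch_invoice_status(http_client, invoice_id: str) -> str:
    # Production code accepts the client as a parameter, so tests inject a fake.
    payload = http_client.get(f"/invoices/{invoice_id}")
    return payload["status"]

def test_fetch_invoice_status_uses_client():
    client = FakeHTTPClient({"status": "paid"})
    assert fetch_invoice_status(client, "inv_1") == "paid"
    assert client.requests == ["/invoices/inv_1"]
```

The test asserts both the returned value and the request the code made, which covers the contract from both sides.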

Async Testing With pytest

Async Python is now normal for many APIs and workers. If you’re using asyncio, pytest works well with plugins such as pytest-asyncio. I keep async tests explicit by marking them, so it’s obvious which parts of the suite are async.

import pytest

@pytest.mark.asyncio
async def test_async_handler_returns_payload():
    result = await handler({"user_id": "u1"})
    assert result["status"] == "ok"

I also watch for hidden event‑loop issues. If tests leak tasks or leave pending coroutines, they can become flaky. I add a small helper fixture to assert that no tasks are left pending, especially in long‑lived services.
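When a suite has only a handful of async checks, asyncio.run inside a plain test function also works and avoids the plugin entirely. The handler here is a hypothetical coroutine standing in for a real endpoint:

```python
import asyncio

async def handler(payload: dict) -> dict:
    # Hypothetical async handler; sleep(0) yields to the event loop once.
    await asyncio.sleep(0)
    return {"status": "ok", "user_id": payload["user_id"]}

def test_handler_with_plain_asyncio():
    # No pytest-asyncio needed: drive the coroutine to completion directly.
    result = asyncio.run(handler({"user_id": "u1"}))
    assert result["status"] == "ok"
```

The trade-off is that each test spins up its own event loop, so once async tests multiply, the plugin's shared-loop management pays for itself.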

Database Testing Without Tears

Databases can make tests slow and brittle, so I layer my testing strategy:

  • Unit tests for pure business logic that don’t hit the database.
  • Integration tests that use a temporary database or container.
  • Fewer end‑to‑end tests that validate the full workflow.

In pytest, I often use a session‑scoped fixture that spins up a test database and then a function‑scoped fixture that opens a transaction for each test and rolls it back. That keeps tests isolated and fast without recreating the database every time.
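With SQLite standing in for the real database, that layering looks roughly like this (the invoices schema is illustrative):

```python
import sqlite3

import pytest

@pytest.fixture(scope="session")
def db():
    # One in-memory database for the whole run; autocommit mode gives us
    # explicit control over transactions in the fixture below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE invoices (id TEXT PRIMARY KEY, total TEXT)")
    yield conn
    conn.close()

@pytest.fixture
def db_session(db):
    # Each test runs inside a transaction that is rolled back afterwards,
    # so tests stay isolated without rebuilding the schema every time.
    db.execute("BEGIN")
    yield db
    db.rollback()
```

Swap the connect call for your real driver and the begin/rollback shape carries over.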

Markers for Test Selection and Focus

Markers allow me to run a subset of tests quickly. I keep these categories lightweight and meaningful. My usual setup includes markers like unit, integration, and slow.

In pyproject.toml:

[tool.pytest.ini_options]

markers = [

"unit: fast, isolated tests",

"integration: tests that touch external systems",

"slow: tests that take longer than a few seconds",

]

Then I can run:

  • pytest -m unit for fast feedback.
  • pytest -m "not slow" for normal CI.
  • pytest -m integration for staging checks.

This is a simple system, but it keeps CI times reasonable and helps developers target the right tests.
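Applying the markers is one decorator per category; these toy tests exist only to illustrate the tagging:

```python
import pytest

@pytest.mark.unit
def test_discount_math():
    assert 80 == 100 - 20

@pytest.mark.integration
@pytest.mark.slow
def test_full_billing_flow():
    # Imagine this talking to a staging database; marks can stack.
    assert True
```

With --strict-markers set, a typo like @pytest.mark.integraton fails collection instead of silently creating a new category.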

Skipping and xfail: Being Explicit About Known Issues

Sometimes a test can’t run in a particular environment (for example, a feature isn’t supported on Windows). In that case I skip it, with pytest.mark.skipif or a pytest.skip call inside the test. If a bug is known but not yet fixed, I use xfail to record the expectation. This prevents red CI while still keeping the test in place as a reminder.

import pytest

@pytest.mark.xfail(reason="Bug #492: timezone conversion error")
def test_timezone_edge_case():
    ...

The important part is to keep xfail tests temporary. I treat them like TODOs, not permanent features of the suite.

Organizing a Growing Test Suite

As test counts grow, structure matters. I keep tests close to the domain they exercise, and I use clear naming. I’ll create folders like tests/billing/ or tests/scheduling/ once a module has more than a couple tests. That keeps each file focused and makes it easier to navigate.

I also avoid test interdependencies. If a test depends on another test’s side effects, it’s a smell. pytest runs tests in a consistent but not guaranteed order. If order matters, that’s a sign I should refactor the code or the tests.

Test Data Strategies That Scale

One of the fastest ways to slow down a test suite is to create huge, noisy fixtures. I prefer small, focused data and then use helper functions to build variants. I often define a factory function like make_invoice inside a fixture module. That way each test can build the exact object it needs without mutating shared state.

This pattern also improves readability: when a test creates an invoice with make_invoice(discount_rate="0.20"), the intent is obvious. It’s more precise than a huge, generic fixture with many optional parameters.
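A minimal version of that factory, with the Invoice dataclass repeated from the billing module so the sketch runs on its own:

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Invoice:
    # Same shape as src/my_app/billing.py, restated for a self-contained example.
    subtotal: Decimal
    tax_rate: Decimal
    discount_rate: Decimal = Decimal("0")

    def total(self) -> Decimal:
        taxable = self.subtotal * (Decimal("1") - self.discount_rate)
        return (taxable * (Decimal("1") + self.tax_rate)).quantize(Decimal("0.01"))

def make_invoice(subtotal="100", tax_rate="0.08", discount_rate="0"):
    # Factory helper: each test overrides only the field it cares about.
    return Invoice(Decimal(subtotal), Decimal(tax_rate), Decimal(discount_rate))
```

String defaults keep call sites short and exact, and the factory never mutates shared state.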

Performance Considerations: Keeping Feedback Fast

pytest is fast by default, but a slow suite still hurts. I watch for these common bottlenecks:

  • Over‑using integration tests for unit‑level checks.
  • Loading large fixtures for simple assertions.
  • Rebuilding expensive objects per test instead of using session‑scoped fixtures.
  • Hidden network calls.

A healthy pytest suite for a mid‑sized app should finish in seconds locally and in a couple minutes in CI. If it takes longer, I split slow tests into a separate marker and run them less frequently. I also use pytest -q for a quieter output in CI, which reduces log noise and makes failures easier to spot.

Parallel Test Execution With pytest‑xdist

When the suite grows, I enable pytest-xdist for parallel execution. This is one of my favorite plugins because it makes large suites feel responsive again. The only rule is that tests must be isolated; if they depend on shared state, parallel execution will surface those issues.

Typical usage is simple:

pytest -n auto

This uses the number of CPU cores available. I usually start with auto and then adjust if I see contention in CI.

Coverage: Measuring What You Actually Test

I treat coverage as a guide rather than a goal. I want to know what I’m missing, but I don’t want developers chasing 100% at the expense of meaningful tests. The pytest‑cov plugin makes it easy to generate coverage reports.

A basic run looks like:

pytest --cov=src --cov-report=term-missing

The term-missing report points to untested lines right in the output, which is perfect for incremental improvements.

Continuous Integration: Making pytest the Default Gate

A good pytest suite should run in CI on every push. I aim for a pipeline where unit tests always run and integration tests run on main or in nightly jobs. That keeps signal high and cost controlled.

A minimal CI flow looks like this:

- name: Install deps
  run: python -m pip install -r requirements.txt
- name: Run tests
  run: pytest -m "not slow" --maxfail=1

I also set --maxfail=1 for faster failure feedback. If a test fails, the pipeline stops and I can fix it before running the full suite again.

Common Pitfalls I See (and How I Avoid Them)

  • Testing implementation details instead of behavior. I try to assert what the user cares about, not the exact structure of internal objects.
  • Over‑mocking. If the test only checks that mocks were called, it may be missing actual behavior. I use fakes or small integration tests instead.
  • Non‑deterministic tests. Time, randomness, and parallelism can cause flakiness. I control time and seed randomness when needed.
  • Huge fixtures. Massive fixtures make tests slow and hard to reason about. I prefer minimal data and helper functions.
  • Ignoring skipped tests. If a test is skipped, I check why. Skips should be rare and intentional.

When I Don’t Use pytest (or When Unit Tests Aren’t Enough)

pytest is my default, but not every problem is a unit test problem. If I’m validating performance, I reach for benchmarks. If I’m validating user flows, I need end‑to‑end tests with a browser. If I’m debugging production issues, logs and tracing may give me faster answers than tests.

I also avoid writing tests that only assert trivial things. For example, if a function is a one‑line wrapper around a library call, I might skip the unit test and rely on integration tests. The goal is value, not test count.

Alternative and Complementary Approaches

pytest plays well with other tools:

  • Property‑based testing (like Hypothesis) to explore edge cases you didn’t think of.
  • Type checking (mypy, pyright) to catch category errors before runtime.
  • Linters (ruff, flake8) to keep code clean and consistent.
  • Contract testing for APIs when multiple services interact.

I see these as layered defenses. pytest handles behavior, type checkers handle structure, and linters handle style. Together they keep teams moving fast without surprises.

A Practical Testing Flow I Use With Teams

When I start a new project, I establish a simple testing rhythm:

  • Write the function or endpoint.
  • Write a failing pytest for the main behavior.
  • Add a test for one edge case.
  • Implement or fix the code until tests pass.
  • Add fixtures or parametrization once patterns repeat.

This approach keeps tests small and ensures I’m always testing what the code is meant to do. It also keeps refactors safe because the tests describe behavior rather than implementation details.

Example: Testing the Reminder Scheduling End‑to‑End

Here’s a test that ties together invoice logic, scheduling, and notification. It’s not an end‑to‑end test with a real network call, but it does validate the business flow:

from datetime import datetime, timezone

from my_app.scheduling import schedule_reminder

def test_schedule_reminder_includes_date(fake_notifier):
    now = datetime(2026, 2, 22, 9, 0, tzinfo=timezone.utc)
    reminder = schedule_reminder(fake_notifier, "user_42", now)
    assert reminder.strftime("%Y-%m-%d") in fake_notifier.calls[0][1]

This test checks the contract between scheduling and notification without using a real external service. It’s a perfect “integration‑lite” test: fast and meaningful.

Checklist: My Pytest Practices That Pay Off

  • Keep tests small and behavior‑focused.
  • Use fixtures for shared setup, but keep them lean.
  • Parametrize repeated patterns instead of copy‑paste.
  • Isolate time, randomness, and I/O.
  • Mark slow or integration tests clearly.
  • Run pytest in CI on every push.
  • Keep coverage as a guide, not a goal.

Closing Thoughts

pytest has stayed my default testing framework because it scales from a single file to a complex application without forcing me to rewrite anything. It makes writing tests feel like writing Python, which lowers the barrier for teams and helps the test suite become part of the development culture. The more I use pytest, the more I see tests as the place where design decisions become explicit.

If you take one thing from this guide, let it be this: write the simplest test that locks in the behavior you care about, and let pytest’s clarity and tooling do the rest. Your future self — and your users — will thank you.
