8.2 KiB
Test Performance Optimization
Last Updated: 2026-01-28
Status: Implemented
This document records the test-suite performance work and — most importantly — the conventions that govern how tests are tagged and run. The seeds-test rationale in §1 is the canonical reference linked from test/seeds_test.exs.
Baseline result: the standard (fast) suite runs ~368 s (~6.1 min) vs. ~445 s before; the full suite (all tests) ~7.4 min. Slow-tagged tests (~25, >1 s each, ~77 s total) are excluded from standard runs and executed via promotion before merge.
1. Seeds Test Suite — coverage mapping
The seeds tests were reduced from 13 to 4. The 9 removed tests were dropped because their assertions are already covered by domain-specific test suites — this mapping is the justification and must be preserved:
| Removed seeds test | Covered by |
|---|---|
"at least one member has no membership fee type assigned" |
membership_fees/*_test.exs |
"each membership fee type has at least one member" |
membership_fees/*_test.exs |
"members with fee types have cycles with various statuses" |
cycle_generator_test.exs |
"creates all 5 authorization roles with correct permission sets" |
authorization/*_test.exs |
"all roles have valid permission_set_names" |
authorization/permission_sets_test.exs |
"does not change role of users who already have a role" |
merged into general idempotency test |
"role creation is idempotent" (detailed) |
merged into general idempotency test |
4 retained critical-bootstrap tests (and why)
These guard deployment-critical invariants that nothing else covers and must stay in the fast suite:
- Smoke test — seeds run successfully and create basic data.
- Idempotency — seeds can be re-run without duplicating data.
- Admin bootstrap — admin user exists with Admin role (critical for initial access).
- System-role bootstrap —
Mitgliedsystem role exists (critical for user registration).
If a new critical bootstrap requirement appears, add a test to the "Critical bootstrap invariants" section in test/seeds_test.exs. Removed-test risk is low: a smoke-test failure surfaces broken seeds, and domain tests verify business logic independently of seeds content.
2. Tagging convention: :slow
Tests are split into fast (standard CI) and slow (run via promotion before merge). A test is tagged @tag :slow when all of:
- execution time > 1 s, and
- low risk — does not catch critical regressions in core business logic, and
- it is a UI/display/formatting test, a workflow-detail test, or an edge case with a large dataset (performance tests with 50+ records always qualify).
Never tag as :slow:
- Core CRUD (Member/User create/update/destroy)
- Basic authentication/authorization
- Critical bootstrap (admin user, system roles)
- Email synchronization
- Representative policy tests (one per permission set + action)
- A test that is merely slow due to inefficient setup or a bug — fix the setup/bug instead
- An integration test — use
@tag :integrationinstead
Use @describetag :slow (not @moduletag) for describe blocks, so unrelated tests in the same module are not tagged.
One-off isolation fix worth noting as a pattern: a test that loaded all members was slow in full runs because of cross-test data accumulation; constraining it with a search query (/members?query=Alice) made it both faster and properly isolated. Prefer query filters over loading all records.
3. Execution model
| Mode | Command | Contents | Time |
|---|---|---|---|
| Fast (default) | just test-fast / mix test --exclude slow --exclude ui |
everything except :slow and :ui |
~6 min |
| Slow only | just test-slow / mix test --only slow |
the ~25 :slow tests |
~1.3 min |
| Full | just test / mix test |
all tests | ~7.4 min |
CI: standard pipeline (check-fast) runs mix test --exclude slow --exclude ui. The full suite (check-full) is triggered by promoting a Drone build to production and is required before merging to main (branch protection).
To find the slowest tests, run mix test --slowest N ad hoc. test/test_helper.exs also carries a slowest: 10 option for ExUnit.start/1, but it is commented out by default — uncomment it to print the 10 slowest tests at the end of every run.
4. Test organization
Tests mirror the lib/ structure:
test/
├── accounts/ # Accounts domain
├── membership/ # Membership domain
├── membership_fees/ # Membership fees domain
├── mv/ # Core (accounts, membership, authorization)
├── mv_web/ # Web layer (controllers, live, components)
└── support/ # conn_case.ex, data_case.ex
5. Concurrent create_member deadlock and deferrable FKs
A class of intermittent failures (PostgreSQL deadlock_detected, SQLSTATE 40P01) was traced to concurrent create_member transactions, not to any single test. It surfaced as a MatchError on {:ok, member} = ... in member-heavy LiveView tests (e.g. FormMemberSelectionTest) and reproduced only under CPU contention (≈1 in 12 full-fast-suite runs at high async: true concurrency; effectively never on an idle machine).
Root cause. create_member writes a cascade in one transaction (member row, custom_field_values, the user link, fee-type defaulting, cycle generation). Concurrent inserts take FK FOR KEY SHARE (MultiXact) locks on shared parent rows across members / users / membership_fee_types; under contention these can form a cross-transaction lock cycle that Postgres resolves by aborting one transaction. It is a product-level concurrency property, not test-data contention, so it is not fixable by test-state isolation.
Fix. Migration …_make_member_user_fks_deferrable.exs makes the three FKs (users.member_id, users.role_id, members.membership_fee_type_id) DEFERRABLE INITIALLY DEFERRED, moving the FK check (and its lock) to commit time and breaking the cycle. Verified: 0 deadlocks in 15 full-suite runs under maximum CPU contention, versus 1/12 before. This does not weaken integrity — NOT NULL is independent of FK deferral, a real dangling reference still aborts the commit, and ON DELETE RESTRICT (e.g. users.role_id) stays immediate regardless of deferrability. Mv.DeferrableFkTest asserts the constraint state as a regression guard (a deterministic in-process concurrent reproduction is infeasible under the Ecto sandbox, which serializes connections by ownership).
This deadlock is also a latent production risk under concurrent sign-ups; the deferrable-FK fix addresses both.
Async-test-safety checklist (members/groups/custom fields)
Several member-creating test files historically used async: false with a "prevent PostgreSQL deadlocks" comment. With the deferrable-FK migration in place those files are deadlock-safe, but before flipping any such file to async: true:
- Prove isolation under load, not just one green run. Re-run the file (and the full suite) under varying
--seedand CPU contention; a single green run is not evidence (the deadlock and the isolation flakes below are load-dependent). - Watch for separate async-isolation issues beyond the deadlock.
index_groups_url_params_test.exsandmember_filter_component_test.exsshowed filtered-member-leak failures (refute html =~ name) under concurrency that are independent of the FK deadlock — these need their own per-file isolation fix before they can run async.
StreamData generator pitfall
FilterTooNarrowError appeared on unlucky seeds (e.g. 222) in a property test that built a value with a reject-filter (StreamData.filter discarding ~1/4 of generated pairs). Under full property-run counts this hits too many consecutive rejections. Fix: construct the desired value directly instead of generating-then-filtering (preserves the exact domain, no rejection). Prefer constructive generators over reject-filters in property tests.
References
- Testing Standards:
CODE_GUIDELINES.md§4 - CI/CD:
.drone.jsonnet - Test helper:
test/test_helper.exs - Just commands:
Justfile(test-fast,test-slow,test)