mitgliederverwaltung/docs/test-performance-optimization.md

# Test Performance Optimization

**Last Updated:** 2026-01-28
**Status:** Implemented

This document records the test-suite performance work and — most importantly — the conventions that govern how tests are tagged and run. The seeds-test rationale in §1 is the canonical reference linked from `test/seeds_test.exs`.

Baseline result: the standard (fast) suite runs ~368 s (~6.1 min) vs. ~445 s before; the full suite (all tests) ~7.4 min. Slow-tagged tests (~25, >1 s each, ~77 s total) are excluded from standard runs and executed via promotion before merge.

---

## 1. Seeds Test Suite — coverage mapping

The seeds tests were reduced from 13 to 4. The 9 removed tests were dropped because their assertions are already covered by domain-specific test suites — this mapping is the justification and must be preserved:

| Removed seeds test | Covered by |
|--------------------|-----------|
| `"at least one member has no membership fee type assigned"` | `membership_fees/*_test.exs` |
| `"each membership fee type has at least one member"` | `membership_fees/*_test.exs` |
| `"members with fee types have cycles with various statuses"` | `cycle_generator_test.exs` |
| `"creates all 5 authorization roles with correct permission sets"` | `authorization/*_test.exs` |
| `"all roles have valid permission_set_names"` | `authorization/permission_sets_test.exs` |
| `"does not change role of users who already have a role"` | merged into general idempotency test |
| `"role creation is idempotent"` (detailed) | merged into general idempotency test |

### 4 retained critical-bootstrap tests (and why)

These guard deployment-critical invariants that nothing else covers and must stay in the **fast** suite:

1. **Smoke test** — seeds run successfully and create basic data.
2. **Idempotency** — seeds can be re-run without duplicating data.
3. **Admin bootstrap** — admin user exists with Admin role (critical for initial access).
4. **System-role bootstrap** — `Mitglied` system role exists (critical for user registration).

If a new critical bootstrap requirement appears, add a test to the "Critical bootstrap invariants" section in `test/seeds_test.exs`. Removed-test risk is low: a smoke-test failure surfaces broken seeds, and domain tests verify business logic independently of seeds content.

---

## 2. Tagging convention: `:slow`

Tests are split into **fast** (standard CI) and **slow** (run via promotion before merge). A test is tagged `@tag :slow` when **all** of:

- execution time > 1 s, **and**
- low risk — does not catch critical regressions in core business logic, **and**
- it is a UI/display/formatting test, a workflow-detail test, or an edge case with a large dataset (performance tests with 50+ records always qualify).

**Never** tag as `:slow`:

- Core CRUD (Member/User create/update/destroy)
- Basic authentication/authorization
- Critical bootstrap (admin user, system roles)
- Email synchronization
- Representative policy tests (one per permission set + action)
- A test that is merely slow due to inefficient setup or a bug — fix the setup/bug instead
- An integration test — use `@tag :integration` instead

Use **`@describetag :slow`** (not `@moduletag`) for describe blocks, so unrelated tests in the same module are not tagged.

One-off isolation fix worth noting as a pattern: a test that loaded *all* members was slow in full runs because of cross-test data accumulation; constraining it with a search query (`/members?query=Alice`) made it both faster and properly isolated. Prefer query filters over loading all records.

---

## 3. Execution model

| Mode | Command | Contents | Time |
|------|---------|----------|------|
| Fast (default) | `just test-fast` / `mix test --exclude slow --exclude ui` | everything except `:slow` and `:ui` | ~6 min |
| Slow only | `just test-slow` / `mix test --only slow` | the ~25 `:slow` tests | ~1.3 min |
| Full | `just test` / `mix test` | all tests | ~7.4 min |

CI: standard pipeline (`check-fast`) runs `mix test --exclude slow --exclude ui`. The full suite (`check-full`) is triggered by promoting a Drone build to `production` and is required before merging to `main` (branch protection).

To find the slowest tests, run `mix test --slowest N` ad hoc. `test/test_helper.exs` also carries a `slowest: 10` option for `ExUnit.start/1`, but it is commented out by default — uncomment it to print the 10 slowest tests at the end of every run.

---

## 4. Test organization

Tests mirror the `lib/` structure:

```
test/
├── accounts/              # Accounts domain
├── membership/            # Membership domain
├── membership_fees/       # Membership fees domain
├── mv/                    # Core (accounts, membership, authorization)
├── mv_web/                # Web layer (controllers, live, components)
└── support/               # conn_case.ex, data_case.ex
```

---

## 5. Concurrent `create_member` deadlock and deferrable FKs

A class of intermittent failures (PostgreSQL `deadlock_detected`, SQLSTATE `40P01`) was traced to **concurrent `create_member` transactions**, not to any single test. It surfaced as a `MatchError` on `{:ok, member} = ...` in member-heavy LiveView tests (e.g. `FormMemberSelectionTest`) and reproduced only under CPU contention (≈1 in 12 full-fast-suite runs at high `async: true` concurrency; effectively never on an idle machine).

**Root cause.** `create_member` writes a cascade in one transaction (member row, `custom_field_values`, the `user` link, fee-type defaulting, cycle generation). Concurrent inserts take FK `FOR KEY SHARE` (MultiXact) locks on shared parent rows across `members` / `users` / `membership_fee_types`; under contention these can form a cross-transaction lock cycle that Postgres resolves by aborting one transaction. It is a product-level concurrency property, **not** test-data contention, so it is not fixable by test-state isolation.

**Fix.** Migration `…_make_member_user_fks_deferrable.exs` makes the three FKs (`users.member_id`, `users.role_id`, `members.membership_fee_type_id`) `DEFERRABLE INITIALLY DEFERRED`, moving the FK check (and its lock) to commit time and breaking the cycle. Verified: **0 deadlocks in 15 full-suite runs under maximum CPU contention**, versus 1/12 before. This does **not** weaken integrity — `NOT NULL` is independent of FK deferral, a real dangling reference still aborts the commit, and `ON DELETE RESTRICT` (e.g. `users.role_id`) stays immediate regardless of deferrability. `Mv.DeferrableFkTest` asserts the constraint state as a regression guard (a deterministic in-process concurrent reproduction is infeasible under the Ecto sandbox, which serializes connections by ownership).

This deadlock is also a latent **production** risk under concurrent sign-ups; the deferrable-FK fix addresses both.

### Async-test-safety checklist (members/groups/custom fields)

Several member-creating test files historically used `async: false` with a "prevent PostgreSQL deadlocks" comment. With the deferrable-FK migration in place those files are deadlock-safe, but before flipping any such file to `async: true`:

- **Prove isolation under load, not just one green run.** Re-run the file (and the full suite) under varying `--seed` **and** CPU contention; a single green run is not evidence (the deadlock and the isolation flakes below are load-dependent).
- **Watch for separate async-isolation issues beyond the deadlock.** `index_groups_url_params_test.exs` and `member_filter_component_test.exs` showed filtered-member-leak failures (`refute html =~ name`) under concurrency that are independent of the FK deadlock — these need their own per-file isolation fix before they can run async.

### StreamData generator pitfall

`FilterTooNarrowError` appeared on unlucky seeds (e.g. 222) in a property test that built a value with a reject-filter (`StreamData.filter` discarding ~1/4 of generated pairs). Under full property-run counts this hits too many consecutive rejections. Fix: **construct the desired value directly** instead of generating-then-filtering (preserves the exact domain, no rejection). Prefer constructive generators over reject-filters in property tests.

---

## References

- Testing Standards: `CODE_GUIDELINES.md` §4
- CI/CD: `.drone.jsonnet`
- Test helper: `test/test_helper.exs`
- Just commands: `Justfile` (`test-fast`, `test-slow`, `test`)