# CSV Member Import Reference for how the CSV member import actually behaves. The end-to-end LiveView test (`test/mv_web/live/import_live_test.exs`) and future maintenance depend on the rules documented here. **Status:** implemented (backend + LiveView UI). Implementation: - `lib/mv/membership/import/csv_parser.ex` — BOM stripping, delimiter detection, physical line numbering - `lib/mv/membership/import/header_mapper.ex` — header normalization + column mapping - `lib/mv/membership/import/column_resolver.ex` — read-only resolution of groups + fee-type columns (preview) - `lib/mv/membership/import/member_csv.ex` — `prepare/2`, `process_chunk/4`, validation, member creation - `lib/mv/membership/import/import_runner.ex` — orchestration glue - `lib/mv_web/live/import_live.ex` (+ `import_live/components.ex`) — UI, state machine, chunk driving - `lib/mv_web/controllers/import_template_controller.ex` — on-the-fly template generation ## Scope Admin-only bulk creation of members from an uploaded CSV. - **Create only** — no upsert/update of existing members. - **No deduplication** — a duplicate email fails its row (unique constraint) and is reported as an error. - **Best-effort, row-by-row** — no transactional rollback; a failed row does not abort the import. - **No background jobs** — progress is driven via LiveView `handle_info` chunk messages. - **Errors shown in UI only** — no error-CSV export. Out of scope: upsert, mapping wizard, transactional all-or-nothing, error export, import history/audit. ## UI Flow - **Route:** `/admin/import` (LiveView `MvWeb.ImportLive`). Template downloads: `/admin/import/template/en` and `/admin/import/template/de` (dynamic controller, not static files). - **Authorization:** requires `can?(:create, Mv.Membership.Member)`. Non-admins are redirected with a "don't have permission" flash. The import section, the template controller, and the `start_import` event all enforce this. - **Upload:** `allow_upload(:csv_file, accept: .csv, max_entries: 1, auto_upload: true)`. File size limit enforced by `max_file_size`. - **State machine** (`@import_status`): `idle → preview → running → done|error`. - **start_import** parses + resolves the file and transitions to **preview**. This step is **read-only**: no members are created yet. The preview shows the column mapping, sample rows, groups that exist vs. would be created, and fee-type/unknown-column warnings. - **confirm_import** begins processing and creates members chunk by chunk. - **Results:** success count, failure count, error list (each with CSV line number, message, optional field), warnings, and a truncation notice when errors exceed the cap. ## Limits - **Max file size:** configurable via `config :mv, csv_import: [max_file_size_mb: ...]` (enforced by `allow_upload`). - **Max rows:** configurable via `config :mv, csv_import: [max_rows: ...]`, default **1000**, excluding header. Enforced in `MemberCSV.prepare/2`; exceeding it yields an error containing `"exceeds"`. - **Chunk size:** 200 rows per chunk. - **Error cap:** 50 errors collected per import overall (`failed` count stays accurate; `errors_truncated?` flag set when exceeded). ## Parsing (`CsvParser.parse/1`) - Content must be **valid UTF-8** (else error). Empty content / empty header row are errors. - **UTF-8 BOM is stripped first**, before any header handling. - Line endings normalized: `\r\n`, `\r`, `\n` all handled. - **Delimiter auto-detection:** parse the header with both `;` and `,` parsers (NimbleCSV, quote-aware), count non-empty fields each yields, pick the higher; **`;` wins ties**; default `;`. - **Quoting:** double-quote quoting; `""` inside a quoted field is a literal `"`. Newlines inside quoted fields are supported — the record keeps its **start** line number. - **Physical line numbers:** rows are returned as `{csv_line_number, values}` where the line number is the physical 1-based line in the file (header is line 1, first data row is line 2). **Empty lines are skipped but do not shift numbering** — downstream code must use the parser's line numbers, never recompute from row index. (Test asserts an invalid row after a skipped empty line still reports its true physical line, e.g. `Line 4`.) - Completely empty rows are skipped. An unparsable row produces an error naming its line number. ## Header Mapping & Normalization (`HeaderMapper`) **`normalize_header/1`** (applied identically to incoming headers, mapping variants, custom field names, group names, and fee-type names): 1. trim, lowercase 2. transliterate German chars: `ß → ss`, `ä → ae`, `ö → oe`, `ü → ue` (and uppercase forms) 3. unify hyphen variants (en dash U+2013, em dash U+2014, minus U+2212 → `-`) 4. punctuation to spaces: `_`, `()[]{}`, `/`, `\` → space 5. **remove all whitespace** (so `first name` == `firstname`) 6. final trim Matching is on the fully normalized string. **Required field:** `email`. Missing it aborts `prepare` with a "Missing required header" error. **Unknown member-field columns:** ignored (no error). If an unknown column looks like it could be a custom field that does not exist, a **warning** is emitted (import continues). **Duplicate headers** mapping to the same canonical field (or same custom field) are an error. ### Supported member fields and header variants Source of truth is `@member_field_variants_raw` in `header_mapper.ex`. Variants below are illustrative; matching is via normalization, so casing/hyphen/whitespace differences all collapse. | Canonical | Example accepted headers (EN / DE) | Notes | |---|---|---| | `email` (required) | email, e-mail, e_mail, mail, e-mail-adresse / E-Mail | | | `first_name` | first name, firstname / Vorname | | | `last_name` | last name, lastname, surname / Nachname, Familienname | | | `join_date` | join date / Beitrittsdatum | ISO-8601 date | | `exit_date` | exit date / Austrittsdatum | ISO-8601 date | | `notes` | notes / Notizen, Bemerkungen | | | `street` | street, address / Straße, Strasse | | | `house_number` | house number, house no / Hausnummer, Nr, Nr., Nummer | | | `postal_code` | postal code, zip, postcode / PLZ, Postleitzahl | | | `city` | city, town / Stadt, Ort | | | `country` | country / Land, Staat | | | `membership_fee_start_date` | membership fee start date, fee start / Beitragsbeginn | ISO-8601 date | ### Special relationship columns - **groups** (headers `Groups` / `Gruppen` / `Gruppe`) — comma-separated group names. Names matched case-insensitively against existing groups; **missing groups are auto-created** during processing. A group-assignment failure fails that row (the member was already created). - **membership_fee_type** (headers `Fee Type`, `fee_type`, `membership_fee_type` / `Beitragsart`) — name matched to an existing `MembershipFeeType`. **Empty cell → default fee type** (no warning). **Matched name → that fee type.** **Unmatched name → default fee type + warning** naming the value. These columns are resolved against the DB read-only in `prepare` (`ColumnResolver`) for the preview; the actual writes happen in `process_chunk`. ### Fields not importable (explicitly ignored) - **membership_fee_status** — computed from fee cycles; not stored. Fee-status header variants (`Membership Fee Status`, `Bezahlstatus`, `Mitgliedsbeitragsstatus`) and the DE export label `Startdatum Mitgliedsbeitrag` are placed in the `ignored` list and never mapped. (The UI notice names `Groups`/`Gruppen`, `Fee Type`/`Beitragsart`, and the always-ignored `Bezahlstatus`.) ## Custom Fields - Custom field columns are matched by the custom field **name** (not slug), using the same normalization. Member fields take priority on a name collision. - **Custom fields must exist in Mila before import.** Unknown custom-field columns are ignored with a warning; the import still runs. - Empty custom-field cells create no value. Values are trimmed; type-validated per the custom field's `value_type`: - **string** — any text (trimmed). - **integer** — must parse fully (`Integer.parse` with no remainder); e.g. `42`, `-10`. - **boolean** — case-insensitive `true/false`, `1/0`, `yes/no`, `ja/nein`. - **date** — ISO-8601 `YYYY-MM-DD`. - **email** — validated with `EctoCommons.EmailValidator` (same checks as member email). - A value failing type validation fails the row. Error message format: `custom_field: – expected , got: ` (type label is the human-readable `FieldTypes.label/1`, with format hints for boolean/date). ## Validation & Member Creation (`process_chunk/4` → `process_row`) Per row: validate → create member → create custom-field values → assign groups. Sequential. - **Email** is required and format-validated (`EctoCommons.EmailValidator`, `Mv.Constants.email_validator_checks()`) on a trimmed value. All string member values are trimmed. - **Date fields** (`join_date`, `exit_date`, `membership_fee_start_date`): empty/blank strings are converted to `nil` so Ash accepts them. - Member created via `Mv.Membership.create_member/2`. Custom field values are passed as `custom_field_values` (Ash union `_union_type`/`_union_value` format), omitted when none. - **Errors** are `%MemberCSV.Error{csv_line_number, field, message}`: - `csv_line_number` is the physical line (1-based); never recomputed in this layer. - Validation errors get `field: :email`; Ash errors prefer the field-level error. - **Duplicate email** (unique constraint) is surfaced as a friendly `"email has already been taken"` message. - **Error capping** (`max_errors`, default 50, tracked across chunks via `existing_error_count`): once the cap is hit, no further errors are collected but **all rows are still processed** and the `failed` count stays accurate; `errors_truncated?` is set and the UI shows a truncation notice. ## Templates (`ImportTemplateController`) - Generated on the fly (not static files), gated by `can?(:create, Member)`. - One header row: standard member columns (localized EN/DE) + `Groups`/`Gruppen` + `Fee Type`/`Beitragsart` + **every existing custom field name** appended, then one example row. - Semicolon-delimited, RFC-4180 quoting; fields run through `MembersCSV.safe_cell/1` to neutralize spreadsheet formula injection (e.g. a custom-field name like `=HYPERLINK(...)`).