
CSV Member Import v1 - Implementation Plan

Version: 1.0
Date: 2025-01-XX
Status: Ready for Implementation
Related Documents:


Table of Contents

  • Overview & Scope
  • UX Flow
  • CSV Specification
  • Technical Design Notes
  • Implementation Issues
  • Rollout & Risks
  • Appendix

Overview & Scope

What We're Building

A basic CSV member import feature that allows administrators to upload a CSV file and import new members into the system. This is a v1 minimal implementation focused on establishing the import structure without advanced features.

Core Functionality (v1 Minimal):

  • Upload CSV file via LiveView file upload
  • Parse CSV with bilingual header support for core member fields (English/German)
  • Auto-detect delimiter (; or ,) using header recognition
  • Map CSV columns to core member fields (first_name, last_name, email, street, postal_code, city)
  • Import custom field values - Map CSV columns to existing custom fields by name (unknown custom field columns will be ignored with a warning)
  • Validate each row (required field: email)
  • Create members via Ash resource (one-by-one, no background jobs, processed in chunks of 200 rows via LiveView messages)
  • Display import results: success count, error count, and error details
  • Provide static CSV templates (EN/DE)

Key Constraints (v1):

  • Admin-only feature
  • No upsert (create only)
  • No deduplication (duplicate emails fail and show as errors)
  • No mapping wizard (fixed header mapping via bilingual variants)
  • No background jobs (progress via LiveView handle_info)
  • Best-effort import (row-by-row, no rollback)
  • UI-only error display (no error CSV export)
  • Safety limits (10 MB, 1,000 rows, chunks of 200)

Out of Scope (v1)

Deferred to Future Versions:

  • Upsert/update existing members
  • Advanced deduplication strategies
  • Column mapping wizard UI
  • Background job processing (Oban/GenStage)
  • Transactional all-or-nothing import
  • Error CSV export/download
  • Batch validation preview before import
  • Dynamic template generation
  • Import history/audit log
  • Import templates for other entities

UX Flow

Access & Location

Entry Point:

  • Location: Global Settings page (/settings)
  • UI Element: New section "Import Members (CSV)" below "Custom Fields" section
  • Access Control: Admin-only (enforced at LiveView event level, not entire /settings route)

User Journey

  1. Navigate to Global Settings
  2. Access Import Section
    • Important notice: Custom fields should be created in Mila before importing CSV files with custom field columns (unknown columns will be ignored with a warning)
    • Upload area (drag & drop or file picker)
    • Template download links (English / German)
    • Help text explaining CSV format and custom field requirements
  3. Ensure Custom Fields Exist (if importing custom fields)
    • Navigate to Custom Fields section and create required custom fields
    • Note the name/identifier for each custom field (used as CSV header)
  4. Download Template (Optional)
  5. Prepare CSV File
    • Include custom field columns using the custom field name as header (e.g., membership_number, birth_date)
  6. Upload CSV
  7. Start Import
    • Runs server-side via LiveView messages (may take up to ~30 seconds for large files)
    • Warning messages if custom field columns reference non-existent custom fields (columns will be ignored)
  8. View Results
    • Success count
    • Error count
    • First 50 errors, each with:
      • CSV line number (header is line 1, first data record begins at line 2)
      • Error message
      • Field name (if applicable)

Error Handling

  • File too large: Flash error before upload starts
  • Too many rows: Flash error before import starts
  • Invalid CSV format: Error shown in results
  • Partial success: Results show both success and error counts

CSV Specification

Delimiter

Recommended: Semicolon (;)
Supported: ; and ,

Auto-Detection (Header Recognition):

  • Remove UTF-8 BOM first
  • Extract header record and try parsing with both delimiters
  • For each delimiter, count how many recognized headers are present (via normalized variants)
  • Choose delimiter with higher recognition; prefer ; if tied
  • If neither yields recognized headers, default to ;
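A minimal sketch of this detection, assuming a strip_bom/1 helper as specified in Issue #3; the @recognized word list is a simplified stand-in for the normalized header variants (the real implementation would reuse HeaderMapper):

defmodule Mv.Membership.Import.CsvParser do
  @moduledoc false

  @bom "\uFEFF"
  # Simplified stand-in for the normalized mapping variants (see Header Normalization below).
  @recognized ~w(first_name vorname last_name nachname email e-mail street strasse
                 postal_code plz postleitzahl city stadt ort)

  def strip_bom(@bom <> rest), do: rest
  def strip_bom(content), do: content

  # Picks the delimiter whose header record yields more recognized columns; ";" wins ties
  # and is also the fallback when neither delimiter matches anything.
  def detect_delimiter(content) do
    header_line =
      content
      |> strip_bom()
      |> String.split(["\r\n", "\n"], parts: 2)
      |> hd()

    scores =
      Enum.map([";", ","], fn delimiter ->
        recognized =
          header_line
          |> String.split(delimiter)
          |> Enum.count(fn header -> String.downcase(String.trim(header)) in @recognized end)

        {delimiter, recognized}
      end)

    case Enum.max_by(scores, fn {_delimiter, count} -> count end) do
      {_delimiter, 0} -> ";"
      {delimiter, _count} -> delimiter
    end
  end
end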

Quoting Rules

  • Fields may be quoted with double quotes (")
  • Escaped quotes: "" inside quoted field represents a single "
  • v1 assumption: CSV records do not contain embedded newlines inside quoted fields. (If they do, parsing may fail or line numbers may be inaccurate.)
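NimbleCSV's standard RFC 4180 parsing already implements these quoting rules; one possible setup is to define a parser per supported delimiter at compile time (module names below are illustrative):

NimbleCSV.define(Mv.Membership.Import.SemicolonParser, separator: ";", escape: "\"")
NimbleCSV.define(Mv.Membership.Import.CommaParser, separator: ",", escape: "\"")

# skip_headers: false keeps the header record in the parsed output
rows = Mv.Membership.Import.SemicolonParser.parse_string(csv_content, skip_headers: false)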

Column Headers

v1 Supported Fields:

Core Member Fields:

  • first_name / Vorname (optional)
  • last_name / Nachname (optional)
  • email / E-Mail (required)
  • street / Straße (optional)
  • postal_code / PLZ / Postleitzahl (optional)
  • city / Stadt (optional)

Custom Fields:

  • Any custom field column using the custom field's name as the header (e.g., membership_number, birth_date)
  • Important: Custom fields must be created in Mila before importing. The CSV header must match the custom field name exactly (same normalization as member fields).
  • Behavior: If the CSV contains custom field columns that don't exist in Mila, a warning message will be shown and those columns will be ignored during import.

Member Field Header Mapping:

| Canonical Field | English Variants | German Variants |
|---|---|---|
| first_name | first_name, firstname | Vorname, vorname |
| last_name | last_name, lastname, surname | Nachname, nachname, Familienname |
| email | email, e-mail, e_mail | E-Mail, e-mail, e_mail |
| street | street, address | Straße, strasse, Strasse |
| postal_code | postal_code, zip, postcode | PLZ, plz, Postleitzahl, postleitzahl |
| city | city, town | Stadt, stadt, Ort |

Header Normalization (used consistently for both input headers AND mapping variants):

  • Trim whitespace
  • Convert to lowercase
  • Normalize Unicode: ß → ss (e.g., Straße → strasse)
  • Replace hyphens/whitespace with underscores: E-Mail → e_mail, phone number → phone_number
  • Collapse multiple underscores: e__mail → e_mail
  • Case-insensitive matching
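A minimal sketch of normalize_header/1 implementing these rules (it lives in the HeaderMapper module described in the Technical Design Notes):

defmodule Mv.Membership.Import.HeaderMapper do
  @doc "Normalizes a raw CSV header (or a mapping variant) for comparison."
  def normalize_header(header) do
    header
    |> String.trim()
    |> String.downcase()
    |> String.replace("ß", "ss")
    |> String.replace(~r/[-\s]+/u, "_")
    |> String.replace(~r/_+/, "_")
  end
end

# Examples:
# normalize_header("E-Mail")         #=> "e_mail"
# normalize_header(" Straße ")       #=> "strasse"
# normalize_header("phone  number")  #=> "phone_number"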

Unknown columns: ignored (no error)

Required fields: email

Custom Field Columns:

  • Custom field columns are identified by matching the normalized CSV header to the custom field name (not slug)
  • Same normalization rules apply as for member fields (trim, lowercase, Unicode normalization, underscore replacement)
  • Unknown custom field columns (non-existent names) will be ignored with a warning message

CSV Template Files

Location:

  • priv/static/templates/member_import_en.csv
  • priv/static/templates/member_import_de.csv

Content:

  • Header row with required + common optional fields
  • Note: Custom field columns are not included in templates by default (users add them based on their custom field configuration)
  • One example row
  • Uses semicolon delimiter (;)
  • UTF-8 encoding with BOM (Excel compatibility)
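For illustration, the English template could contain the following (the German file would use the German header variants; the leading BOM is not visible here):

first_name;last_name;email;street;postal_code;city
Erika;Mustermann;erika.mustermann@example.org;Musterstraße 1;12345;Berlin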

Template Access:

  • Templates are static files in priv/static/templates/
  • Served at:
    • /templates/member_import_en.csv
    • /templates/member_import_de.csv
  • In LiveView, link them using Phoenix static path helpers (e.g. ~p or Routes.static_path/2, depending on Phoenix version).

Example Usage in LiveView Templates:

<!-- Using ~p sigil (Phoenix 1.7+) -->
<.link href={~p"/templates/member_import_en.csv"} download>
  <%= gettext("Download English Template") %>
</.link>

<.link href={~p"/templates/member_import_de.csv"} download>
  <%= gettext("Download German Template") %>
</.link>

<!-- Alternative: Using Routes.static_path/2 -->
<.link href={Routes.static_path(MvWeb.Endpoint, "/templates/member_import_en.csv")} download>
  <%= gettext("Download English Template") %>
</.link>

Note: The templates directory must be included in MvWeb.static_paths() (configured in lib/mv_web.ex) for the files to be served.

File Limits

  • Max file size: 10 MB
  • Max rows: 1,000 rows (excluding header)
  • Processing: chunks of 200 (via LiveView messages)
  • Encoding: UTF-8 (BOM handled)

Technical Design Notes

Architecture Overview

┌─────────────────┐
│  LiveView UI    │  (GlobalSettingsLive or component)
│  - Upload area  │
│  - Progress     │
│  - Results      │
└────────┬────────┘
         │ prepare
         ▼
┌─────────────────────────────┐
│ Import Service              │  (Mv.Membership.Import.MemberCSV)
│ - parse + map + limit checks│  -> returns import_state
│ - process_chunk(chunk)      │  -> returns chunk results
└────────┬────────────────────┘
         │ create
         ▼
┌─────────────────┐
│  Ash Resource   │  (Mv.Membership.Member)
│  - Create       │
└─────────────────┘

Technology Stack

  • Phoenix LiveView: file upload via allow_upload/3
  • NimbleCSV: CSV parsing (add explicit dependency if missing)
  • Ash Resource: member creation via Membership.create_member/1
  • Gettext: bilingual UI/error messages

Module Structure

New Modules:

  • lib/mv/membership/import/member_csv.ex - import orchestration + chunk processing + custom field handling
  • lib/mv/membership/import/csv_parser.ex - delimiter detection + parsing + BOM handling
  • lib/mv/membership/import/header_mapper.ex - normalization + header mapping (core fields + custom fields)

Modified Modules:

  • lib/mv_web/live/global_settings_live.ex - render import section, handle upload/events/messages

Data Flow

  1. Upload: LiveView receives file via allow_upload
  2. Consume: consume_uploaded_entries/3 reads file content
  3. Prepare: MemberCSV.prepare/2
    • Strip BOM
    • Detect delimiter (header recognition)
    • Parse header + rows
    • Map headers to canonical fields (core member fields)
    • Query existing custom fields and map custom field columns by name (using same normalization as member fields)
    • Warn about unknown custom field columns (non-existent names will be ignored with warning)
    • Early abort if required headers missing
    • Row count check
    • Return import_state containing chunks, column_map, and custom_field_map (see the sketch after this list)
  4. Process: LiveView drives chunk processing via handle_info
    • For each chunk: validate + create member + create custom field values + collect errors
  5. Results: LiveView shows progress + final summary
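For orientation, the import_state returned by prepare/2 in step 3 could have roughly this shape (key names beyond chunks, column_map, and custom_field_map are assumptions, not final):

import_state = %{
  delimiter: ";",
  column_map: %{first_name: 0, last_name: 1, email: 2},   # canonical field -> column index
  custom_field_map: %{3 => "membership_number"},          # column index -> custom field name
  warnings: ["Column \"member_no\" does not match any custom field and will be ignored"],
  total_rows: 2,
  chunks: [
    [
      {2, %{first_name: "Anna", last_name: "Muster", email: "anna@example.org"}},
      {3, %{first_name: "Ben", last_name: "Beispiel", email: "ben@example.org"}}
    ]
  ]
}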

Types & Key Consistency

  • Raw CSV parsing: returns headers as list of strings, and rows with csv line numbers
  • Header mapping: operates on normalized strings; mapping table variants are normalized once
  • Ash attrs: built as atom-keyed map (%{first_name: ..., ...})

Error Model

%{
  csv_line_number: 5,       # physical line number in the CSV file
  field: :email,            # optional
  message: "is not a valid email"
}

CSV Line Numbers (Important)

To keep error reporting user-friendly and accurate, row errors must reference the physical line number in the original file, even if empty lines are skipped.

Design decision: the parser returns rows as:

rows :: [{csv_line_number :: pos_integer(), row_map :: map()}]

Downstream logic must not recompute line numbers from row indexes.
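A simplified sketch of how the parser can attach physical line numbers while skipping empty records; this is only safe under the v1 assumption of no embedded newlines, and parse_record/2 is a hypothetical helper that turns one record into a row_map:

rows =
  content
  |> String.split(["\r\n", "\n"])
  |> Enum.with_index(1)                                        # 1-based physical line numbers
  |> Enum.drop(1)                                              # line 1 is the header
  |> Enum.reject(fn {line, _line_number} -> String.trim(line) == "" end)
  |> Enum.map(fn {line, line_number} -> {line_number, parse_record(line, delimiter)} end)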

Authorization

Enforcement points:

  1. LiveView event level: check admin permission in handle_event("start_import", ...)
  2. UI level: render import section only for admin users
  3. Static templates: public assets (no authorization needed)

Use Mv.Authorization.PermissionSets (preferred) instead of hard-coded string checks where possible.
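A sketch of the event-level check; admin?/1 is an assumed helper name on PermissionSets, not a confirmed API:

def handle_event("start_import", _params, socket) do
  if Mv.Authorization.PermissionSets.admin?(socket.assigns.current_user) do
    # consume the upload, call MemberCSV.prepare/2, kick off chunk processing
    {:noreply, socket}
  else
    {:noreply, put_flash(socket, :error, "Only administrators can import members.")}
  end
end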

Safety Limits

  • File size enforced by allow_upload (max_file_size)
  • Row count enforced in MemberCSV.prepare/2 before processing starts
  • Chunking is done via LiveView handle_info loop (sequential, cooperative scheduling)

Implementation Issues

Issue #1: CSV Specification & Static Template Files

Dependencies: None

Goal: Define CSV contract and add static templates.

Tasks:

  • Finalize header mapping variants
  • Document normalization rules
  • Document delimiter detection strategy
  • Create templates in priv/static/templates/ (UTF-8 with BOM)
  • Document template URLs and how to link them from LiveView
  • Document line number semantics (physical CSV line numbers)

Definition of Done:

  • Templates open cleanly in Excel/LibreOffice
  • CSV spec section complete

Issue #2: Import Service Module Skeleton

Dependencies: None

Goal: Create service API and error types.

API (recommended):

  • prepare/2 — parse + map + limit checks, returns import_state
  • process_chunk/3 — process one chunk (pure-ish), returns per-chunk results

Tasks:

  • Create lib/mv/membership/import/member_csv.ex
  • Define public function: prepare/2 (file_content, opts \\ [])
  • Define public function: process_chunk/3 (chunk_rows_with_lines, column_map, opts \\ [])
  • Define error struct: %MemberCSV.Error{csv_line_number: integer, field: atom | nil, message: String.t}
  • Document module + API
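A minimal skeleton for the module and error struct (function bodies are placeholders):

defmodule Mv.Membership.Import.MemberCSV do
  @moduledoc "CSV member import: prepare/2 builds the import state, process_chunk/3 runs one chunk."

  defmodule Error do
    @enforce_keys [:csv_line_number, :message]
    defstruct [:csv_line_number, :field, :message]

    @type t :: %__MODULE__{
            csv_line_number: pos_integer(),
            field: atom() | nil,
            message: String.t()
          }
  end

  @spec prepare(binary(), keyword()) :: {:ok, map()} | {:error, String.t()}
  def prepare(file_content, opts \\ []) do
    # parse + map + limit checks -> {:ok, import_state} | {:error, reason}
    _ = {file_content, opts}
    {:error, "not implemented"}
  end

  @spec process_chunk(list(), map(), keyword()) :: {:ok, map()}
  def process_chunk(chunk_rows_with_lines, column_map, opts \\ []) do
    # validate + create per row -> {:ok, %{inserted: n, failed: n, errors: [...]}}
    _ = {chunk_rows_with_lines, column_map, opts}
    {:ok, %{inserted: 0, failed: 0, errors: []}}
  end
end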

Issue #3: CSV Parsing + Delimiter Auto-Detection + BOM Handling

Dependencies: Issue #2

Goal: Parse CSV robustly with correct delimiter detection and BOM handling.

Tasks:

  • Verify/add NimbleCSV dependency ({:nimble_csv, "~> 1.0"})
  • Create lib/mv/membership/import/csv_parser.ex
  • Implement strip_bom/1 and apply it before any header handling
  • Handle \r\n and \n line endings (trim \r on header record)
  • Detect delimiter via header recognition (try ; and ,)
  • Parse CSV and return:
    • headers :: [String.t()]
    • rows :: [{csv_line_number, [String.t()]}] or directly [{csv_line_number, row_map}]
  • Skip completely empty records (but preserve correct physical line numbers)
  • Return {:ok, headers, rows} or {:error, reason}

Definition of Done:

  • BOM handling works (Excel exports)
  • Delimiter detection works reliably
  • Rows carry correct csv_line_number

Issue #4: Header Normalization + Per-Header Mapping (No Language Detection)

Dependencies: Issue #3

Goal: Map each header individually to canonical fields (normalized comparison).

Tasks:

  • Create lib/mv/membership/import/header_mapper.ex
  • Implement normalize_header/1
  • Normalize mapping variants once and compare normalized strings
  • Build column_map (canonical field -> column index)
  • Early abort if required headers missing (email)
  • Ignore unknown columns (member fields only)
  • Separate custom field column detection (by name, with normalization)

Definition of Done:

  • English/German headers map correctly
  • Missing required columns fails fast

Issue #5: Validation (Required Fields) + Error Formatting

Dependencies: Issue #4

Goal: Validate each row and return structured, translatable errors.

Tasks:

  • Implement validate_row/3 (row_map, csv_line_number, opts)
  • Required field presence (email)
  • Email format validation (EctoCommons.EmailValidator)
  • Trim values before validation
  • Gettext-backed error messages
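A minimal sketch of validate_row/3 inside MemberCSV (so Error refers to the nested struct); the regex is a simplified stand-in for the EctoCommons.EmailValidator check, and the plain strings would be wrapped in gettext in the real implementation:

def validate_row(row_map, csv_line_number, _opts \\ []) do
  email = row_map |> Map.get(:email, "") |> String.trim()

  cond do
    email == "" ->
      {:error, %Error{csv_line_number: csv_line_number, field: :email, message: "can't be blank"}}

    not String.match?(email, ~r/^[^\s@]+@[^\s@]+\.[^\s@]+$/) ->
      {:error, %Error{csv_line_number: csv_line_number, field: :email, message: "is not a valid email"}}

    true ->
      {:ok, Map.put(row_map, :email, email)}
  end
end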

Issue #6: Persistence via Ash Create + Per-Row Error Capture (Chunked Processing)

Dependencies: Issue #5

Goal: Create members and capture errors per row with correct CSV line numbers.

Tasks:

  • Implement process_chunk/3 in service:
    • Input: [{csv_line_number, row_map}]
    • Validate + create sequentially
    • Collect counts + errors per chunk (the LiveView enforces the overall 50-error cap across chunks)
  • Implement Ash error formatter helper:
    • Convert Ash.Error.Invalid into %MemberCSV.Error{}
    • Prefer field-level errors where possible (attach field atom)
    • Handle unique email constraint error as user-friendly message
  • Map row_map to Ash attrs (%{first_name: ..., ...})

Important: Do not recompute line numbers in this layer—use the ones provided by the parser.
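A sketch of process_chunk/3 under these constraints; Mv.Membership.create_member/1 is the creation call named in the Technology Stack section, while build_member_attrs/2 and format_ash_errors/2 are hypothetical helpers:

def process_chunk(chunk_rows_with_lines, column_map, opts \\ []) do
  result =
    Enum.reduce(chunk_rows_with_lines, %{inserted: 0, failed: 0, errors: []}, fn
      {csv_line_number, row_map}, acc ->
        with {:ok, validated} <- validate_row(row_map, csv_line_number, opts),
             attrs = build_member_attrs(validated, column_map),
             {:ok, _member} <- Mv.Membership.create_member(attrs) do
          %{acc | inserted: acc.inserted + 1}
        else
          {:error, %Error{} = error} ->
            %{acc | failed: acc.failed + 1, errors: [error | acc.errors]}

          {:error, ash_error} ->
            # e.g. Ash.Error.Invalid, including the unique email constraint
            errors = format_ash_errors(ash_error, csv_line_number)
            %{acc | failed: acc.failed + 1, errors: errors ++ acc.errors}
        end
    end)

  {:ok, %{result | errors: Enum.reverse(result.errors)}}
end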


Issue #7: LiveView Import UI (Upload, Progress, Results, Template Links)

Dependencies: Issue #6

Goal: UI section with upload, progress, results, and template links.

Tasks:

  • Render import section only for admins
  • Add prominent UI notice about custom fields:
    • Display alert/info box: "Custom fields must be created in Mila before importing CSV files with custom field columns"
    • Explain: "Use the custom field name as the CSV column header (same normalization as member fields applies)"
    • Add link to custom fields management section
  • Configure allow_upload/3:
    • .csv only, max_entries: 1, max_file_size: 10MB, auto_upload: false (see the mount sketch at the end of this issue)
  • handle_event("start_import", ...):
    • Admin permission check
    • Consume upload -> read file content
    • Call MemberCSV.prepare/2
    • Store import_state in assigns (chunks + column_map + metadata)
    • Initialize progress assigns
    • send(self(), {:process_chunk, 0})
  • handle_info({:process_chunk, idx}, socket):
    • Fetch chunk from import_state
    • Call MemberCSV.process_chunk/3
    • Merge counts/errors into progress assigns (cap errors at 50 overall)
    • Schedule next chunk (or finish and show results)
  • Results UI:
    • Success count
    • Failure count
    • Error list (line number + message + field)
    • Warning messages for unknown custom field columns (non-existent names) shown in results

Template links:

  • Link /templates/member_import_en.csv and /templates/member_import_de.csv via Phoenix static path helpers.
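A minimal sketch of the upload configuration, to be merged into the existing mount/3 of GlobalSettingsLive (the :csv_file name matches the appendix example; the size limit mirrors the config in Issue #8):

def mount(_params, _session, socket) do
  socket =
    socket
    |> assign(:importing?, false)
    |> assign(:import_progress, nil)
    |> allow_upload(:csv_file,
      accept: ~w(.csv),
      max_entries: 1,
      max_file_size: 10_000_000,
      auto_upload: false
    )

  {:ok, socket}
end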

Issue #8: Authorization + Limits

Dependencies: None (can be parallelized)

Goal: Ensure admin-only access and enforce limits.

Tasks:

  • Admin check in start import event handler
  • File size enforced in upload config
  • Row limit enforced in MemberCSV.prepare/2 (max_rows from config)
  • Configuration:
    config :mv, csv_import: [
      max_file_size_mb: 10,
      max_rows: 1000
    ]
    

Issue #9: End-to-End LiveView Tests + Fixtures

Dependencies: Issue #7 and #8

Tasks:

  • Fixtures:
    • valid EN/DE (core fields only)
    • valid with custom fields
    • invalid
    • unknown custom field name (non-existent, should show warning)
    • too many rows (1,001)
    • BOM + ; delimiter fixture
    • fixture with empty line(s) to validate correct line numbers
  • LiveView tests:
    • admin sees section, non-admin does not
    • upload + start import
    • success + error rendering
    • row limit + file size errors
    • custom field import success
    • custom field import warning (non-existent name, column ignored)

Issue #10: Documentation Polish (Inline Help Text + Docs)

Dependencies: Issue #9

Tasks:

  • UI help text + translations
  • CHANGELOG entry
  • Ensure moduledocs/docs

Issue #11: Custom Field Import

Dependencies: Issue #6 (Persistence)

Priority: High (Core v1 Feature)

Goal: Support importing custom field values from CSV columns. Custom fields should exist in Mila before import for best results.

Important Requirements:

  • Custom fields should be created in Mila first - Unknown custom field columns will be ignored with a warning message
  • CSV headers for custom fields must match the custom field name exactly (same normalization as member fields applies)
  • Custom field values are validated according to the custom field type (string, integer, boolean, date, email)
  • Unknown custom field columns (non-existent names) will be ignored with a warning - import continues

Tasks:

  • Extend header_mapper.ex to detect custom field columns by name (using same normalization as member fields)
  • Query existing custom fields during prepare/2 to map custom field columns
  • Collect unknown custom field columns and add warning messages (don't fail import)
  • Map custom field CSV values to CustomFieldValue creation in process_chunk/3
  • Handle custom field type validation (string, integer, boolean, date, email)
  • Create CustomFieldValue records linked to members during import
  • Update error messages to include custom field validation errors
  • Add UI help text explaining custom field requirements:
    • "Custom fields must be created in Mila before importing"
    • "Use the custom field name as the CSV column header (same normalization as member fields)"
    • Link to custom fields management section
  • Update CSV templates documentation to explain custom field columns
  • Add tests for custom field import (valid, invalid name, type validation, warning for unknown)
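A sketch of the custom field column mapping during prepare/2; Mv.Membership.list_custom_fields!/0 is a hypothetical query function, and the warning string would be wrapped in gettext in the real implementation:

# unmapped_headers: {normalized_header, column_index} pairs not matched to core member fields
def map_custom_field_columns(unmapped_headers) do
  alias Mv.Membership.Import.HeaderMapper

  custom_fields_by_name =
    Mv.Membership.list_custom_fields!()
    |> Map.new(fn custom_field -> {HeaderMapper.normalize_header(custom_field.name), custom_field} end)

  Enum.reduce(unmapped_headers, %{custom_field_map: %{}, warnings: []}, fn
    {normalized_header, index}, acc ->
      case Map.fetch(custom_fields_by_name, normalized_header) do
        {:ok, custom_field} ->
          put_in(acc, [:custom_field_map, index], custom_field)

        :error ->
          warning = "Column \"#{normalized_header}\" does not match any custom field and will be ignored"
          %{acc | warnings: [warning | acc.warnings]}
      end
  end)
end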

Definition of Done:

  • Custom field columns are recognized by name (with normalization)
  • Warning messages shown for unknown custom field columns (import continues)
  • Custom field values are created and linked to members
  • Type validation works for all custom field types
  • UI clearly explains custom field requirements
  • Tests cover custom field import scenarios (including warning for unknown names)

Rollout & Risks

Rollout Strategy

  • Dev → Staging → Production (with anonymized real-world CSV tests)

Risks & Mitigations

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Large import timeout | High | Medium | 10 MB + 1,000 rows, chunking via handle_info |
| Encoding issues | Medium | Medium | BOM stripping, templates with BOM |
| Invalid CSV format | Medium | High | Clear errors + templates |
| Duplicate emails | Low | High | Ash constraint error -> user-friendly message |
| Performance (no background jobs) | Medium | Low | Small limits, sequential chunk processing |
| Admin access bypass | High | Low | Event-level auth + UI hiding |
| Data corruption | High | Low | Per-row validation + best-effort |

Appendix

Module File Structure

lib/
├── mv/
│   └── membership/
│       └── import/
│           ├── member_csv.ex          # prepare + process_chunk
│           ├── csv_parser.ex          # delimiter detection + parsing + BOM handling
│           └── header_mapper.ex       # normalization + header mapping
└── mv_web/
    └── live/
        └── global_settings_live.ex    # add import section + LV message loop

priv/
└── static/
    └── templates/
        ├── member_import_en.csv
        └── member_import_de.csv

test/
├── mv/
│   └── membership/
│       └── import/
│           ├── member_csv_test.exs
│           ├── csv_parser_test.exs
│           └── header_mapper_test.exs
└── fixtures/
    ├── member_import_en.csv
    ├── member_import_de.csv
    ├── member_import_invalid.csv
    ├── member_import_large.csv
    └── member_import_empty_lines.csv

Example Usage (LiveView)

def handle_event("start_import", _params, socket) do
  assert_admin!(socket.assigns.current_user)

  [content] =
    consume_uploaded_entries(socket, :csv_file, fn %{path: path}, _entry ->
      {:ok, File.read!(path)}
    end)

  case Mv.Membership.Import.MemberCSV.prepare(content) do
    {:ok, import_state} ->
      socket =
        socket
        |> assign(:import_state, import_state)
        |> assign(:import_progress, %{processed: 0, inserted: 0, failed: 0, errors: []})
        |> assign(:importing?, true)

      send(self(), {:process_chunk, 0})
      {:noreply, socket}

    {:error, reason} ->
      {:noreply, put_flash(socket, :error, reason)}
  end
end

def handle_info({:process_chunk, idx}, socket) do
  %{chunks: chunks, column_map: column_map} = socket.assigns.import_state

  case Enum.at(chunks, idx) do
    nil ->
      {:noreply, assign(socket, importing?: false)}

    chunk_rows_with_lines ->
      {:ok, chunk_result} =
        Mv.Membership.Import.MemberCSV.process_chunk(chunk_rows_with_lines, column_map)

      socket = merge_progress(socket, chunk_result) # caps errors at 50 overall

      send(self(), {:process_chunk, idx + 1})
      {:noreply, socket}
  end
end
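The merge_progress/2 helper used above is not shown; a minimal sketch that caps the stored errors at 50 overall:

defp merge_progress(socket, chunk_result) do
  progress = socket.assigns.import_progress

  errors = Enum.take(progress.errors ++ chunk_result.errors, 50)

  assign(socket, :import_progress, %{
    processed: progress.processed + chunk_result.inserted + chunk_result.failed,
    inserted: progress.inserted + chunk_result.inserted,
    failed: progress.failed + chunk_result.failed,
    errors: errors
  })
end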

End of Implementation Plan