Merge pull request 'Implements error capping' (#356) from feature/334_Ash_persistence into main

Reviewed-on: #356
This commit is contained in:
carla 2026-01-19 13:14:59 +01:00
commit fef2cce283
4 changed files with 280 additions and 69 deletions

View file

@ -2,10 +2,29 @@
**Version:** 1.0
**Date:** 2025-01-XX
**Status:** Ready for Implementation
**Status:** In Progress (Backend Complete, UI Pending)
**Related Documents:**
- [Feature Roadmap](./feature-roadmap.md) - Overall feature planning
## Implementation Status
**Completed Issues:**
- ✅ Issue #1: CSV Specification & Static Template Files
- ✅ Issue #2: Import Service Module Skeleton
- ✅ Issue #3: CSV Parsing + Delimiter Auto-Detection + BOM Handling
- ✅ Issue #4: Header Normalization + Per-Header Mapping
- ✅ Issue #5: Validation (Required Fields) + Error Formatting
- ✅ Issue #6: Persistence via Ash Create + Per-Row Error Capture (with Error-Capping)
- ✅ Issue #11: Custom Field Import (Backend)
**In Progress / Pending:**
- ⏳ Issue #7: Admin Global Settings LiveView UI (Upload + Start Import + Results)
- ⏳ Issue #8: Authorization + Limits
- ⏳ Issue #9: End-to-End LiveView Tests + Fixtures
- ⏳ Issue #10: Documentation Polish
**Latest Update:** Error-Capping in `process_chunk/4` implemented (2025-01-XX)
---
## Table of Contents
@ -332,19 +351,24 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Dependencies:** None
**Status:** ✅ **COMPLETED**
**Goal:** Define CSV contract and add static templates.
**Tasks:**
- [ ] Finalize header mapping variants
- [ ] Document normalization rules
- [ ] Document delimiter detection strategy
- [ ] Create templates in `priv/static/templates/` (UTF-8 with BOM)
- [ ] Document template URLs and how to link them from LiveView
- [ ] Document line number semantics (physical CSV line numbers)
- [x] Finalize header mapping variants
- [x] Document normalization rules
- [x] Document delimiter detection strategy
- [x] Create templates in `priv/static/templates/` (UTF-8 with BOM)
- `member_import_en.csv` with English headers
- `member_import_de.csv` with German headers
- [x] Document template URLs and how to link them from LiveView
- [x] Document line number semantics (physical CSV line numbers)
- [x] Templates included in `MvWeb.static_paths()` configuration
**Definition of Done:**
- [ ] Templates open cleanly in Excel/LibreOffice
- [ ] CSV spec section complete
- [x] Templates open cleanly in Excel/LibreOffice
- [x] CSV spec section complete
---
@ -352,18 +376,20 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Dependencies:** None
**Status:** ✅ **COMPLETED**
**Goal:** Create service API and error types.
**API (recommended):**
- `prepare/2` — parse + map + limit checks, returns import_state
- `process_chunk/3` — process one chunk (pure-ish), returns per-chunk results
- `process_chunk/4` — process one chunk (pure-ish), returns per-chunk results
**Tasks:**
- [ ] Create `lib/mv/membership/import/member_csv.ex`
- [ ] Define public function: `prepare/2 (file_content, opts \\ [])`
- [ ] Define public function: `process_chunk/3 (chunk_rows_with_lines, column_map, opts \\ [])`
- [ ] Define error struct: `%MemberCSV.Error{csv_line_number: integer, field: atom | nil, message: String.t}`
- [ ] Document module + API
- [x] Create `lib/mv/membership/import/member_csv.ex`
- [x] Define public function: `prepare/2 (file_content, opts \\ [])`
- [x] Define public function: `process_chunk/4 (chunk_rows_with_lines, column_map, custom_field_map, opts \\ [])`
- [x] Define error struct: `%MemberCSV.Error{csv_line_number: integer, field: atom | nil, message: String.t}`
- [x] Document module + API
---
@ -371,24 +397,26 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Dependencies:** Issue #2
**Status:** ✅ **COMPLETED**
**Goal:** Parse CSV robustly with correct delimiter detection and BOM handling.
**Tasks:**
- [ ] Verify/add NimbleCSV dependency (`{:nimble_csv, "~> 1.0"}`)
- [ ] Create `lib/mv/membership/import/csv_parser.ex`
- [ ] Implement `strip_bom/1` and apply it **before** any header handling
- [ ] Handle `\r\n` and `\n` line endings (trim `\r` on header record)
- [ ] Detect delimiter via header recognition (try `;` and `,`)
- [ ] Parse CSV and return:
- [x] Verify/add NimbleCSV dependency (`{:nimble_csv, "~> 1.0"}`)
- [x] Create `lib/mv/membership/import/csv_parser.ex`
- [x] Implement `strip_bom/1` and apply it **before** any header handling
- [x] Handle `\r\n` and `\n` line endings (trim `\r` on header record)
- [x] Detect delimiter via header recognition (try `;` and `,`)
- [x] Parse CSV and return:
- `headers :: [String.t()]`
- `rows :: [{csv_line_number, [String.t()]}]` or directly `[{csv_line_number, row_map}]`
- [ ] Skip completely empty records (but preserve correct physical line numbers)
- [ ] Return `{:ok, headers, rows}` or `{:error, reason}`
- `rows :: [{csv_line_number, [String.t()]}]` with correct physical line numbers
- [x] Skip completely empty records (but preserve correct physical line numbers)
- [x] Return `{:ok, headers, rows}` or `{:error, reason}`
**Definition of Done:**
- [ ] BOM handling works (Excel exports)
- [ ] Delimiter detection works reliably
- [ ] Rows carry correct `csv_line_number`
- [x] BOM handling works (Excel exports)
- [x] Delimiter detection works reliably
- [x] Rows carry correct `csv_line_number`
---
@ -396,20 +424,22 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Dependencies:** Issue #3
**Status:** ✅ **COMPLETED**
**Goal:** Map each header individually to canonical fields (normalized comparison).
**Tasks:**
- [ ] Create `lib/mv/membership/import/header_mapper.ex`
- [ ] Implement `normalize_header/1`
- [ ] Normalize mapping variants once and compare normalized strings
- [ ] Build `column_map` (canonical field -> column index)
- [ ] **Early abort if required headers missing** (`email`)
- [ ] Ignore unknown columns (member fields only)
- [ ] **Separate custom field column detection** (by name, with normalization)
- [x] Create `lib/mv/membership/import/header_mapper.ex`
- [x] Implement `normalize_header/1`
- [x] Normalize mapping variants once and compare normalized strings
- [x] Build `column_map` (canonical field -> column index)
- [x] **Early abort if required headers missing** (`email`)
- [x] Ignore unknown columns (member fields only)
- [x] **Separate custom field column detection** (by name, with normalization)
**Definition of Done:**
- [ ] English/German headers map correctly
- [ ] Missing required columns fails fast
- [x] English/German headers map correctly
- [x] Missing required columns fails fast
---
@ -417,14 +447,16 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Dependencies:** Issue #4
**Status:** ✅ **COMPLETED**
**Goal:** Validate each row and return structured, translatable errors.
**Tasks:**
- [ ] Implement `validate_row/3 (row_map, csv_line_number, opts)`
- [ ] Required field presence (`email`)
- [ ] Email format validation (EctoCommons.EmailValidator)
- [ ] Trim values before validation
- [ ] Gettext-backed error messages
- [x] Implement `validate_row/3 (row_map, csv_line_number, opts)`
- [x] Required field presence (`email`)
- [x] Email format validation (EctoCommons.EmailValidator)
- [x] Trim values before validation
- [x] Gettext-backed error messages
---
@ -432,21 +464,32 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Dependencies:** Issue #5
**Status:** ✅ **COMPLETED**
**Goal:** Create members and capture errors per row with correct CSV line numbers.
**Tasks:**
- [ ] Implement `process_chunk/3` in service:
- [x] Implement `process_chunk/4` in service:
- Input: `[{csv_line_number, row_map}]`
- Validate + create sequentially
- Collect counts + first 50 errors (per import overall; LiveView enforces cap across chunks)
- [ ] Implement Ash error formatter helper:
- **Error-Capping:** Supports `existing_error_count` and `max_errors` in opts (default: 50)
- **Error-Capping:** Only collects errors if under limit, but continues processing all rows
- **Error-Capping:** `failed` count is always accurate, even when errors are capped
- [x] Implement Ash error formatter helper:
- Convert `Ash.Error.Invalid` into `%MemberCSV.Error{}`
- Prefer field-level errors where possible (attach `field` atom)
- Handle unique email constraint error as user-friendly message
- [ ] Map row_map to Ash attrs (`%{first_name: ..., ...}`)
- [x] Map row_map to Ash attrs (`%{first_name: ..., ...}`)
- [x] Custom field value processing and creation
**Important:** **Do not recompute line numbers** in this layer—use the ones provided by the parser.
**Implementation Notes:**
- `process_chunk/4` accepts `opts` with `existing_error_count` and `max_errors` for error capping across chunks
- Error capping respects the limit per import overall (not per chunk)
- Processing continues even after error limit is reached (for accurate counts)
---
### Issue #7: Admin Global Settings LiveView UI (Upload + Start Import + Results + Template Links)
@ -546,6 +589,8 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
**Priority:** High (Core v1 Feature)
**Status:** ✅ **COMPLETED** (Backend Implementation)
**Goal:** Support importing custom field values from CSV columns. Custom fields should exist in Mila before import for best results.
**Important Requirements:**
@ -555,27 +600,32 @@ Use `Mv.Authorization.PermissionSets` (preferred) instead of hard-coded string c
- Unknown custom field columns (non-existent names) will be ignored with a warning - import continues
**Tasks:**
- [ ] Extend `header_mapper.ex` to detect custom field columns by name (using same normalization as member fields)
- [ ] Query existing custom fields during `prepare/2` to map custom field columns
- [ ] Collect unknown custom field columns and add warning messages (don't fail import)
- [ ] Map custom field CSV values to `CustomFieldValue` creation in `process_chunk/3`
- [ ] Handle custom field type validation (string, integer, boolean, date, email)
- [ ] Create `CustomFieldValue` records linked to members during import
- [ ] Update error messages to include custom field validation errors
- [ ] Add UI help text explaining custom field requirements:
- [x] Extend `header_mapper.ex` to detect custom field columns by name (using same normalization as member fields)
- [x] Query existing custom fields during `prepare/2` to map custom field columns
- [x] Collect unknown custom field columns and add warning messages (don't fail import)
- [x] Map custom field CSV values to `CustomFieldValue` creation in `process_chunk/4`
- [x] Handle custom field type validation (string, integer, boolean, date, email)
- [x] Create `CustomFieldValue` records linked to members during import
- [ ] Update error messages to include custom field validation errors (if needed)
- [ ] Add UI help text explaining custom field requirements (pending Issue #7):
- "Custom fields must be created in Mila before importing"
- "Use the custom field name as the CSV column header (same normalization as member fields)"
- Link to custom fields management section
- [ ] Update CSV templates documentation to explain custom field columns
- [ ] Add tests for custom field import (valid, invalid name, type validation, warning for unknown)
- [ ] Update CSV templates documentation to explain custom field columns (pending Issue #1)
- [x] Add tests for custom field import (valid, invalid name, type validation, warning for unknown)
**Definition of Done:**
- [ ] Custom field columns are recognized by name (with normalization)
- [ ] Warning messages shown for unknown custom field columns (import continues)
- [ ] Custom field values are created and linked to members
- [ ] Type validation works for all custom field types
- [ ] UI clearly explains custom field requirements
- [ ] Tests cover custom field import scenarios (including warning for unknown names)
- [x] Custom field columns are recognized by name (with normalization)
- [x] Warning messages shown for unknown custom field columns (import continues)
- [x] Custom field values are created and linked to members
- [x] Type validation works for all custom field types
- [ ] UI clearly explains custom field requirements (pending Issue #7)
- [x] Tests cover custom field import scenarios (including warning for unknown names)
**Implementation Notes:**
- Custom field lookup is built in `prepare/2` and passed via `custom_field_lookup` in opts
- Custom field values are formatted according to type in `format_custom_field_value/2`
- Unknown custom field columns generate warnings in `import_state.warnings`
---

View file

@ -70,7 +70,8 @@ defmodule Mv.Membership.Import.MemberCSV do
@type chunk_result :: %{
inserted: non_neg_integer(),
failed: non_neg_integer(),
errors: list(Error.t())
errors: list(Error.t()),
errors_truncated?: boolean()
}
alias Mv.Membership.Import.CsvParser
@ -258,7 +259,18 @@ defmodule Mv.Membership.Import.MemberCSV do
- `row_map` - Map with `:member` and `:custom` keys containing field values
- `column_map` - Map of canonical field names (atoms) to column indices (for reference)
- `custom_field_map` - Map of custom field IDs (strings) to column indices (for reference)
- `opts` - Optional keyword list for processing options
- `opts` - Optional keyword list for processing options:
- `:custom_field_lookup` - Map of custom field IDs to metadata (default: `%{}`)
- `:existing_error_count` - Number of errors already collected in previous chunks (default: `0`)
- `:max_errors` - Maximum number of errors to collect per import overall (default: `50`)
## Error Capping
Errors are capped at `max_errors` per import overall. When the limit is reached:
- No additional errors are collected in the `errors` list
- Processing continues for all rows
- The `failed` count continues to increment correctly for all failed rows
- The `errors_truncated?` flag is set to `true` to indicate that additional errors were suppressed
## Returns
@ -272,6 +284,11 @@ defmodule Mv.Membership.Import.MemberCSV do
iex> custom_field_map = %{}
iex> MemberCSV.process_chunk(chunk, column_map, custom_field_map)
{:ok, %{inserted: 1, failed: 0, errors: []}}
iex> chunk = [{2, %{member: %{email: "invalid"}, custom: %{}}}]
iex> opts = [existing_error_count: 25, max_errors: 50]
iex> MemberCSV.process_chunk(chunk, %{}, %{}, opts)
{:ok, %{inserted: 0, failed: 1, errors: [%Error{}], errors_truncated?: false}}
"""
@spec process_chunk(
list({pos_integer(), map()}),
@ -281,20 +298,34 @@ defmodule Mv.Membership.Import.MemberCSV do
) :: {:ok, chunk_result()} | {:error, String.t()}
def process_chunk(chunk_rows_with_lines, _column_map, _custom_field_map, opts \\ []) do
custom_field_lookup = Keyword.get(opts, :custom_field_lookup, %{})
existing_error_count = Keyword.get(opts, :existing_error_count, 0)
max_errors = Keyword.get(opts, :max_errors, 50)
{inserted, failed, errors, _collected_error_count, truncated?} =
Enum.reduce(chunk_rows_with_lines, {0, 0, [], 0, false}, fn {line_number, row_map},
{acc_inserted, acc_failed, acc_errors, acc_error_count, acc_truncated?} ->
current_error_count = existing_error_count + acc_error_count
{inserted, failed, errors} =
Enum.reduce(chunk_rows_with_lines, {0, 0, []}, fn {line_number, row_map},
{acc_inserted, acc_failed, acc_errors} ->
case process_row(row_map, line_number, custom_field_lookup) do
{:ok, _member} ->
{acc_inserted + 1, acc_failed, acc_errors}
{acc_inserted + 1, acc_failed, acc_errors, acc_error_count, acc_truncated?}
{:error, error} ->
{acc_inserted, acc_failed + 1, [error | acc_errors]}
new_acc_failed = acc_failed + 1
# Only collect errors if under limit
{new_acc_errors, new_error_count, new_truncated?} =
if current_error_count < max_errors do
{[error | acc_errors], acc_error_count + 1, acc_truncated?}
else
{acc_errors, acc_error_count, true}
end
{acc_inserted, new_acc_failed, new_acc_errors, new_error_count, new_truncated?}
end
end)
{:ok, %{inserted: inserted, failed: failed, errors: Enum.reverse(errors)}}
{:ok, %{inserted: inserted, failed: failed, errors: Enum.reverse(errors), errors_truncated?: truncated?}}
end
@doc """

View file

@ -325,6 +325,136 @@ defmodule Mv.Membership.Import.MemberCSVTest do
# Check that @doc exists by reading the module
assert function_exported?(MemberCSV, :process_chunk, 4)
end
test "error capping collects exactly 50 errors" do
# Create 50 rows with invalid emails
chunk_rows_with_lines =
1..50
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "invalid-email-#{i}"}, custom: %{}}}
end)
column_map = %{email: 0}
custom_field_map = %{}
opts = [existing_error_count: 0, max_errors: 50]
assert {:ok, chunk_result} =
MemberCSV.process_chunk(chunk_rows_with_lines, column_map, custom_field_map, opts)
assert chunk_result.inserted == 0
assert chunk_result.failed == 50
assert length(chunk_result.errors) == 50
end
test "error capping collects only first 50 errors when more than 50 errors occur" do
# Create 60 rows with invalid emails
chunk_rows_with_lines =
1..60
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "invalid-email-#{i}"}, custom: %{}}}
end)
column_map = %{email: 0}
custom_field_map = %{}
opts = [existing_error_count: 0, max_errors: 50]
assert {:ok, chunk_result} =
MemberCSV.process_chunk(chunk_rows_with_lines, column_map, custom_field_map, opts)
assert chunk_result.inserted == 0
assert chunk_result.failed == 60
assert length(chunk_result.errors) == 50
end
test "error capping respects existing_error_count" do
# Create 30 rows with invalid emails
chunk_rows_with_lines =
1..30
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "invalid-email-#{i}"}, custom: %{}}}
end)
column_map = %{email: 0}
custom_field_map = %{}
opts = [existing_error_count: 25, max_errors: 50]
assert {:ok, chunk_result} =
MemberCSV.process_chunk(chunk_rows_with_lines, column_map, custom_field_map, opts)
assert chunk_result.inserted == 0
assert chunk_result.failed == 30
# Should only collect 25 errors (25 existing + 25 new = 50 limit)
assert length(chunk_result.errors) == 25
end
test "error capping collects no errors when limit already reached" do
# Create 10 rows with invalid emails
chunk_rows_with_lines =
1..10
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "invalid-email-#{i}"}, custom: %{}}}
end)
column_map = %{email: 0}
custom_field_map = %{}
opts = [existing_error_count: 50, max_errors: 50]
assert {:ok, chunk_result} =
MemberCSV.process_chunk(chunk_rows_with_lines, column_map, custom_field_map, opts)
assert chunk_result.inserted == 0
assert chunk_result.failed == 10
assert length(chunk_result.errors) == 0
end
test "error capping with mixed success and failure" do
# Create 100 rows: 30 valid, 70 invalid
valid_rows =
1..30
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "valid#{i}@example.com"}, custom: %{}}}
end)
invalid_rows =
31..100
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "invalid-email-#{i}"}, custom: %{}}}
end)
chunk_rows_with_lines = valid_rows ++ invalid_rows
column_map = %{email: 0}
custom_field_map = %{}
opts = [existing_error_count: 0, max_errors: 50]
assert {:ok, chunk_result} =
MemberCSV.process_chunk(chunk_rows_with_lines, column_map, custom_field_map, opts)
assert chunk_result.inserted == 30
assert chunk_result.failed == 70
# Should only collect 50 errors (limit reached)
assert length(chunk_result.errors) == 50
end
test "error capping with custom max_errors" do
# Create 20 rows with invalid emails
chunk_rows_with_lines =
1..20
|> Enum.map(fn i ->
{i + 1, %{member: %{email: "invalid-email-#{i}"}, custom: %{}}}
end)
column_map = %{email: 0}
custom_field_map = %{}
opts = [existing_error_count: 0, max_errors: 10]
assert {:ok, chunk_result} =
MemberCSV.process_chunk(chunk_rows_with_lines, column_map, custom_field_map, opts)
assert chunk_result.inserted == 0
assert chunk_result.failed == 20
assert length(chunk_result.errors) == 10
end
end
describe "validate_row/3" do