perf: determinant hot-path optimizations #27

## Motivation

Determinant calculations dominate performance in downstream consumers (e.g. Delaunay flip/repair algorithms). These are safe-code-only changes that should meaningfully reduce latency for small fixed-size matrices.

## Proposed Changes

### 1. Closed-form `det` for small dimensions
Add specialized paths in `Matrix::det` (or a new `det_fast`) for D=1, 2, 3, and 4 that bypass LU factorization entirely:
- **1×1**: return `a[0][0]`
- **2×2**: `ad - bc` via a single `mul_add`
- **3×3**: Sarrus rule (6 `mul_add` terms)
- **4×4**: Laplace/cofactor expansion on first row (reduces to four 3×3 sub-determinants)

Expected: 3–5× speedup for these common dimensions.
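
A minimal sketch of what the closed-form paths could look like, assuming `f64` entries stored as `[[f64; D]; D]`. The free functions `det2`, `det3`, and `det4` are hypothetical names for illustration, not the crate's API. The 3×3 case is written as a first-row cofactor expansion, which expands to the same six products as the Sarrus rule.

```rust
/// 2x2: a*d - b*c, computed as a*d + (-(b*c)) with one fused multiply-add.
pub fn det2(m: &[[f64; 2]; 2]) -> f64 {
    m[0][0].mul_add(m[1][1], -(m[0][1] * m[1][0]))
}

/// 3x3: cofactor expansion along the first row.
pub fn det3(m: &[[f64; 3]; 3]) -> f64 {
    // 2x2 minors of rows 1..3 with column 0, 1, or 2 removed.
    let c0 = m[1][1].mul_add(m[2][2], -(m[1][2] * m[2][1]));
    let c1 = m[1][0].mul_add(m[2][2], -(m[1][2] * m[2][0]));
    let c2 = m[1][0].mul_add(m[2][1], -(m[1][1] * m[2][0]));
    // m00*c0 - m01*c1 + m02*c2
    m[0][0].mul_add(c0, (-m[0][1]).mul_add(c1, m[0][2] * c2))
}

/// 4x4: Laplace expansion along the first row into four 3x3 minors.
pub fn det4(m: &[[f64; 4]; 4]) -> f64 {
    let mut det = 0.0_f64;
    let mut sign = 1.0_f64;
    for col in 0..4 {
        // Build the minor that drops row 0 and column `col`.
        let mut minor = [[0.0_f64; 3]; 3];
        for r in 1..4 {
            let mut k = 0;
            for c in 0..4 {
                if c != col {
                    minor[r - 1][k] = m[r][c];
                    k += 1;
                }
            }
        }
        det = (sign * m[0][col]).mul_add(det3(&minor), det);
        sign = -sign;
    }
    det
}
```

These paths skip pivoting, so their rounding behavior can differ from the LU result. It may be worth comparing both on near-singular inputs before switching consumers over.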

### 2. Combine non-finite check with pivot search in `lu.rs`
The current pivot search (lines ~30–46) makes two conceptual passes: scanning for max-abs and separately checking `is_finite`. Merge them into a single loop body so each entry is touched once.
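
One way the merged loop could look, as a sketch only. `find_pivot` is a hypothetical helper, and it assumes the finite check applies to the entries of the pivot column below the diagonal; the actual error type and scan range in `lu.rs` may differ.

```rust
/// Scan column `col` from row `start` downward in a single pass.
/// Returns `None` as soon as a non-finite entry is seen, otherwise the row
/// index and absolute value of the largest-magnitude entry.
pub fn find_pivot<const D: usize>(
    rows: &[[f64; D]; D],
    col: usize,
    start: usize,
) -> Option<(usize, f64)> {
    let mut best_row = start;
    let mut best_abs = 0.0_f64;
    for r in start..D {
        let v = rows[r][col];
        // Finite check and max-abs update share one load of `v`.
        if !v.is_finite() {
            return None;
        }
        let a = v.abs();
        if a > best_abs {
            best_abs = a;
            best_row = r;
        }
    }
    Some((best_row, best_abs))
}
```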

### 3. FMA consistency in the elimination loop
`lu.rs` already uses `mul_add` in the inner elimination loop but not uniformly across all arithmetic. Audit all multiply-then-add patterns in hot loops and apply `mul_add` consistently.
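
For reference, the pattern in question is `x - f * p` rewritten as `(-f).mul_add(p, x)`. The sketch below shows one elimination step under assumed conventions: `eliminate_row` is a hypothetical name, and storing the multiplier in place of the eliminated entry is a common LU layout that may not match `lu.rs`.

```rust
/// Eliminate column `pivot` from row `target`, updating the rest of the row
/// with fused multiply-adds. Returns the multiplier that was used.
pub fn eliminate_row<const D: usize>(
    rows: &mut [[f64; D]; D],
    pivot: usize,
    target: usize,
) -> f64 {
    let factor = rows[target][pivot] / rows[pivot][pivot];
    // Arrays of f64 are Copy, so this avoids borrowing `rows` twice.
    let pivot_row = rows[pivot];
    for c in (pivot + 1)..D {
        // rows[target][c] - factor * pivot_row[c], rounded once.
        rows[target][c] = (-factor).mul_add(pivot_row[c], rows[target][c]);
    }
    // Assumption: the multiplier is kept in the eliminated slot.
    rows[target][pivot] = factor;
    factor
}
```

Two caveats for the audit: `mul_add` rounds once instead of twice, so results can change in the last bit, and on targets built without a hardware FMA instruction it may fall back to a slower software routine. The benchmarks should be run with the target features that consumers actually use.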

### 4. Replace `get`/`set` with direct indexing in hot paths
`Lu::factor` and `Lu::solve_vec` call `get`/`set` through wrappers that do bounds checks on every access. The indices are in-bounds by construction in those loops — use `self.rows[r][c]` directly and let the compiler elide the redundant checks.
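
A sketch of the direct-indexing style, under assumptions: the `Lu` struct below is a stand-in with a `rows: [[f64; D]; D]` field, `get` is a guess at what the checked wrapper does, and `forward_sub` is a hypothetical slice of what `solve_vec` might contain. Because the loop bounds are tied to the const generic `D`, the optimizer has a good chance of proving the indices in range.

```rust
/// Stand-in for the real type; field layout is an assumption.
pub struct Lu<const D: usize> {
    pub rows: [[f64; D]; D],
}

impl<const D: usize> Lu<D> {
    /// Checked accessor, shown only for contrast with the loop below.
    pub fn get(&self, r: usize, c: usize) -> Option<f64> {
        self.rows.get(r)?.get(c).copied()
    }

    /// Forward substitution with a unit lower-triangular L stored below
    /// the diagonal of `rows`. Uses direct indexing in the inner loop.
    pub fn forward_sub(&self, b: [f64; D]) -> [f64; D] {
        let mut y = b;
        for r in 0..D {
            let mut acc = y[r];
            for c in 0..r {
                // acc -= L[r][c] * y[c]
                acc = (-self.rows[r][c]).mul_add(y[c], acc);
            }
            y[r] = acc;
        }
        y
    }
}
```

Direct indexing still panics on an out-of-range index, so this stays within safe code; whether the checks are actually removed is worth confirming in the generated assembly or the benchmarks.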

### 5. Release profile tuning
Add to `Cargo.toml`:
```toml
[profile.release]
lto = "fat"
codegen-units = 1
```
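`lto = "fat"` enables cross-crate inlining, and `codegen-units = 1` lets LLVM optimize each crate as a single unit, so it can specialize/merge const-generic monomorphizations more aggressively. Note that Cargo only applies profile settings from the workspace root manifest, so this affects our own benchmarks and binaries; downstream consumers need to set the same options in their own `Cargo.toml` to see the benefit.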

## Acceptance Criteria
- All existing tests pass (`just test`)
- Benchmarks show improvement for `d2`–`d5` determinant cases (`cargo bench`)
- No `unsafe` code introduced
