Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
a835062
new
ttlequals0 Apr 10, 2026
e433a7b
new
ttlequals0 Apr 10, 2026
b949ca3
feat: add batch processing, keyword prefix, and Plex webhook support
ttlequals0 Apr 10, 2026
aa65934
resolve merge conflicts with origin/main
ttlequals0 Apr 10, 2026
9485bb8
fix: remove remaining emoji/unicode, improve webhook payload handling
ttlequals0 Apr 10, 2026
5ffee30
improve: Radarr/Sonarr matching and API call efficiency
ttlequals0 Apr 10, 2026
4882ae5
fix: remove last emojis from plex/client.go
ttlequals0 Apr 10, 2026
a59632d
improve: clean up verbose logging for TMDb ID lookup failures
ttlequals0 Apr 10, 2026
c59950a
improve: rewrite CLAUDE.md with build commands, architecture, convent…
ttlequals0 Apr 10, 2026
7d1c6cc
feat: add version tracking, log v1.2.0 on startup
ttlequals0 Apr 10, 2026
6fe9e0f
Merge pull request #1 from ttlequals0/feature/batch-webhook-prefix
ttlequals0 Apr 10, 2026
145b2e1
feat: webhook processes single items instead of full library scan
ttlequals0 Apr 11, 2026
e4685a1
fix: webhook items wait for in-flight scan; add library exclude and w…
ttlequals0 Apr 12, 2026
9b1b338
Merge pull request #2 from ttlequals0/fix/webhook-busy-wait
ttlequals0 Apr 12, 2026
d73efe9
fix(webhook): log reason for 400 responses
ttlequals0 Apr 12, 2026
d6ee9b4
fix(webhook): reject caused by GUID type mismatch; remove unused field
ttlequals0 Apr 12, 2026
0809005
Merge pull request #3 from ttlequals0/fix/webhook-400-logging
ttlequals0 Apr 12, 2026
e44ad7d
feat(webhook): add POST /scan endpoint for manual scan triggers
ttlequals0 Apr 12, 2026
16d3b48
Merge pull request #4 from ttlequals0/feature/manual-scan-endpoint
ttlequals0 Apr 12, 2026
3e474a2
fix(security): resolve CodeQL alerts (TLS verify, path injection, wor…
ttlequals0 Apr 21, 2026
1dba386
chore(security): bump Go to 1.26, Alpine to 3.22, harden .dockerignore
ttlequals0 Apr 21, 2026
9d4d518
fix(plex): redact X-Plex-Token (and api_key) from error messages - v1…
ttlequals0 Apr 21, 2026
cd03e08
Merge pull request #5 from ttlequals0/fix/codeql-security-alerts
ttlequals0 Apr 21, 2026
1329be9
Update docker-compose.yml
ttlequals0 May 8, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -1,10 +1,24 @@
# Git
.git
.gitignore
.github

# IDE
.vscode
.idea
.claude

# Secrets / env files
.env
.env.*
*.pem
*.key
*.pfx

# Local-only docs / scratch
CLAUDE.md
PR_DESCRIPTION.md
tmp/

# Go build artifacts
*.exe
Expand Down
5 changes: 4 additions & 1 deletion .github/workflows/release.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,9 @@ on:
- '.gitignore'
- 'example/**'

permissions:
contents: read

jobs:
check-changes:
runs-on: ubuntu-latest
Expand Down Expand Up @@ -212,7 +215,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: '1.21'
go-version: '1.26'

- name: Build binary
env:
Expand Down
286 changes: 129 additions & 157 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,180 +1,152 @@
# Changelog

## [Unreleased] - 2025-07-05
## [1.3.2] - 2026-04-21

### Security
- Plex client error messages now redact `X-Plex-Token`, `apikey`, and `api_key` query-string values. Previously a transport failure (e.g. TLS cert rejection) would bubble up Go's `*url.Error`, which embeds the full request URL including the Plex token, leaking it into logs/Loki. All `c.httpClient.Do` calls are now routed through a `safeDo` wrapper that scrubs the error string before returning.

## [1.3.1] - 2026-04-21

### Security
- Plex TLS certificate verification is now enabled by default. Previously, `InsecureSkipVerify` was hardcoded to `true`, which silently trusted any server certificate. Set `PLEX_INSECURE_SKIP_VERIFY=true` to opt back into skip-verify (e.g., for self-signed Plex certs). A `[WARN]` line is logged at startup when the flag is on. Resolves CodeQL `go/disabled-certificate-check`.
- Export paths are now validated to stay within `EXPORT_LOCATION`. Library names that resolve to `.`, `..`, or empty after sanitization are rejected, and a `safeJoin` helper verifies every export path stays inside the export root before writing. Resolves CodeQL `go/path-injection`.
- `.github/workflows/release.yml` now declares a least-privilege top-level `permissions: contents: read`. Per-job elevated permissions on `create-release`, `build-binaries`, and `publish-docker` are unchanged. Resolves CodeQL `actions/missing-workflow-permissions`.

### Added
- `PLEX_INSECURE_SKIP_VERIFY` environment variable (default `false`).

#### Radarr/Sonarr Integration
- ✅ Created Radarr API client module (`internal/radarr/`) with full API support
- Movie search by title, year, TMDb ID, IMDb ID, and file path
- Automatic TMDb ID extraction from Radarr database
- Connection testing and system status endpoints

- ✅ Created Sonarr API client module (`internal/sonarr/`) with full API support
- TV series search by title, year, TMDb ID, TVDb ID, IMDb ID, and file path
- Episode fetching and file path matching
- Connection testing and system status endpoints

- ✅ Updated configuration system to support Radarr/Sonarr
- Added `USE_RADARR` and `USE_SONARR` environment variables
- Added `RADARR_URL`, `RADARR_API_KEY`, `SONARR_URL`, `SONARR_API_KEY` configuration
- Added validation for Radarr/Sonarr settings when enabled

- ✅ Enhanced TMDb ID extraction to use multiple sources
- Primary: Plex metadata (existing functionality preserved)
- Secondary: Radarr/Sonarr API matching (new)
- Fallback: File path regex matching (existing functionality preserved)
- Added source tracking to show where TMDb ID was found

- ✅ Updated media processor to integrate with Radarr/Sonarr
- Modified `extractMovieTMDbID` to query Radarr when enabled
- Modified `extractTVShowTMDbID` to query Sonarr when enabled
- Added multiple matching strategies: title/year, IMDb ID, TVDb ID, file path

- ✅ Updated main application to initialize Radarr/Sonarr clients
- Added connection testing on startup
- Graceful handling when Radarr/Sonarr are not configured

- ✅ Created comprehensive docker-compose.yml example
- Includes all existing configuration options
- Added Radarr/Sonarr configuration examples with defaults

#### Verbose Logging & Debugging

- ✅ Added verbose logging feature
- New `VERBOSE_LOGGING` environment variable (default: false)
- Shows detailed TMDb ID lookup process for each item
- Displays all available Plex GUIDs (IMDb, TMDb, TVDb)
- Shows Radarr lookup attempts with title, file path, and IMDb ID matching
- Shows Sonarr lookup attempts with title, TVDb ID, IMDb ID, and file path matching
- Indicates source of successful TMDb ID matches
- Helps troubleshoot matching issues

- ✅ Added progress tracking for large libraries
- Shows percentage progress for libraries with >100 items
- Displays current processing status
- Shows summary of skipped items in verbose mode

- ✅ Enhanced label/genre application logging
- Shows when keywords are being applied to Plex
- Displays Plex API call timing in verbose mode
- Shows current and new keywords being merged
- Confirms successful application to Plex
### Changed
- Bumped Docker build base images to `golang:1.26-alpine` (builder) and `alpine:3.22` (runtime) to pick up patched Go stdlib and Alpine packages. Clears 1 CRITICAL + 8 HIGH CVEs surfaced by Trivy on the previous `golang:1.23-alpine` / `alpine:3.21` base (CVE-2025-68121, CVE-2025-58183, CVE-2025-61726/28/29, CVE-2026-25679/32280/32281/32283).
- `go.mod` now requires Go 1.26; `release.yml` `actions/setup-go` pinned to `1.26`.
- `.dockerignore` hardened to exclude `.env*`, `*.pem`, `*.key`, `.claude`, `.github`, `CLAUDE.md`, `PR_DESCRIPTION.md`, `tmp/`.

#### Persistent Storage
## [1.3.0] - 2026-04-12

- ✅ Added persistent storage for processed items
- Prevents reprocessing items after container restarts
- JSON file-based storage with atomic writes
- Tracks which field (label/genre) was updated for each item
- Configurable data directory via DATA_DIR environment variable
- Docker volume support for data persistence
- Storage directory defaults to `/data` in container
### Added
- `POST /scan` endpoint on the webhook server for manual scan triggers.
- `POST /scan` runs a full scan across all non-excluded libraries.
- `POST /scan?library=<id|name>` scans a single library; `library`
accepts the numeric Plex section ID or a case-insensitive library
title.
- Returns `202 Accepted` immediately; the scan runs in the
background. Returns `409 Conflict` if a scan is already in
progress, `404 Not Found` if the library param does not match, and
`405 Method Not Allowed` for non-POST requests.
- Works in `WEBHOOK_ONLY=true` mode — enables ad-hoc catchup scans
without toggling environment variables.

## [1.2.3] - 2026-04-12

### Fixed
- Every Plex webhook delivery was being rejected with `HTTP 400` because
`PlexWebhookPayload.Metadata.GUID` was typed as `string`, while Plex
sends it as an array of objects (`[{id:"imdb://..."}, ...]`) for
multi-provider items. `json.Unmarshal` failed before reaching the
event handler. The unused field has been removed; unknown JSON fields
are ignored by the decoder, so the shape no longer matters.

## [1.2.2] - 2026-04-12

### Changed
- Webhook 400 responses now log the specific failure reason (parse error,
missing payload field, or JSON unmarshal error) along with
`Content-Type`, form/file part keys, and a payload snippet. Previously
all three paths returned 400 with no log, making Plex delivery failures
invisible.

#### Error Handling & Connection Testing
## [1.2.1] - 2026-04-12

- ✅ Added TMDb API connection testing on startup
- Validates API token before processing begins
- Provides clear error messages for authentication failures
- Shows detailed error responses for debugging
### Added
- `MOVIE_LIBRARY_EXCLUDE` / `TV_LIBRARY_EXCLUDE`: comma-separated library
IDs to skip when `*_PROCESS_ALL=true`. Excluded libraries are filtered
from both timer-driven processing and webhook routing.
- `WEBHOOK_ONLY=true`: skips the startup full scan and the periodic timer
entirely, leaving the webhook server as the only trigger. Requires
`WEBHOOK_ENABLED=true`.

### Fixed
- Webhook items are no longer silently dropped when a full library scan is
in progress. `ProcessSingleItem` now waits for the per-library slot to
free up (polling every 5s, bounded to a 2-hour deadline) instead of
logging a fake "queuing for next cycle" and returning early.

- ✅ Improved error handling throughout
- Better error messages for TMDb API failures
- Clear indication of authentication vs other errors
- Verbose mode shows why items are skipped
## [1.2.0] - 2026-04-10

### Added

#### Plex Webhook Support
- Webhook listener for real-time processing (WEBHOOK_ENABLED, WEBHOOK_PORT)
- Handles library.new and library.on.deck events
- Configurable debounce window (WEBHOOK_DEBOUNCE, default 30s)
- Prevents concurrent processing of the same library
- Health check endpoint at /health
- Runs alongside the existing timer

#### Keyword Prefix
- KEYWORD_PREFIX env var to prepend text to keywords (e.g. "- ")
- Useful when UPDATE_FIELD=genre to separate TMDb keywords from real genres

#### Batch Processing
- BATCH_SIZE (default 100) and BATCH_DELAY (default 10s) env vars
- Prevents API flooding on large libraries (4000+ items)
- ITEM_DELAY (default 500ms) controls per-item pacing

#### Version Tracking
- Version constant in internal/version/version.go
- Logs version on startup

### Changed
- Removed all emoji from log output; replaced with bracketed tags
- Extracted Clients struct for processor initialization
- Added keyword cache by TMDb ID to avoid redundant API calls
- Eliminated redundant Plex API call after keyword sync for export

- Modified `NewProcessor` to accept optional Radarr/Sonarr clients and return error
- Enhanced TMDb ID detection to show source (Plex metadata, Radarr, Sonarr, or file path)
- Processor initialization now includes persistent storage setup
- Main application now tests all API connections on startup
## 2025-07-05

### Documentation
### Added

- ✅ Updated README.md with comprehensive documentation
- Added Radarr/Sonarr Integration section with benefits and configuration
- Added Verbose Logging section with examples
- Updated environment variables documentation
- Added persistent storage information
- Updated docker-compose examples
#### Radarr/Sonarr Integration
- Radarr API client (internal/radarr/) -- movie lookup by title, year, TMDb ID, IMDb ID, file path
- Sonarr API client (internal/sonarr/) -- series lookup by title, year, TMDb ID, TVDb ID, IMDb ID, file path
- USE_RADARR, USE_SONARR, RADARR_URL, RADARR_API_KEY, SONARR_URL, SONARR_API_KEY env vars
- TMDb ID extraction chain: Plex metadata -> Radarr/Sonarr -> file path regex
- Connection testing on startup for all enabled services

- ✅ Created detailed CHANGELOG.md
- Comprehensive list of all changes
- Organized by feature area
- Technical implementation details
#### Verbose Logging
- VERBOSE_LOGGING env var (default false)
- Shows TMDb ID lookup source, Plex GUIDs, matching attempts
- Progress percentage for libraries over 100 items

#### Keyword Normalization
#### Persistent Storage
- JSON file storage for processed items (DATA_DIR env var)
- Tracks rating key, TMDb ID, update field, last processed time
- Skips already-processed items unless FORCE_UPDATE=true
- Runs in ephemeral mode when DATA_DIR is not set

- ✅ Added intelligent keyword normalization feature
- Automatically normalizes TMDb keywords for consistent formatting
- Pattern-based recognition for dynamic handling without hardcoding
- Smart title casing with proper article and preposition handling
- Automatic duplicate removal after normalization

- ✅ Pattern Recognition Features
- **Critical Replacements**: Known abbreviations (sci-fi → Sci-Fi, romcom → Romantic Comedy)
- **Acronym Detection**: Automatically uppercases known acronyms (FBI, CIA, DEA, etc.)
- **Agency Patterns**: Detects agency roles (dea agent → DEA Agent)
- **Parenthetical Acronyms**: Handles acronyms in parentheses (central intelligence agency (cia) → Central Intelligence Agency (CIA))
- **Century Patterns**: Properly formats centuries (5th century bc → 5th Century BC)
- **City/State Patterns**: Handles location formatting (san francisco, california → San Francisco, California)
- **Relationship Patterns**: Adds "Relationship" where appropriate (father daughter → Father Daughter Relationship)
- **Credit Stinger Terms**: Expands compound terms (duringcreditsstinger → During Credits Stinger)

- ✅ Added comprehensive test suite
- 90+ test cases covering various normalization scenarios
- Tests for edge cases, mixed case preservation, and pattern matching
- Ensures consistent behavior across different keyword types

- ✅ Smart duplicate cleaning functionality
- Automatically removes old unnormalized keywords when adding normalized versions
- Preserves manually set keywords in Plex
- Prevents accumulation of duplicate keywords (e.g., both "sci-fi" and "Sci-Fi")
- Shows cleaning activity in verbose logging mode
#### Keyword Normalization
- Pattern-based normalization: sci-fi -> Sci-Fi, romcom -> Romantic Comedy
- Acronym detection (FBI, CIA, DEA, etc.)
- Century formatting (5th century bc -> 5th Century BC)
- City/state, relationship, and credit stinger patterns
- 90+ test cases
- Duplicate cleaning: removes old unnormalized keywords when normalized versions are added

#### Force Update Mode
- FORCE_UPDATE env var (default false)
- Reprocesses all items regardless of storage state

- ✅ Added force update functionality
- New `FORCE_UPDATE` environment variable (default: false)
- Reprocesses all items regardless of previous processing status
- Useful for applying keyword normalization to existing libraries
- Shows clear indication when force update mode is active
- Bypasses both storage checks and "already has keywords" logic
#### Export Functionality
- EXPORT_LABELS, EXPORT_LOCATION, EXPORT_MODE (txt/json) env vars
- Generates file lists per label per library
- JSON mode outputs a single structured export.json
- TXT mode creates per-library subdirectories with summary.txt

### Changed

- Modified `NewProcessor` to accept optional Radarr/Sonarr clients and return error
- Enhanced TMDb ID detection to show source (Plex metadata, Radarr, Sonarr, or file path)
- Processor initialization now includes persistent storage setup
- Main application now tests all API connections on startup
- TMDb client now normalizes all keywords before returning them
- Updated keyword display to show normalization in verbose mode
- Enhanced keyword synchronization with smart duplicate cleaning
- Force update mode bypasses all previous processing checks

### Documentation

- ✅ Updated README.md with comprehensive documentation
- Added Radarr/Sonarr Integration section with benefits and configuration
- Added Verbose Logging section with examples
- Added Keyword Normalization section with pattern examples
- Added Force Update Mode section with use cases and examples
- Added Smart Duplicate Cleaning documentation
- Updated environment variables documentation
- Added persistent storage information
- Updated docker-compose examples

- ✅ Created detailed CHANGELOG.md
- Comprehensive list of all changes
- Organized by feature area
- Technical implementation details

### Technical Details
- Radarr/Sonarr clients use API v3 endpoints
- Implemented robust error handling and fallback mechanisms
- No breaking changes - Radarr/Sonarr integration is fully optional
- Maintains backward compatibility with existing file path matching
- Verbose logging provides detailed insights without affecting normal operation
- Keyword normalization uses regex patterns for scalability
- All features are designed to be non-breaking and backward compatible
- NewProcessor accepts optional Radarr/Sonarr clients and returns error
- TMDb client normalizes keywords before returning them
- Removal delay reduced from 500ms to configurable ITEM_DELAY

### Technical Notes
- Radarr/Sonarr use API v3
- All new features are optional and backward compatible
- No breaking changes to existing configuration
Loading