Merged
26 commits
2d47dda
chore: update release configuration and workflow conditions
disafronov Dec 6, 2025
9733599
chore: sync from template v1.0.2
Dec 6, 2025
e9ab53e
refactor(logging): update DataValidationError handling and logging be…
disafronov Dec 6, 2025
8c88a31
refactor(logging): remove DataValidationError and improve error logging
disafronov Dec 6, 2025
e421871
refactor(logging): simplify error messages in schema validation
disafronov Dec 6, 2025
e573abf
refactor(logging): streamline error message formatting in schema logging
disafronov Dec 6, 2025
720b828
docs(logging): clarify stacklevel usage in SchemaLogger
disafronov Dec 6, 2025
80b70c4
refactor(logging): enhance caller information retrieval in SchemaLogger
disafronov Dec 6, 2025
a2d53b4
chore(release): 0.1.2-rc.1
semantic-release-bot Dec 6, 2025
dc6d966
refactor(logging): update SchemaLogger error handling and remove Sche…
disafronov Dec 6, 2025
b663658
refactor(logging): remove deprecated SchemaValidationError and update…
disafronov Dec 6, 2025
19f47d5
chore(release): 0.1.2-rc.2
semantic-release-bot Dec 6, 2025
d68b055
docs(logging): update README and tests for valid empty schema handling
disafronov Dec 6, 2025
13107d1
refactor(logging): improve schema validation handling in SchemaLogger
disafronov Dec 6, 2025
064c043
docs(logging): clarify stack handling in SchemaLogger comments
disafronov Dec 6, 2025
44c67d3
docs(logging): refine comments on caller information retrieval in Sch…
disafronov Dec 6, 2025
958db09
docs(logging): update comment for cache invalidation in schema loader
disafronov Dec 6, 2025
55f9dba
refactor(logging): update import statements in schema applier and loader
disafronov Dec 6, 2025
13790f3
refactor(tests): update import statements for _write_schema in test f…
disafronov Dec 6, 2025
ec48280
docs(logging): enhance comment for excessive nesting depth check in s…
disafronov Dec 6, 2025
421afcc
refactor(logging): enhance caller information retrieval in SchemaLogger
disafronov Dec 6, 2025
3687958
docs(README): clarify definitions of inner and leaf nodes in schema
disafronov Dec 6, 2025
0832d27
docs(README): update error message format for schema validation
disafronov Dec 6, 2025
25453fb
refactor(logging): replace Lock with RLock for path cache synchroniza…
disafronov Dec 6, 2025
83b4c53
chore(release): 0.1.2-rc.3
semantic-release-bot Dec 6, 2025
1cffebc
chore(workflow): add release branch to lint and test workflow
disafronov Dec 6, 2025
1 change: 1 addition & 0 deletions .github/workflows/lint_and_test.yaml
@@ -5,6 +5,7 @@ name: Lint and test
pull_request:
branches:
- main
- release

jobs:
lint_and_test:
1 change: 1 addition & 0 deletions .github/workflows/release-stable.yaml
@@ -9,6 +9,7 @@ name: Release (stable)
jobs:
release:
name: "Release"
if: ${{ !startsWith(github.event.head_commit.message, 'chore(release):') }}
runs-on: ubuntu-latest
permissions:
contents: write
1 change: 1 addition & 0 deletions .github/workflows/release.yaml
@@ -9,6 +9,7 @@ name: Release
jobs:
release:
name: "Release"
if: ${{ !startsWith(github.event.head_commit.message, 'chore(release):') }}
runs-on: ubuntu-latest
permissions:
contents: write
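
The `if:` guard added to both release workflows can be read as the predicate below. This is a Python paraphrase for illustration only, not code from the repository, and `should_run_release` is a hypothetical name. It assumes two things about GitHub Actions expressions: that `startsWith` ignores case, and that a missing `head_commit` message is treated as an empty string.

```python
from typing import Optional

# Prefix used by the semantic-release bot commits in this PR.
RELEASE_PREFIX = "chore(release):"


def should_run_release(head_commit_message: Optional[str]) -> bool:
    """Return True when the release job should run for this push.

    Mirrors: !startsWith(github.event.head_commit.message, 'chore(release):')
    Assumption: the comparison is case-insensitive and None behaves like "".
    """
    message = (head_commit_message or "").lower()
    return not message.startswith(RELEASE_PREFIX)
```

The effect is that the commit pushed by the release bot itself does not trigger another release run.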
2 changes: 1 addition & 1 deletion .releaserc.json
@@ -29,7 +29,7 @@
}],
["@semantic-release/git", {
"assets": ["pyproject.toml", "uv.lock"],
"message": "chore(release): ${nextRelease.version}\n\n${nextRelease.notes}"
"message": "chore(release): ${nextRelease.version}\n\n${nextRelease.notes}\n\nSigned-off-by: Release Bot <noreply@github.com>"
}],
["@semantic-release/github", {}]
]
143 changes: 65 additions & 78 deletions README.md
@@ -24,7 +24,7 @@ to appear in logs and which types they must have.

- Application code must only send `extra` fields that are described in the schema
and match the declared Python types. Any deviation (unknown fields, wrong types,
`None` values, disallowed list elements) is reported via `DataValidationError`
`None` values, disallowed list elements) is logged as an ERROR message
*after* the log record has been emitted.
- The schema file (`logging_objects_with_schema.json`) is a shared, versioned
artifact that defines the shape of structured log payloads for all downstream
@@ -39,9 +39,9 @@ to appear in logs and which types they must have.
- Any mismatch between runtime values and the declared types is also treated as
a data error.
- All validation problems (unknown fields, wrong types, disallowed list
elements, `None` values, etc.) are aggregated into a single
`DataValidationError` that is raised **after** the log record has been
emitted.
elements, `None` values, etc.) are aggregated and logged as a single
ERROR message **after** the log record has been emitted, ensuring 100%
compatibility with standard logger behavior (no exceptions are raised).
- The schema is treated as the only source of truth for which `extra` fields
are allowed to appear in logs. Any deviation from the schema is considered a
contract violation between the producer of `extra` and the schema author.
@@ -132,34 +132,19 @@ logger.info(
### Error handling example

```python
from logging_objects_with_schema import SchemaLogger, DataValidationError, SchemaValidationError

try:
# This will raise SchemaValidationError if schema file is missing or invalid
logging.setLoggerClass(SchemaLogger)
logger = logging.getLogger("service")
except SchemaValidationError as e:
print(f"Schema validation failed: {e}")
for problem in e.problems:
print(f" - {problem.message}")
# Decide whether to abort or continue with default logger
raise
except (OSError, ValueError, RuntimeError) as e:
# System-level errors (e.g., inaccessible working directory when os.getcwd()
# fails, lock failures, path resolution issues) are raised directly, not
# wrapped in SchemaValidationError. Note: OSError when reading the schema
# file is converted to SchemaValidationError and handled above.
print(f"System error during logger initialization: {e}")
raise

# When logging, handle DataValidationError
try:
logger.info("processing", extra={"user_id": "not-an-int"}) # Wrong type
except DataValidationError as e:
print(f"Data validation failed: {e}")
for problem in e.problems:
print(f" - {problem.message}")
# Note: the valid part of the log was already emitted before the exception
from logging_objects_with_schema import SchemaLogger

# SchemaLogger is a drop-in replacement - no exception handling needed.
# If the schema has problems, the application will be terminated after
# logging schema problems to stderr.
logging.setLoggerClass(SchemaLogger)
logger = logging.getLogger("service")

# When logging with invalid data, validation errors are automatically
# logged as ERROR messages. No exception handling is needed.
logger.info("processing", extra={"user_id": "not-an-int"}) # Wrong type
# The valid part of the log is emitted, and validation errors are logged
# as ERROR messages with details about the problems.
```

### API compatibility with ``logging.Logger``
@@ -181,12 +166,11 @@ except DataValidationError as e:
- Schema tree depth is limited to a maximum nesting level (currently 100). Any
branch that exceeds this depth is ignored and reported as a schema problem.

If the schema file is not found, or cannot be read/parsed/validated, a
`SchemaValidationError("Schema has problems", problems=[...])` is raised when
a `SchemaLogger` instance is created. In this case the logger instance is not
created at all. The `problems` list contains a detailed description of the
issue, including the path where the file was expected (based on the current
working directory at schema discovery time).
If the schema file is not found, or cannot be read/parsed/validated, the
logger instance is not created, schema problems are logged to stderr, and
the application is terminated via `os._exit(1)`. The error message contains
a detailed description of all issues, including the path where the file was
expected (based on the current working directory at schema discovery time).

An example schema:

@@ -207,8 +191,21 @@ An example schema:
}
```

- An inner node is an object without `type` and `source`.
- A leaf node is an object with both `type` and `source`.
An example of a valid empty schema (no leaves, no problems):

```json
{}
```

An empty schema is valid and does not cause errors. When using an empty schema,
no `extra` fields will be included in log records, and any attempt to log with
`extra` fields will result in validation errors being logged as ERROR messages.

- An inner node is an object without `type` and `source` (neither field is present).
- A leaf node is an object that has at least one of `type` or `source` fields.
However, a valid leaf node must have both `type` and `source` fields. If a
leaf node is missing either field or has an empty value, it will be reported
as a schema problem during validation.
- `type` is one of the allowed Python type names: `"str"`, `"int"`, `"float"`,
`"bool"`, or `"list"`.
- For `"list"` type, an additional `item_type` field is required to declare
@@ -270,7 +267,8 @@ message similar to:

> Field 'tags' is a list but contains elements with types [...]; expected all elements to be of type str

and a `DataValidationError` is raised **after** the log record has been emitted.
and an ERROR message is logged **after** the log record has been emitted with
the format: `"Log data does not match schema: {problem1}; {problem2}; ..."`.

### Multiple leaves with the same source

@@ -284,7 +282,7 @@ When a `source` is referenced by multiple leaves:
- The value is written only to those leaf locations where the runtime type
matches the expected type.
- For leaf locations where the type does not match, a `DataProblem` is added
to the `DataValidationError` that is raised after logging.
to the ERROR message that is logged after logging.

Example schema with duplicate source usage:

@@ -320,30 +318,23 @@ consistent type expectations when reusing a `source` field.
- If there are **any** problems with the schema (missing file, broken JSON,
invalid `type` values, conflicting root fields that match system logging
fields, malformed structure, etc.):
- a `SchemaValidationError("Schema has problems", problems=[...])` is raised;
- the logger instance is not created.
- the logger instance is not created (schema validation happens before
the logger is initialized);
- schema problems are logged to stderr in the format:
`"Schema has problems: {problem1}; {problem2}; ..."`;
- the application is terminated via `os._exit(1)`.
- If there are no problems:
- the schema is compiled into a `CompiledSchema`;
- the logger is created and starts using this schema to validate `extra`
fields.
- A valid empty schema (e.g., `{}` or a schema with only inner nodes and no
leaves) is treated as valid and does not cause errors. The logger is created
successfully, but no `extra` fields will be included in log records.

The application decides whether `SchemaValidationError` is a fatal error that
should abort startup, or whether it should be logged and ignored.

**Note**: In rare cases, system-level errors may be raised directly instead of
being wrapped in `SchemaValidationError`. Specifically:

- `OSError` when the current working directory is inaccessible or deleted (e.g.,
when `os.getcwd()` fails) is propagated as-is to indicate environmental issues.
However, `OSError` that occurs when reading the schema file (e.g., permission
denied, I/O errors) is converted to `SchemaValidationError` and reported as a
schema problem.
- `ValueError` when path resolution fails due to invalid characters or malformed
paths during schema file discovery is propagated as-is.
- `RuntimeError` when thread lock acquisition fails is propagated as-is.

Applications should handle these exceptions separately if they need to
distinguish between schema validation failures and system-level errors.
**Note**: System-level errors (OSError, ValueError, RuntimeError) that occur
during schema compilation are converted to `SchemaProblem` instances and
handled the same way as schema validation problems - the application is
terminated after logging the error to stderr.
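
The fail-fast behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `format_schema_problems` and `fail_fast` are hypothetical names, and the `stream` and `exit_func` parameters exist only so the sketch can be exercised without killing the process. Only the message format and the use of `os._exit(1)` come from the README text.

```python
import os
import sys


def format_schema_problems(problems):
    """Join problem messages using the format quoted in the README."""
    return "Schema has problems: " + "; ".join(problems)


def fail_fast(problems, stream=sys.stderr, exit_func=os._exit):
    """Report schema problems and terminate; do nothing if there are none.

    Hypothetical helper: the real loader may be structured differently.
    """
    if not problems:
        return
    stream.write(format_schema_problems(problems) + "\n")
    # os._exit() skips cleanup handlers and does not flush stdio buffers,
    # so the message is flushed explicitly before exiting.
    stream.flush()
    exit_func(1)
```

Because `os._exit` bypasses `atexit` handlers and `finally` blocks, callers cannot intercept this termination the way they could catch the removed `SchemaValidationError`.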

## Schema caching and thread safety

@@ -395,8 +386,9 @@ distinguish between schema validation failures and system-level errors.
**it is simply not included in the final log record**.

In all of these cases a `DataProblem` is recorded for each offending field, and
if at least one problem is present a single `DataValidationError` is raised
**after** the log record has been emitted.
if at least one problem is present, a single ERROR message is logged
**after** the log record has been emitted. The error message format is:
`"Log data does not match schema: {problem1}; {problem2}; ..."`.

- When a `source` is used in multiple leaves (see "Multiple leaves with the same
source" above), the value is validated and written independently for each leaf
@@ -412,18 +404,13 @@ High-level algorithm inside `SchemaLogger`:
2. A new structured payload is built from the schema and the given `extra`.
3. Only this structured payload is passed to the underlying stdlib logger.
4. After logging, if any validation problems were detected, a single
`DataValidationError` is raised with the full `problems` list.

## Exceptions

- **`SchemaValidationError`**:
- Any problem with the schema (missing file, broken JSON, invalid types,
conflicting root fields, malformed structure, etc.).
- Exposes a `problems` list describing each violation.
- **`DataValidationError`**:
- Any problem with a specific `extra` payload during logging:
- the runtime type does not match the expected type;
- a list contains non-primitive elements (only str, int, float, bool allowed);
- the field is redundant and not described in the schema.
- Raised **after** the valid part of the payload has been logged.
- Contains a `problems` list describing the offending fields.
ERROR message is logged with the format:
`"Log data does not match schema: {problem1}; {problem2}; ..."`
(no exception is raised, ensuring 100% compatibility with standard logger behavior).
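
Steps 2 to 4 above can be sketched with the standard library alone. This is a minimal illustration, not the `SchemaLogger` code: `log_checked`, `SCHEMA`, and `MISMATCH_PREFIX` are hypothetical names, the schema is assumed to be a flat mapping of field name to Python type, and the wording of the type-mismatch message is invented for the example. The other two messages and the ERROR prefix are taken from this PR.

```python
import logging

# Hypothetical flat schema: field name -> expected Python type.
SCHEMA = {"user_id": int, "request_id": str}
MISMATCH_PREFIX = "Log data does not match schema: "


def log_checked(logger, level, msg, extra=None):
    """Emit the valid part of `extra`, then report problems as one ERROR."""
    payload, problems = {}, []
    for key, value in (extra or {}).items():
        expected = SCHEMA.get(key)
        if expected is None:
            problems.append(f"Field '{key}' is not defined in schema")
        elif value is None:
            problems.append(f"Field '{key}' is None")
        elif type(value) is not expected:
            # Illustrative wording; the library's actual text may differ.
            problems.append(
                f"Field '{key}' has type {type(value).__name__}, "
                f"expected {expected.__name__}"
            )
        else:
            payload[key] = value

    # Step 3: only the validated payload reaches the stdlib logger.
    logger.log(level, msg, extra=payload)

    # Step 4: a single ERROR record after the original record; no exception.
    if problems:
        logger.error(MISMATCH_PREFIX + "; ".join(problems))
    return problems
```

The caller's control flow is never interrupted, which is the compatibility property this PR is after: a bad `extra` costs one additional ERROR record instead of an exception.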

## Error handling

- Schema problems are handled internally: errors are logged to stderr and
the application is terminated via `os._exit(1)`.
- No exceptions are raised by `SchemaLogger` during initialization, making
it a true drop-in replacement for `logging.Logger`.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "uv_build"

[project]
name = "logging-objects-with-schema"
version = "0.1.1"
version = "0.1.2rc3"
description = "Proxy logging wrapper that validates extra fields against a JSON schema."
readme = "README.md"
requires-python = ">=3.10"
3 changes: 0 additions & 3 deletions src/logging_objects_with_schema/__init__.py
@@ -7,11 +7,8 @@

from __future__ import annotations

from .errors import DataValidationError, SchemaValidationError
from .schema_logger import SchemaLogger

__all__ = [
"SchemaLogger",
"SchemaValidationError",
"DataValidationError",
]
57 changes: 0 additions & 57 deletions src/logging_objects_with_schema/errors.py
@@ -16,35 +16,6 @@ class SchemaProblem:
message: str


class SchemaValidationError(Exception):
"""Raised when there are problems with the JSON schema definition.

This exception is raised during SchemaLogger initialization when the schema
file cannot be loaded, parsed, or validated. The logger instance will not
be created if this exception is raised.

The human-readable summary is stored in the exception message, while
detailed information about each violation is available in the ``problems``
attribute.

Attributes:
problems: List of SchemaProblem instances describing each validation issue.

Example:
>>> try:
... logger = SchemaLogger("my_logger")
... except SchemaValidationError as e:
... for problem in e.problems:
... print(f"Schema error: {problem.message}")
"""

def __init__(
self, message: str, problems: list[SchemaProblem] | None = None
) -> None:
super().__init__(message)
self.problems: list[SchemaProblem] = problems or []


@dataclass
class DataProblem:
"""Describes a single problem encountered while validating log data.
@@ -54,31 +25,3 @@ class DataProblem:
"""

message: str


class DataValidationError(Exception):
"""Raised when log record data does not satisfy the configured schema.

This exception is raised *after* the valid part of the log record
has already been formatted and sent to the underlying handler. This means
that even if validation fails, the valid fields will still appear in the log.

The exception message provides a summary description of the validation
failure, while detailed information about individual problems is exposed
via the ``problems`` attribute.

Attributes:
problems: List of DataProblem instances describing each validation issue.

Example:
>>> try:
... logger.info("processing", extra={"user_id": "not-an-int"})
... except DataValidationError as e:
... for problem in e.problems:
... print(f"Validation error: {problem.message}")
... # Note: valid fields were already logged before this exception
"""

def __init__(self, message: str, problems: list[DataProblem] | None = None) -> None:
super().__init__(message)
self.problems: list[DataProblem] = problems or []
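
With `DataValidationError` removed, code that used to catch it (tests in particular) needs another signal. One possible approach, sketched below, is a plain `logging.Handler` that watches for the ERROR prefix introduced in this PR. `SchemaMismatchProbe` is a hypothetical helper, not part of the package, and the sketch assumes the ERROR record is routed through handlers reachable from the same logger, which this diff does not confirm.

```python
import logging


class SchemaMismatchProbe(logging.Handler):
    """Collect the detail text of schema-mismatch ERROR records."""

    PREFIX = "Log data does not match schema: "

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.details = []

    def emit(self, record):
        message = record.getMessage()
        if message.startswith(self.PREFIX):
            # Keep the detail text whole: individual problem messages may
            # themselves contain "; ", so splitting on it is not reliable.
            self.details.append(message[len(self.PREFIX):])
```

A test can attach the probe with `logger.addHandler(probe)` and assert on `probe.details` where it previously asserted on `e.problems`.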
14 changes: 7 additions & 7 deletions src/logging_objects_with_schema/schema_applier.py
@@ -7,7 +7,8 @@
from __future__ import annotations

from collections import defaultdict
from typing import Any, Mapping, MutableMapping
from collections.abc import Mapping, MutableMapping
from typing import Any

from .errors import DataProblem
from .schema_loader import CompiledSchema, SchemaLeaf
@@ -34,8 +35,7 @@ def _validate_list_value(
"""
if item_expected_type is None:
return DataProblem(
f"Field '{source}' is declared as list in schema but "
f"has no item type configured",
f"Field '{source}' is list but has no item type configured",
)

if len(value) == 0:
@@ -195,8 +195,8 @@ def _apply_schema_internal(
where the type matches; mismatched locations produce ``DataProblem``
entries, but do not affect successful locations.

The function itself does not raise ``DataValidationError``; it only
accumulates :class:`DataProblem` instances for the caller to handle.
The function itself does not raise exceptions; it only accumulates
:class:`DataProblem` instances for the caller to handle.

Note:
This function is used internally by :class:`SchemaLogger` and is not
Expand Down Expand Up @@ -227,7 +227,7 @@ def _apply_schema_internal(
if value is None:
problems.append(
DataProblem(
f"Field '{source}' is None, but None values are not allowed",
f"Field '{source}' is None",
),
)
continue
@@ -246,7 +246,7 @@ def _apply_schema_internal(
for key in redundant_keys:
problems.append(
DataProblem(
f"Field '{key}' is not defined in schema and will be ignored",
f"Field '{key}' is not defined in schema",
),
)
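
The shortened messages in this file follow a return-a-problem-instead-of-raising pattern. A rough, self-contained sketch of list validation in that style is shown below. It is not the body of `_validate_list_value`, most of which is outside the visible hunks: `validate_list_value` is a hypothetical stand-in, `DataProblem` is a local mirror of the dataclass in `errors.py`, and three details are assumptions made for the example (empty lists pass, element types are compared exactly so `bool` does not count as `int`, and the offending type names are rendered as a sorted list).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataProblem:
    """Local mirror of the DataProblem dataclass in errors.py."""

    message: str


def validate_list_value(
    source: str, value: list, item_type: Optional[type]
) -> Optional[DataProblem]:
    """Return a DataProblem for an invalid list, or None if it is acceptable."""
    if item_type is None:
        return DataProblem(
            f"Field '{source}' is list but has no item type configured"
        )
    if len(value) == 0:
        return None  # Assumption: an empty list is accepted as-is.

    # Assumption: exact type match, so True is not accepted where int is expected.
    found = sorted(
        {type(item).__name__ for item in value if type(item) is not item_type}
    )
    if found:
        return DataProblem(
            f"Field '{source}' is a list but contains elements with types "
            f"{found}; expected all elements to be of type {item_type.__name__}"
        )
    return None
```

Returning the problem rather than raising lets the caller keep processing the remaining fields and aggregate everything into the single ERROR message.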
