IHS-193 Add support for file upload/download for CoreFileObject #792

Merged
gmazoyer merged 21 commits into infrahub-develop from gma-20260202-ihs193 on Feb 6, 2026

Conversation

gmazoyer (Contributor) commented Feb 2, 2026

This change adds support for creating, updating, and retrieving nodes whose schemas implement CoreFileObject.

The proposed user-facing API follows these key points:

  • node.upload_from_path(path) selects a file on disk for upload (streamed during upload)
  • node.upload_from_bytes(content, name) sets in-memory content for upload (accepts bytes or a BinaryIO)
  • node.download_file(dest) downloads a file into memory or streams it to disk

Being able to stream a file to or from disk is important for supporting large files without loading them entirely into memory (which would be problematic for files larger than 1 GB).

The names upload_from_path and upload_from_bytes were chosen to prevent a collision with a potential attribute or relationship called file in the schema. For the same reason, the file GraphQL parameter sits outside the data parameter instead of inside it.

Here we introduce a couple of components to nudge the code toward SOLID (it's not much for now, but it's honest work):

  • FileHandler / FileHandlerSync: dedicated classes for file I/O operations
  • MultipartBuilder: builds payloads following the GraphQL multipart request spec

On its own this won't make the code SOLID, but it won't add to the burden for now either.
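
For context, a payload following the GraphQL multipart request spec bundles the operation, a variable map, and the file parts into a single multipart/form-data POST. Below is a minimal, hypothetical sketch of such a payload built by hand with httpx; the mutation text, URL, and file path are placeholders, not the SDK's actual internals, which go through MultipartBuilder instead.

import json

import httpx

# Placeholder mutation: the "file" variable stays null in the JSON body and is
# filled in from the multipart file part identified by the "map" field.
operations = json.dumps(
    {
        "query": "mutation ($data: JSON!, $file: Upload) { placeholder }",
        "variables": {"data": {"name": "contract.pdf"}, "file": None},
    }
)
# "map" ties each multipart file part to the null placeholder it replaces.
file_map = json.dumps({"0": ["variables.file"]})

with open("/tmp/contract.pdf", "rb") as fh:
    response = httpx.post(
        "https://infrahub.example.com/graphql",  # placeholder URL
        data={"operations": operations, "map": file_map},
        files={"0": ("contract.pdf", fh, "application/pdf")},
    )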

So, for a user who has already loaded a schema, using the SDK will look like this:

Upload a file when creating a node

from pathlib import Path
from infrahub_sdk import InfrahubClientSync

client = InfrahubClientSync()

contract = client.create(
    kind="NetworkCircuitContract", contract_start="2026-01-01", contract_end="2026-12-31"
)

# Option 1: Select file from disk (will be streamed during upload)
contract.upload_from_path(path=Path("/tmp/contract.pdf"))

# Option 2: Upload from memory (for small files or dynamically generated content)
contract.upload_from_bytes(content=b"file content", name="contract.pdf")

# Save as usual
contract.save()

Download a file from a node

from pathlib import Path

contract = client.get(kind="NetworkCircuitContract", id="abc123")

# Download to memory (suitable for small files)
content = contract.download_file()

# Or stream directly to disk (suitable for large files)
bytes_written = contract.download_file(dest=Path("/tmp/downloaded.pdf"))

Summary by CodeRabbit

  • New Features

    • GraphQL multipart file upload support and node-level file attach/clear APIs.
    • File download for file-like nodes (in-memory or streamed to disk).
    • Streaming GET responses and improved proxy/config handling for network requests.
  • Tests

    • Comprehensive unit tests for multipart uploads, file preparation, downloads/streaming, and node file workflows.

coderabbitai bot commented Feb 2, 2026

Walkthrough

Adds end-to-end file upload/download support to the SDK. New GraphQL multipart helpers (MultipartBuilder, multipart POST/request builders) and extended GraphQL types support file uploads. Introduces PreparedFile, FileHandler/FileHandlerSync for streaming uploads and downloads. InfrahubClient/InfrahubClientSync gain multipart upload execution, proxy-building, and streaming GET helpers. Node classes gain file state and APIs (is_file_object, upload_from_path, upload_from_bytes, clear_file, download_file) and validation. Protocol types include CoreFileObject. Extensive unit tests and pylint exceptions were added.

🚥 Pre-merge checks: 3 passed
  • Description Check: ✅ Passed. Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and concisely describes the main change, adding file upload/download support for CoreFileObject nodes, which is the primary objective of the changeset.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 81.48%, which is sufficient; the required threshold is 80.00%.


cloudflare-workers-and-pages bot commented Feb 2, 2026

Deploying infrahub-sdk-python with Cloudflare Pages

Latest commit: 66c9871
Status: ✅  Deploy successful!
Preview URL: https://6650b7f0.infrahub-sdk-python.pages.dev
Branch Preview URL: https://gma-20260202-ihs193.infrahub-sdk-python.pages.dev


codecov bot commented Feb 2, 2026

Codecov Report

❌ Patch coverage is 83.50515% with 64 lines in your changes missing coverage. Please review.

Files with missing lines        Patch %   Lines
infrahub_sdk/client.py          75.00%    22 Missing and 8 partials ⚠️
infrahub_sdk/file_handler.py    81.69%    20 Missing and 6 partials ⚠️
infrahub_sdk/node/node.py       90.47%    4 Missing and 4 partials ⚠️
@@                 Coverage Diff                  @@
##           infrahub-develop     #792      +/-   ##
====================================================
+ Coverage             80.36%   80.46%   +0.09%     
====================================================
  Files                   115      117       +2     
  Lines                  9865    10222     +357     
  Branches               1504     1546      +42     
====================================================
+ Hits                   7928     8225     +297     
- Misses                 1415     1462      +47     
- Partials                522      535      +13     
Flag                   Coverage Δ
integration-tests      40.20% <9.27%> (-1.22%) ⬇️
python-3.10            51.86% <60.05%> (+0.48%) ⬆️
python-3.11            51.84% <60.05%> (+0.48%) ⬆️
python-3.12            51.84% <60.05%> (+0.46%) ⬆️
python-3.13            51.86% <60.05%> (+0.48%) ⬆️
python-3.14            53.54% <62.88%> (+0.51%) ⬆️
python-filler-3.12     24.01% <22.42%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines            Coverage Δ
infrahub_sdk/ctl/branch.py          81.13% <100.00%> (ø)
infrahub_sdk/graphql/__init__.py    100.00% <100.00%> (ø)
infrahub_sdk/graphql/constants.py   100.00% <100.00%> (ø)
infrahub_sdk/graphql/multipart.py   100.00% <100.00%> (ø)
infrahub_sdk/graphql/renderers.py   92.94% <100.00%> (+0.08%) ⬆️
infrahub_sdk/node/constants.py      100.00% <100.00%> (ø)
infrahub_sdk/protocols.py           100.00% <100.00%> (ø)
infrahub_sdk/node/node.py           85.33% <90.47%> (-0.02%) ⬇️
infrahub_sdk/file_handler.py        81.69% <81.69%> (ø)
infrahub_sdk/client.py              72.62% <75.00%> (+0.62%) ⬆️

... and 1 file with indirect coverage changes


gmazoyer marked this pull request as ready for review on February 4, 2026 12:35
gmazoyer requested a review from a team as a code owner on February 4, 2026 12:35
coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@infrahub_sdk/file_handler.py`:
- Around line 98-118: The handler handle_error_response should guard against
non-JSON responses by wrapping each exc.response.json() call in a try/except
(catching json.JSONDecodeError or ValueError) and falling back to
exc.response.text; for 401/403, if json parsing fails treat the body text as a
single message (or empty) when building the AuthenticationError message, and for
404 use the fallback text as the detail when constructing NodeNotFoundError;
preserve the original HTTPStatusError as the cause (raise ... from exc).
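
For illustration, a minimal sketch of the guard described above, assuming httpx and a GraphQL-style {"errors": [...]} response body; the helper name and body shape are hypothetical, not the SDK's actual handler.

import json

import httpx

def extract_error_messages(exc: httpx.HTTPStatusError) -> list[str]:
    """Pull error messages from the response, tolerating non-JSON bodies."""
    try:
        body = exc.response.json()
    except (json.JSONDecodeError, ValueError):
        # Non-JSON body (e.g. an HTML error page): fall back to the raw text.
        text = exc.response.text
        return [text] if text else []
    errors = body.get("errors", []) if isinstance(body, dict) else []
    return [error.get("message", "") for error in errors]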

In `@infrahub_sdk/node/node.py`:
- Around line 1160-1163: The error raised when self._file_object_support is true
and self._file_content is None references the wrong helper method name; update
the ValueError text in the block that checks self._file_object_support /
self._file_content (near usage of self._schema.kind) to instruct callers to use
the correct methods select_file_for_upload() or select_content_for_upload()
instead of set_file().
- Around line 2049-2052: The error message inside the check for
self._file_object_support and self._file_content should reference the actual
synchronous setter method name instead of the non-existent set_file(); update
the message to instruct callers to use the real sync setter (e.g.,
set_file_content()) — modify the ValueError string that currently mentions
set_file() so it references set_file_content() (or the actual sync method used
in this class) and keep the rest of the message intact; target the block that
checks _file_object_support and _file_content and update the message formatting
around _schema.kind accordingly.
🧹 Nitpick comments (1)
infrahub_sdk/client.py (1)

997-1035: Avoid mutating caller-provided variables in file-upload helpers.

Both async and sync paths mutate the incoming dict. Prefer a new dict to avoid side effects.

Suggested refactor
-        variables = variables or {}
-        variables["file"] = None
+        variables = {**(variables or {}), "file": None}
@@
-        variables = variables or {}
-        variables["file"] = None
+        variables = {**(variables or {}), "file": None}

Also applies to: 2055-2093

wvandeun (Contributor) commented Feb 5, 2026

I would prefer the upload_from_path and upload_from_content names. Maybe even upload_from_bytes for the latter?

gmazoyer force-pushed the gma-20260202-ihs193 branch 2 times, most recently from e9e0f9e to 1b304a6 on February 5, 2026 16:28
coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@infrahub_sdk/protocols.py`:
- Around line 110-115: The CoreFileObject class (and the other affected
generated blocks referenced at lines 234-237, 675-680, 799-802) were edited
manually but this file is generated; revert any hand edits to these definitions
and re-run the generation task or update the source schema so the generator
emits the correct definitions (e.g., restore CoreFileObject and the other
generated classes/fields to match the generator output), do not make direct
changes in infrahub_sdk/protocols.py, and commit the regenerated file instead.
🧹 Nitpick comments (1)
infrahub_sdk/node/node.py (1)

224-288: Reuse _validate_file_object_support to avoid duplicated checks.
upload_from_path and upload_from_bytes repeat the same FeatureNotSupportedError logic; reusing the helper keeps error messaging consistent and centralized.

♻️ Suggested refactor
-        if not self._file_object_support:
-            raise FeatureNotSupportedError(
-                f"File upload is not supported for {self._schema.kind}. Only nodes inheriting from CoreFileObject support file uploads."
-            )
+        self._validate_file_object_support(
+            message=(
+                f"File upload is not supported for {self._schema.kind}. "
+                "Only nodes inheriting from CoreFileObject support file uploads."
+            )
+        )
@@
-        if not self._file_object_support:
-            raise FeatureNotSupportedError(
-                f"File upload is not supported for {self._schema.kind}. Only nodes inheriting from CoreFileObject support file uploads."
-            )
+        self._validate_file_object_support(
+            message=(
+                f"File upload is not supported for {self._schema.kind}. "
+                "Only nodes inheriting from CoreFileObject support file uploads."
+            )
+        )

Also applies to: 509-512

It was agreed to use `upload_from_path(path)` and
`upload_from_bytes(content, name)`.

Also update protocols which raised, it seems a valid error.
gmazoyer force-pushed the gma-20260202-ihs193 branch from 1b304a6 to 72047a7 on February 5, 2026 16:31
ogenstad (Contributor) left a comment

It would be good to have some integration tests once things are updated on the Infrahub side.

coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@infrahub_sdk/client.py`:
- Around line 997-1033: Summary: The method _execute_graphql_with_file mutates
the caller's variables dict by assigning variables["file"] = None; make a copy
first to avoid leaking state. Fix: inside _execute_graphql_with_file (and the
other similar block) replace the in-place modification with a shallow copy
(e.g., vars_copy = dict(variables) or variables.copy() after handling None) and
set vars_copy["file"] = None, then use vars_copy for the request; ensure you
handle variables being None before copying. Reference symbols:
_execute_graphql_with_file, variables, "file".
- Around line 1060-1090: The multipart upload retries can send an empty file
because the passed file-like object has been consumed; in the async
_post_multipart method (decorated with handle_relogin) seek the incoming
file_content to the start (file_content.seek(0)) before calling
MultipartBuilder.build_payload so the payload is built from the beginning of the
stream, and apply the identical change to the sync counterpart (the sync
_post_multipart or similarly named method used elsewhere) and any other
occurrences (the other multipart methods around the second block referenced) so
retries re-upload the full file.
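
Roughly, the suggested change amounts to rewinding the stream before each attempt. A small hypothetical sketch (the real method names and signatures in the SDK differ):

from typing import BinaryIO, Callable

def build_payload_rewound(file_content: BinaryIO, build_payload: Callable[[BinaryIO], dict]) -> dict:
    # Rewind so a relogin-triggered retry rebuilds the payload from the start of
    # the stream instead of from an already-consumed (empty) one.
    if file_content.seekable():
        file_content.seek(0)
    return build_payload(file_content)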

In `@infrahub_sdk/ctl/branch.py`:
- Around line 122-123: The two occurrences of "pc.created_by" (used in
proposed_change_table.add_row and format_timestamp) currently use a blanket "#
type: ignore[attr-defined]" without justification; fix this by either updating
the type definitions for the object that pc is (so created_by is declared on
that model) or by replacing the ignore with an explicit, checkable
assertion/cast (e.g., assert isinstance(pc, ExpectedProposedChange) or cast to
the model type) and add a short inline comment explaining why created_by is
guaranteed to exist (e.g., "created_by is populated by the API vX response for
created proposed changes"); update the annotation for pc or add the assert/cast
next to the accesses of pc.created_by so mypy no longer needs the ignore and
future readers see the rationale.

In `@infrahub_sdk/node/node.py`:
- Around line 249-275: The docstring example for upload_from_bytes is unsafe
because it stores a file handle in self._file_content and then closes it (via
the with block) before node.save() runs, causing I/O errors; update the example
and text in upload_from_bytes to show one of two correct patterns: (a) read the
file into bytes and call upload_from_bytes(content=f.read(), name=...) outside a
with, or (b) keep the handle open and call node.save() inside the with block
(e.g., with open(..., "rb") as f: node.upload_from_bytes(content=f, name="...");
node.save()). Mention that this applies to nodes inheriting from CoreFileObject
and reference upload_from_bytes, self._file_content and save() in the docstring.
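
To make the safe patterns concrete, a hedged sketch mirroring the sync example from the PR description (the schema, field values, and file path are illustrative):

from pathlib import Path

from infrahub_sdk import InfrahubClientSync

client = InfrahubClientSync()
node = client.create(
    kind="NetworkCircuitContract", contract_start="2026-01-01", contract_end="2026-12-31"
)

# Pattern (a): read the file into bytes, then save outside any with block.
node.upload_from_bytes(content=Path("/tmp/contract.pdf").read_bytes(), name="contract.pdf")
node.save()

# Pattern (b): keep the handle open and call save() while it is still open.
with open("/tmp/contract.pdf", "rb") as f:
    node.upload_from_bytes(content=f, name="contract.pdf")
    node.save()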

In `@tests/unit/sdk/test_file_object.py`:
- Around line 78-105: The test uses type: ignore[union-attr] on attributes like
node.contract_start and node.contract_end in
test_node_create_with_file_uses_multipart (and the other listed tests) without
justification; replace each bare ignore by either asserting/casting the node to
the correct runtime type or adding a brief justification comment: e.g., use an
explicit runtime assertion (assert isinstance(node, InfrahubNode) or cast via
typing.cast) before accessing .contract_start/.contract_end, or add a one-line
comment explaining why the union attribute is present and safe (e.g., "# safe:
test constructs node with schema that guarantees contract_start attribute");
update the occurrences around InfrahubNode/InfrahubNodeSync instantiation so
mypy no longer needs type: ignore or the ignore is accompanied by the short
justification comment.
🧹 Nitpick comments (3)
infrahub_sdk/node/node.py (3)

282-288: Consider moving async/sync file upload preparation to their respective classes.

_get_file_for_upload is an async def on InfrahubNodeBase, which means InfrahubNodeSync inherits an async method it never uses. Similarly, _get_file_for_upload_sync is inherited by InfrahubNode. This pattern follows from the existing codebase but the guideline says to never mix async/sync inappropriately. Consider moving _get_file_for_upload to InfrahubNode and _get_file_for_upload_sync to InfrahubNodeSync to keep the separation clean, similar to how _process_hierarchical_fields is defined separately on each class.


791-824: Consider adding @overload signatures for better type narrowing.

FileHandler.download already uses @overload to distinguish bytes (no dest) vs int (with dest). Without matching overloads here, callers always receive bytes | int and must manually narrow. This is a minor DX concern.

Example overloads
from typing import overload

@overload
async def download_file(self, dest: None = None) -> bytes: ...
@overload
async def download_file(self, dest: Path) -> int: ...
async def download_file(self, dest: Path | None = None) -> bytes | int:
    ...

1186-1210: Extract duplicated file-upload mutation logic into a helper.

The file-upload-or-regular-mutation block is repeated verbatim in create and update (both async and sync — 4 copies total). Consider extracting a helper like _execute_mutation(query, input_data, tracker, timeout) that encapsulates the "file" branch, try/finally cleanup, clear_file, and the else-branch. This would reduce the surface area for bugs if the logic needs to change later.

Also applies to: 1228-1252
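
A rough sketch of such a helper, with the client calls, attribute names, and signature assumed purely for illustration (they are not the SDK's actual API):

from typing import Any

async def _execute_mutation(
    self, query: str, input_data: dict[str, Any], tracker: str, timeout: int | None
) -> dict[str, Any]:
    if self._file_content is not None:
        prepared = await self._get_file_for_upload()  # assumed async preparation step
        try:
            return await self._client._execute_graphql_with_file(  # assumed signature
                query=query, variables=input_data, file=prepared, tracker=tracker, timeout=timeout
            )
        finally:
            self.clear_file()  # always drop the pending file, even if the mutation fails
    return await self._client.execute_graphql(
        query=query, variables=input_data, tracker=tracker, timeout=timeout
    )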

gmazoyer merged commit 504a618 into infrahub-develop on Feb 6, 2026
21 checks passed
gmazoyer deleted the gma-20260202-ihs193 branch on February 6, 2026 12:23