Skip to content

Conversation

@sjperkins
Copy link
Contributor

@sjperkins sjperkins commented Jan 22, 2026

…and engage the Batch before submission to the executor
@sjperkins sjperkins force-pushed the add-batch-to-virtual-chunked-ReadParameters branch from 0553ba4 to 52c0f62 Compare January 22, 2026 16:16
@sjperkins sjperkins marked this pull request as ready for review January 22, 2026 16:19
@sjperkins
Copy link
Contributor Author

Re: the failing github actions in my fork: https://github.com/sjperkins/tensorstore/actions

It looks as if the linux arm64 run is failing to obtain an arm64 runner.

image

arm-ubuntu-arm-22.04-8core may be only available on paid plans.

@sjperkins
Copy link
Contributor Author

Guess I could change the the runner type to ubuntu-22.04-arm temporarily to test the PR

@laramiel
Copy link
Collaborator

There need to be additional tests in the C++ code for virtual_chunked/virtual_chunked_test.cc

I think that a few of the Batch parameters should instead be Batch::View.

@jbms
Copy link
Collaborator

jbms commented Jan 30, 2026

There need to be additional tests in the C++ code for virtual_chunked/virtual_chunked_test.cc

I think that a few of the Batch parameters should instead be Batch::View.

Agreed on adding C++ tests. I don't know that I see anywhere that should use Batch::View --- since Batch::View behaves like string_view we can only use it where we are sure the batch will remain valid long enough.

@laramiel
Copy link
Collaborator

There need to be additional tests in the C++ code for virtual_chunked/virtual_chunked_test.cc
I think that a few of the Batch parameters should instead be Batch::View.

Agreed on adding C++ tests. I don't know that I see anywhere that should use Batch::View --- since Batch::View behaves like string_view we can only use it where we are sure the batch will remain valid long enough.

I was thinking that Batch batch() const { return batch_; } could possible return Batch::View.

@sjperkins
Copy link
Contributor Author

I was thinking that Batch batch() const { return batch_; } could possible return Batch::View.

This makes sense: I've implemented this change. After some thought I couldn't think of other cases, but I may have missed something.

@sjperkins
Copy link
Contributor Author

I've added 3 C++ test cases testing passing of the batch to tensorstore::Read

  1. No batch argument passed
  2. Batch::no_batch argument passed
  3. Passing of an actual Batch

#include "tensorstore/transaction.h"
#include "tensorstore/util/executor.h"
#include "tensorstore/util/option.h"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some pre-existing missing includes that would be nice to add.

#include

#include "absl/time/time.h"
#include "tensorstore/driver/driver.h"
#include "tensorstore/index.h"
#include "tensorstore/open_mode.h"
#include "tensorstore/rank.h"
#include "tensorstore/schema.h"
#include "tensorstore/strided_layout.h"
#include "tensorstore/util/future.h"
#include "tensorstore/util/result.h"
#include "tensorstore/util/status.h"

Along with the build dependencies for them.

Copy link
Contributor Author

@sjperkins sjperkins Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made those changes:

So generally I've been poor at handling IWYU, mostly because of the context switch of setting up bazel-compile-commands-extractor with clangd each time, but I think I've got a better handle on this now than in the past. The setup process has been recorded here:

Maybe I'll find time to add it to the docs at some point.

Copy link
Contributor Author

@sjperkins sjperkins Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along with the build dependencies for them.

These are the bazel build deps for the virtual_chunked tensorstore_cc_library ?

Edit OK yes, I see some of those missing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added something here 98119c4

but I suspect this is not quite what you're asking for. Can you please clarify?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, my reverse mapping of build deps is somewhat broken right now, so I haven't worried about that; generally the bazel build cleaner tool should fix those if the includes are ~correct (within the same project), and that gets run on import.

Copy link
Contributor Author

@sjperkins sjperkins Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, my reverse mapping of build deps is somewhat broken right now, so I haven't worried about that; generally the bazel build cleaner tool should fix those if the includes are ~correct (within the same project), and that gets run on import.

Still not understanding you I'm afraid. The bazel build cleaner seems to be an internal tool within Google?

If this works with the tensorstore build as is, can you please provide a command? i.e. python bazelisk.py run ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Along with the build dependencies for them"

Yes, I meant in the BUILD file to add the appropriate lines to the deps = [ ... ] variable for the virtual_chunked target appropriate for each of the added include lines.

Internally we have a build cleaner tool, which may be something like bant to add them.

Also, in order to submit I import your PR into our source-of-truth tree, which to applies a reverse mapping, which right now isn't working properly and I haven't bothered to solve that issue since regenerating the deps section is handled by aforementioned tool.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks will try work bant into my workflow in future although AFAICT tensorstore uses the older WORKSPACE style while bant uses MODULE.bazel.

@sjperkins
Copy link
Contributor Author

sjperkins commented Feb 3, 2026

Thanks, I have something similar using std::optional<Batch>, but have been side-tracked trying to find a decent solution to managing #includes (which I think I have now, but I always trip up on the #includes).

I'll take a look at the bug you've highlighted.

@copybara-service copybara-service bot merged commit 1228f7e into google:master Feb 4, 2026
1 check passed
@sjperkins sjperkins deleted the add-batch-to-virtual-chunked-ReadParameters branch February 5, 2026 13:54
@sjperkins
Copy link
Contributor Author

Tested the github action build artefacts after the merge of this PR and it seems to work. Would it be possible to release a new version of tensorstore including these changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support passing batches to virtual_chunked

3 participants