
Conversation

@BrianMichell
Contributor

Supersedes #264

Resolves comments 1 and 2

Implement shim for `open_as_void` driver-level flag
* Begin removing void field shim

* Fully removed void string shim

* Cleanup debug prints

* Remove shimmed validation

* Remove unnecessary comment

* Prefer false over zero for ternary clarity
* Implement a more general and portable example set

* Fix driver cache bug

* Update example for template

* Cleanup example

* Remove testing examples from source
* Use the appropriate fill value for open_as_void structured data

* Cleanup
@laramiel
Collaborator

I'll try to get to this in about a week. Before I look this one over, please double-check that the prior PR works for you. Also look over this one and see if any of the suggestions from the other one apply.

Matches the pattern from zarr v2 driver (PR google#272). When both "field"
and "open_as_void" are specified in the spec, return an error since
these options are mutually exclusive - field selects a specific field
from a structured array, while open_as_void provides raw byte access
to the entire structure.
The zarr3 URL syntax cannot represent field selection or void access
mode. Following the pattern from zarr v2 driver (PR google#272), ToUrl() now
returns an error when either of these options is specified instead of
silently ignoring them.
…trip

Following the pattern from zarr v2 driver (PR google#272), override
GetBoundSpecData in ZarrDataCache to set spec.open_as_void from
ChunkCacheImpl::open_as_void_. This ensures that when you open a
store with open_as_void=true and then call spec(), the resulting
spec correctly has open_as_void=true set.

Without this fix, opening a store with open_as_void=true and then
getting its spec would lose the open_as_void flag, causing incorrect
behavior if the spec is used to re-open the store.
Add test to verify that writing through void access with compression
enabled works correctly. The test:

1. Creates an array with gzip compression
2. Initializes with zeros via typed access
3. Opens as void and writes raw bytes
4. Reads back through void access to verify the write
5. Reads back through typed access to verify byte interpretation

This test exercises the EncodeChunk path for void access with the
codec chain including compression.
Add tests to verify that GetSpecInfo correctly computes rank when
open_as_void=true (mirroring v2 test patterns):

- GetSpecInfoOpenAsVoidWithKnownRank: Verifies full_rank = chunked_rank + 1
- GetSpecInfoOpenAsVoidWithDynamicRank: Verifies dynamic rank handling
- GetSpecInfoOpenAsVoidWithoutDtype: Verifies behavior without dtype
- GetSpecInfoOpenAsVoidRankConsistency: Verifies spec rank matches opened store

Also adds TODO for OpenAsVoidFillValue test - fill_value handling for
void access requires additional implementation (similar to v2's
CreateVoidMetadata which converts fill_value to byte array).
Implement proper fill_value conversion for void access mode:

1. Add is_void_access() virtual method to DataCacheBase to expose
   whether the cache was opened with open_as_void=true.

2. Modify ZarrDriver::GetFillValue to convert fill_value to byte
   array representation when in void access mode. This copies bytes
   from each field's fill_value at their respective offsets, similar
   to v2's CreateVoidMetadata handling.

3. Add OpenAsVoidFillValue test to verify that:
   - Normal store returns the expected scalar fill_value
   - Void store returns fill_value as byte array with correct shape
   - Byte representation matches the original value (little endian)
Fix EncodeChunk to properly handle structured types:

1. For single non-structured field: encode directly (existing behavior)

2. For structured types (multiple fields): combine field arrays into
   a single byte array by copying each field's data at their respective
   byte offsets, then encode the combined byte array.

This matches the pattern in DecodeChunk which extracts fields from a
decoded byte array.

Add OpenAsVoidStructuredType test that:
- Creates an array with structured dtype (uint8 + int16 fields)
- Writes data using field access
- Opens with open_as_void=true
- Verifies rank is original_rank + 1
- Verifies bytes dimension is 3 (1 + 2 bytes)
- Verifies dtype is byte
1. OpenAsVoidStructuredType: Now actually reads and verifies byte content
   - Reads raw bytes through void access
   - Uses proper stride calculation for the returned array
   - Verifies y field bytes at all 4 positions (little-endian int16)
   - x field is 0 (fill value) since we only wrote to y field

2. Add GetSpecInfoOpenAsVoidWithStructuredDtype test
   - Verifies spec rank = chunked_rank + 1 with structured dtype
   - Tests structured dtype with int32 + uint16 fields
   - Matches v2 test coverage
Test that open_as_void correctly detects when the underlying metadata
has been changed to an incompatible dtype. ResolveBounds should fail
with kFailedPrecondition when the stored metadata has a different
bytes_per_outer_element than what was expected.

This matches the v2 test that verifies metadata consistency checking
works properly with void access.
Verifies that void access works correctly with sharded arrays:
- Void access flags propagate through sharded caches
- Reading bytes through sharded void access returns correct data
- Writing bytes through sharded void access round-trips correctly
…y with zarr2

- Remove invalid oneOf constraint that didn't properly express mutual exclusivity
- Update field description to match zarr2 style (document mutual exclusivity)
- Update open_as_void description to document mutual exclusivity with field
- Add oneOf type constraint for field to match zarr2 (string or null)

The actual mutual exclusivity validation is done in code via jb::Initialize.
…nsform

For consistency with GetDomain(), explicitly set implicit_lower_bounds
in GetExternalToInternalTransform when building the void access transform.
Both methods now follow the same pattern of explicitly setting both
implicit_lower_bounds and implicit_upper_bounds.
Add assertion that num_fields == 1 in the void access path of DecodeChunk.
Void access always uses a single synthesized field, so this assertion
helps catch any inconsistency between GetDataCache and DecodeChunk.
Add assertions in EncodeChunk and DecodeChunk to verify that arrays
are C-contiguous before performing direct memcpy operations:

- In EncodeChunk: verify component arrays are C-contiguous
- In DecodeChunk: verify decoded byte arrays are C-contiguous

These assertions validate assumptions about array layouts that the
chunk cache relies on for correct operation. The chunk cache write
path (AsyncWriteArray) allocates C-order arrays, and the codec chain
produces C-contiguous decoded arrays.

Also adds the necessary includes and BUILD dependencies for
IsContiguousLayout and c_order.
Replace raw memcpy loops with CopyArray using strided ArrayViews for
structured type encoding and decoding. This follows the standard
TensorStore pattern (as used in zarr v2 with internal::EncodeArray)
where array copies are done via IterateOverArrays which safely handles
any source/destination strides.

The key insight is creating an ArrayView with strides that represent
the interleaved field positions within the struct layout:
- For a field at byte_offset B within a struct of size S
- The strides are [..., S] instead of [..., field_size]
- This allows CopyArray to correctly interleave/deinterleave fields

This approach:
1. Removes the need for contiguity assertions (CopyArray handles any layout)
2. Is consistent with zarr v2's use of internal::EncodeArray
3. Uses the standard IterateOverArrays iteration pattern

The void access decode path retains its memcpy with assertion because
it's a simple byte reinterpretation where both arrays are known to be
C-contiguous (destination freshly allocated, source from codec chain).
Replace manual stride computation loops with ComputeStrides() from
contiguous_layout.h. This is the standard TensorStore utility for
computing C-order (or Fortran-order) byte strides given a shape
and innermost element stride.

The manual loop:
  Index stride = bytes_per_outer_element;
  for (DimensionIndex i = rank; i-- > 0;) {
    strides[i] = stride;
    stride *= shape[i];
  }

Is exactly equivalent to:
  ComputeStrides(c_order, bytes_per_outer_element, shape, strides);
Replace manual loops with standard library and TensorStore utilities:

1. DimensionSet::UpTo(rank) - Creates a DimensionSet with bits [0, rank)
   set to true. Replaces:
     DimensionSet s(false);
     for (i = 0; i < rank; ++i) s[i] = true;

2. std::fill_n for origins (all zeros) and std::copy_n for shape copy.
   This is more idiomatic and clearer than explicit index loops.

These are standard patterns used throughout TensorStore for similar
operations on dimension sets and shape vectors.
The sub-chunk cache in sharding mode uses a grid from the sharding
codec state, which doesn't know about void access. This caused issues:

1. Shape mismatch: The grid's component shape was [4, 4] but decoded
   arrays had shape [4, 4, 4] (with bytes dimension)

2. Invalid key generation: The grid's chunk_shape affected cell indexing

Fix by:
- Add `grid_has_void_dimension_` flag to track whether the grid includes
  the bytes dimension (false for sub-chunk caches)
- For sub-chunk caches with void access on non-structured types, create
  a modified grid with:
  - Component chunk_shape including bytes dimension [4, 4, 4]
  - Grid chunk_shape unchanged [4, 4] (for cell indexing)
  - Proper chunked_to_cell_dimensions mapping

This enables void access to work correctly with sharding codecs.
The ZarrShardSubChunkCache template had duplicate member variables
(open_as_void_, original_is_structured_, bytes_per_element_) that
were already present in the base class ChunkCacheImpl (ZarrLeafChunkCache).

Access these through ChunkCacheImpl:: prefix instead to follow DRY
principle and maintain consistency with other TensorStore patterns.
Reviewed the code for potential inconsistencies and fixed some bugs.
/// Support for encoding/decoding zarr "dtype" specifications.
/// See: https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html#data-type

#include <optional>
@laramiel (Collaborator), Feb 3, 2026
The includes should look about like this:

#include <optional>
#include <string>
#include <string_view>
#include <vector>

#include <nlohmann/json.hpp>
#include "absl/status/status.h"
#include "tensorstore/data_type.h"
#include "tensorstore/index.h"
#include "tensorstore/internal/json_binding/bindable.h"
#include "tensorstore/json_serialization_options_base.h"
#include "tensorstore/util/endian.h"
#include "tensorstore/util/result.h"

// See the License for the specific language governing permissions and
// limitations under the License.

#include "tensorstore/driver/zarr3/dtype.h"
Collaborator

The includes should look about like this:

#include "tensorstore/driver/zarr3/dtype.h"

#include <stddef.h>
#include <stdint.h>

#include <algorithm>
#include <limits>
#include <optional>
#include <string>
#include <string_view>

#include <nlohmann/json.hpp>
#include "absl/base/optimization.h"
#include "absl/status/status.h"
#include "absl/strings/ascii.h"
#include "absl/strings/numbers.h"
#include "tensorstore/data_type.h"
#include "tensorstore/index.h"
#include "tensorstore/internal/integer_overflow.h"
#include "tensorstore/internal/json/json.h"
#include "tensorstore/internal/json/value_as.h"
#include "tensorstore/internal/json_binding/bindable.h"
#include "tensorstore/internal/json_binding/json_binding.h"
#include "tensorstore/util/extents.h"
#include "tensorstore/util/quote_string.h"
#include "tensorstore/util/result.h"
#include "tensorstore/util/span.h"
#include "tensorstore/util/status.h"
#include "tensorstore/util/str_cat.h"

const ZarrDType& dtype);
};

bool operator==(const ZarrDType::BaseDType& a,
Collaborator

Typically these should be friend member functions; the != should be implemented inline in terms of the ==.

See tensorstore/data_type.h for examples.

/// unstructured scalar array, otherwise `std::nullopt`.
std::optional<DataType> GetScalarDataType(const ZarrDType& dtype);

/// Parses a Zarr 3 data type string.
Collaborator

Indentation.

if (!absl::SimpleAtoi(suffix, &num_bits) ||
num_bits == 0 ||
num_bits % 8 != 0) {
return absl::InvalidArgumentError(tensorstore::StrCat(
Collaborator

I'm in the process of making most of the error messages formatted via absl::StrFormat; this is an easy candidate as dtype is a string_view.

/// \file
/// Support for encoding/decoding the JSON metadata for zarr arrays
/// See: https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html#metadata

@laramiel (Collaborator), Feb 3, 2026

Needs

#include <stddef.h>
#include <array>
#include <string_view>

#include "tensorstore/driver/zarr3/dtype.h"
#include "tensorstore/driver/zarr3/name_configuration_json_binder.h"
#include "tensorstore/index.h"
#include "tensorstore/index_space/dimension_units.h"
Collaborator

Needs:

#include "tensorstore/util/dimension_set.h"

@@ -31,6 +33,7 @@
#include "tensorstore/driver/read_request.h"
Collaborator

Also include:

#include "tensorstore/data_type.h"

@@ -29,6 +31,7 @@
#include "absl/time/time.h"
Collaborator

Also include

#include "third_party/tensorstore/driver/zarr3/dtype.h"
#include "third_party/tensorstore/data_type.h"


#include <algorithm>
#include <cassert>
#include <cstddef>
Collaborator

Some platforms export different symbols from stddef.h and cstddef. Unless otherwise required (std::nullptr_t, I'm looking at you), we prefer stddef.h (and likewise, stdint.h).

@@ -121,14 +129,48 @@ class ZarrDriverSpec
"metadata",
jb::Validate(
[](const auto& options, auto* obj) {
Collaborator

Going to need to specify the lambda return type soon: -> absl::Status

@laramiel
Collaborator

laramiel commented Feb 3, 2026

FWIW I see a new assert failure in this PR:

[ RUN      ] StorageStatisticsTest.FullyLexicographicOrder
F0202 23:48:22.663513    9604 logging.cc:51] assert.h assertion failed at tensorstore/util/span.h:366 in span<element_type, dynamic_extent> tensorstore::span<const long>::subspan(ptrdiff_t, ptrdiff_t) const [T = const long, Extent = -1]: offset >= 0 && (count == dynamic_extent || (count >= 0 && offset + count <= size()))
*** Check failure stack trace: ***
...
    @     0x7fc7ec534374  __assert_fail
    @     0x7fca7366d7c3  tensorstore::span<>::subspan()
    @     0x7fca7366463b  tensorstore::internal_zarr3::(anonymous namespace)::DataCacheBase::FormatKey()
    @     0x7fca72931660  tensorstore::internal::GetChunkKeyRangesForRegularGridWithSemiLexicographicalKeys()::$_1::operator()()
    @     0x7fca72930eb0  absl::functional_internal::InvokeObject<>()
    @     0x7fca6b5f65fa  tensorstore::internal_grid_partition::(anonymous namespace)::GetGridCellRangesIterateHelper::IterateOverStridedSets()
    @     0x7fca6b5f5d57  tensorstore::internal_grid_partition::GetGridCellRanges()
    @     0x7fca72930c79  tensorstore::internal::GetChunkKeyRangesForRegularGridWithSemiLexicographicalKeys()
    @     0x7fca72d3c281  tensorstore::internal::GetStorageStatisticsForRegularGridWithSemiLexicographicalKeys()
    @     0x7fca73177d4c  tensorstore::internal_zarr3::GridStorageStatisticsChunkHandlerBase::Start()
    @     0x7fca7317737d  tensorstore::internal_zarr3::ZarrLeafChunkCache::GetStorageStatistics()

It's best to run all the tests as you're developing. I typically do bazelisk.py test -k //tensorstore/driver/zarr3/...

So far these are mostly build-based issues. I'll look at more of the structure later.
