fix: apply drop_axes squeeze in partial decode path for sharding (#3691) #3763

Merged
d-v-b merged 4 commits into zarr-developers:main from abishop1990:fix/sharded-mixed-indexing-3691
Mar 13, 2026

@abishop1990
Contributor

Summary

Fixes #3691.

Mixed integer/list indexing on sharded arrays (e.g. arr[0:10, 0, [0, 1]]) raised:

ValueError: could not broadcast input array from shape (10,1,2) into shape (10,2)

Root Cause

When OrthogonalIndexer processes a selection that mixes slices, integers, and arrays, it passes chunk_selection through ix_() to set up orthogonal NumPy indexing. In the process, each integer index becomes a 1-element range, which leaves a size-1 dimension in the decoded chunk.
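The effect of ix_() on an integer index can be reproduced with NumPy alone. The sketch below is an illustration, not zarr's code: the chunk shape (10, 4, 3) and the variable names are assumptions, and the integer index 0 is written by hand as the one-element array [0].

```python
import numpy as np

# Stand-in for one decoded chunk (the shape is an assumption for this example).
chunk = np.arange(10 * 4 * 3).reshape(10, 4, 3)

# Orthogonal form of chunk[0:10, 0, [0, 1]]: the integer 0 is a 1-element range.
selection = np.ix_(np.arange(10), np.array([0]), np.array([0, 1]))
picked = chunk[selection]

# The integer axis survives as a size-1 dimension until it is squeezed out.
drop_axes = (1,)
squeezed = picked.squeeze(axis=drop_axes)

print(picked.shape, squeezed.shape)
```

After the squeeze, the result matches what plain NumPy returns for chunk[0:10, 0, [0, 1]].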

CodecPipeline.read_batch() has two paths:

  1. Non-partial decode (regular codecs): Calls squeeze(axis=drop_axes) on the decoded chunk to remove the size-1 integer dims ✅
  2. Partial decode (ShardingCodec): Does not squeeze drop_axes, so the size-1 integer dims reach the output assignment

ShardingCodec._decode_partial_single() receives the ix_()-transformed chunk_selection, which looks like pure fancy indexing to get_indexer(), so it routes to CoordinateIndexer. The result is reshaped to the broadcast coordinate shape (10, 1, 2) instead of (10, 2).
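A small NumPy check shows why the transformed selection is hard to tell apart from coordinate indexing. This is only an illustration of the shapes involved and does not reproduce the logic of get_indexer(); all names in it are assumptions made for the example.

```python
import numpy as np

# ix_()-style selection for arr[0:10, 0, [0, 1]].
selection = np.ix_(np.arange(10), np.array([0]), np.array([0, 1]))

# Every entry is now an integer array, so nothing marks the selection as orthogonal.
all_integer_arrays = all(
    isinstance(s, np.ndarray) and np.issubdtype(s.dtype, np.integer)
    for s in selection
)

# Coordinate-style indexing returns the broadcast shape of those arrays.
coordinate_shape = np.broadcast_shapes(*(s.shape for s in selection))

print(all_integer_arrays, coordinate_shape)
```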

Fix

Apply drop_axes squeeze to chunk_array in the partial decode branch of read_batch(), matching the non-partial path behaviour:

if drop_axes != ():
    chunk_array = chunk_array.squeeze(axis=drop_axes)
out[out_selection] = chunk_array
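The failure and the fix can both be reproduced without zarr. In this minimal sketch, out, out_selection, and chunk_array are plain NumPy stand-ins chosen for the example, not the buffers that read_batch() actually handles.

```python
import numpy as np

# Output buffer for arr[0:10, 0, [0, 1]] and the region this chunk fills.
out = np.zeros((10, 2))
out_selection = (slice(0, 10), slice(0, 2))

# Partially decoded chunk that still carries the size-1 integer axis.
chunk_array = np.arange(20.0).reshape(10, 1, 2)
drop_axes = (1,)

# Without the squeeze, the assignment fails as reported in the issue.
try:
    out[out_selection] = chunk_array
    error = None
except ValueError as exc:
    error = str(exc)

# With the squeeze, the shapes agree and the assignment succeeds.
if drop_axes != ():
    chunk_array = chunk_array.squeeze(axis=drop_axes)
out[out_selection] = chunk_array

print(error)
```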

Testing

Added test_sharding_mixed_integer_list_indexing that verifies:

  • Shape and data equality between chunked and sharded arrays for mixed indexing
  • Multiple integer axes (arr[0, 0, [0, 1, 2]])
  • Slice + integer + slice (arr[0:5, 1, 0:3])
tests/test_codecs/test_sharding.py  125 passed, 1 skipped
tests/test_indexing.py  149 passed, 1 skipped, 5 xfailed
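The three indexing patterns covered by the test can be checked against plain NumPy with a small helper. This is not the test from the PR: orthogonal_read is a hypothetical stand-in that mimics the ix_() transformation followed by the drop_axes squeeze, and the array shape is an assumption.

```python
import numpy as np


def orthogonal_read(data, selection):
    """Hypothetical stand-in: ix_() indexing followed by a drop_axes squeeze."""
    index_arrays = []
    drop_axes = []
    for axis, (sel, size) in enumerate(zip(selection, data.shape)):
        if isinstance(sel, slice):
            index_arrays.append(np.arange(*sel.indices(size)))
        elif isinstance(sel, int):
            # An integer becomes a 1-element range and is marked for dropping.
            index_arrays.append(np.array([sel]))
            drop_axes.append(axis)
        else:
            index_arrays.append(np.asarray(sel))
    result = data[np.ix_(*index_arrays)]
    return result.squeeze(axis=tuple(drop_axes)) if drop_axes else result


data = np.arange(10 * 4 * 3).reshape(10, 4, 3)

mixed = orthogonal_read(data, (slice(0, 10), 0, [0, 1]))
two_ints = orthogonal_read(data, (0, 0, [0, 1, 2]))
int_between_slices = orthogonal_read(data, (slice(0, 5), 1, slice(0, 3)))

print(mixed.shape, two_ints.shape, int_between_slices.shape)
```

For these three selections the helper agrees with direct NumPy indexing, which is the same shape-and-data comparison the test makes between chunked and sharded arrays.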

When reading sharded arrays with mixed integer/list indexing (e.g.
arr[0:10, 0, [0, 1]]), the outer OrthogonalIndexer produces chunk
selections that have been ix_()-transformed for orthogonal advanced
indexing. Integer indices become single-element ranges (size-1 dims)
via ix_() to enable NumPy orthogonal indexing.

In CodecPipeline.read_batch(), the non-partial path correctly applies
drop_axes.squeeze() to remove those size-1 integer dimensions before
writing to the output buffer. However, the partial decode path (used
by ShardingCodec) was missing this squeeze step.

Fixes zarr-developers#3691

Also: Fix line length violation in test error message to comply with
100 character linting limit.
@abishop1990 abishop1990 force-pushed the fix/sharded-mixed-indexing-3691 branch from a65a546 to 07b6fb7 Compare March 11, 2026 09:40
Cipher and others added 3 commits March 11, 2026 02:42
…rding test

The test uses complex indexing patterns (mixed integer/list indices) that
mypy's zarr.Array stubs don't recognize as valid. Add specific type ignore
comments for [index] and [union-attr] errors to suppress false positives.
…arding test

- Line 542: Fix assert accessing .shape by changing from [index] to [union-attr]
- Line 544: Add missing type-ignore[union-attr] for f-string .shape access
- Lines 554-555: Remove unused type-ignore[index] comments on assignments

The mypy errors were caused by indexing operations returning union types that
include scalar types (int, float, etc.), which don't have a .shape attribute.
The proper fix uses type-ignore[union-attr] for attribute access, not [index].
@d-v-b d-v-b enabled auto-merge (squash) March 13, 2026 15:58
@d-v-b d-v-b merged commit 93dbf78 into zarr-developers:main Mar 13, 2026
24 checks passed
Development

Successfully merging this pull request may close these issues.

Shape mismatch with mixed integer/list indexing on Sharded arrays

2 participants