L3: Consistency validation
Terms
- Consistency check
A validation check that reads array data and verifies that the values in one array are logically consistent with values in another. L3 checks read all chunks of all arrays at each level.
- Fragment-index arithmetic check
Verification that the fragment-index blob in vertex_fragments/<chunk> is self-consistent: magic and version are correct, the range bitmap’s popcount matches the header’s R, CSR offsets are monotone, and every range fragment’s [start, start + count) and every explicit fragment’s indices lie in [0, vertex_count_in_chunk).
- Manifest integrity
Verification that every block in object_index/data references a chunk that exists at the level and a fragment that exists in that chunk’s vertex_fragments/ blob, and that decoding the manifest yields exactly one fragment per fragment reference (no out-of-range indices).
- Attribute alignment
Verification that the length of each attributes/<name>/ chunk slice equals the length of the corresponding vertices/ chunk slice. Misaligned attributes indicate a bug in the writer’s vertex-reordering logic.
Introduction
L3 validation reads array data and checks that the store’s internal structure is logically consistent. It is the first level that can detect bugs introduced by incorrect writer implementations, rechunking errors, or manual store modifications.
L3 is substantially more expensive than L1–L2 because it reads all chunks. For a 100 GB store, L3 may take 5–30 minutes depending on storage bandwidth. For development and CI, run L3 on small synthetic stores; run L1–L2 on full-size production stores unless a specific consistency issue is suspected.
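As a rough sanity check on the 5–30 minute figure, the sketch below computes the sustained read bandwidth that range implies for a 100 GB store. It assumes L3 runtime is dominated by reading every chunk once and ignores decode and comparison costs; the function name is illustrative and not part of any library.

```python
def implied_bandwidth_mb_per_s(store_bytes: float, minutes: float) -> float:
    """Sustained read rate (MB/s) needed to read store_bytes once in `minutes`."""
    return store_bytes / (minutes * 60) / 1e6


store_bytes = 100e9  # 100 GB, decimal gigabytes (assumption)

fast = implied_bandwidth_mb_per_s(store_bytes, 5)   # upper end of the range
slow = implied_bandwidth_mb_per_s(store_bytes, 30)  # lower end of the range

print(f"{slow:.0f}-{fast:.0f} MB/s")
```

Under these assumptions, the quoted range corresponds to roughly 56 to 333 MB/s of sustained read throughput.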
Technical reference
Fragment-index arithmetic checks
For every non-empty chunk in vertices/ and vertex_fragments/:
| Check | Rule | Failure type |
|---|---|---|
| | First 4 bytes equal | Error |
| | Header | Error |
| | Header | Error |
| | Bitmap bytes beyond | Warning |
| | | Error |
| | Every range fragment’s | Error |
| | Every explicit fragment’s indices lie in | Error |
| | Every entry of | Error |
| | At level 0 with | Error |
For link_fragments/<chunk>, the same checks apply with
vertex_count_in_chunk replaced by link_count_in_chunk (rows of
links/0/<chunk>).
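The arithmetic rules named under Terms (popcount equals the header’s R, monotone CSR offsets, in-bounds range and explicit fragments) can be sketched as follows. This is a minimal illustration that operates on already-decoded values, because the binary layout of the blob is not described in this section. The function and parameter names are hypothetical, and "monotone" is read here as non-decreasing, which is an assumption.

```python
from typing import List, Sequence, Tuple


def check_fragment_index(
    range_bitmap: bytes,
    header_r: int,
    csr_offsets: Sequence[int],
    range_fragments: Sequence[Tuple[int, int]],   # (start, count) pairs
    explicit_fragments: Sequence[Sequence[int]],  # explicit index lists
    vertex_count_in_chunk: int,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    errors: List[str] = []

    # The number of set bits in the range bitmap must equal the header's R.
    popcount = sum(bin(byte).count("1") for byte in range_bitmap)
    if popcount != header_r:
        errors.append(f"popcount {popcount} != header R {header_r}")

    # CSR offsets must never decrease (assumed meaning of "monotone").
    for i in range(1, len(csr_offsets)):
        if csr_offsets[i] < csr_offsets[i - 1]:
            errors.append(f"CSR offsets decrease at position {i}")

    # Every range fragment [start, start + count) must fit in the chunk.
    for k, (start, count) in enumerate(range_fragments):
        if start < 0 or count < 0 or start + count > vertex_count_in_chunk:
            errors.append(f"range fragment {k} out of bounds")

    # Every explicit index must lie in [0, vertex_count_in_chunk).
    for k, indices in enumerate(explicit_fragments):
        if any(not 0 <= i < vertex_count_in_chunk for i in indices):
            errors.append(f"explicit fragment {k} has an out-of-range index")

    return errors
```

For link_fragments/<chunk>, the same sketch applies with link_count_in_chunk passed in place of vertex_count_in_chunk.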
The frag_vg_order check is the most expensive: it requires computing the
bin flat index of every vertex and comparing to the writer-emitted fragment
order. It runs by default at L3 but can be disabled with
skip_vg_order_check=True for large stores where only fragment-index
arithmetic is needed:
result = validate("scan.zarrvectors", level=3, skip_vg_order_check=True)
Attribute alignment checks
For every chunk at every level:
| Check | Rule | Failure type |
|---|---|---|
| | | Error |
| | No NaN values in float attributes unless | Warning |
| | | Error |
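Attribute alignment, as defined under Terms, reduces to a length comparison per chunk. The sketch below shows that comparison on plain Python sequences, with a separate NaN counter. All names are hypothetical, and the condition under which NaN values are permitted is not reproduced here, so that policy is left to the caller.

```python
import math
from typing import Dict, List, Sequence


def check_attribute_alignment(
    vertices_chunk: Sequence,
    attribute_chunks: Dict[str, Sequence],
) -> List[str]:
    """Report every attribute whose chunk slice length differs from the vertex slice."""
    errors: List[str] = []
    expected = len(vertices_chunk)
    for name, values in attribute_chunks.items():
        if len(values) != expected:
            errors.append(
                f"attributes/{name}/: {len(values)} rows, expected {expected}"
            )
    return errors


def count_nans(values: Sequence) -> int:
    """Count NaN entries among float values; non-float entries are ignored."""
    return sum(1 for v in values if isinstance(v, float) and math.isnan(v))
```

A misaligned attribute reported by the first function points at the writer’s vertex-reordering logic, as noted under Terms.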
Object index checks
| Check | Rule | Failure type |
|---|---|---|
| | | Error |
| | Every per-object manifest blob decodes via | Error |
| | Every block’s | Error |
| | Every block’s | Error |
| | When | Error |
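Manifest integrity, as defined under Terms, requires every block to reference an existing chunk and an existing fragment within that chunk. A minimal sketch of that lookup follows, assuming the manifest has already been decoded into (chunk key, fragment id) pairs. That representation and all names are hypothetical, not the store’s actual encoding.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def check_manifest_blocks(
    blocks: Iterable[Tuple[Hashable, int]],
    fragments_per_chunk: Dict[Hashable, int],
) -> List[Tuple[int, str]]:
    """Return (block position, reason) for every block with a dangling reference."""
    errors: List[Tuple[int, str]] = []
    for position, (chunk_key, fragment_id) in enumerate(blocks):
        if chunk_key not in fragments_per_chunk:
            # The referenced chunk does not exist at this level.
            errors.append((position, "missing chunk"))
        elif not 0 <= fragment_id < fragments_per_chunk[chunk_key]:
            # The chunk exists, but the fragment id is out of range for it.
            errors.append((position, "missing fragment"))
    return errors
```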
Cross-chunk link checks
The L3 walker enumerates every <delta> subdir under cross_chunk_links/ via list_cross_link_deltas (see zarr_vectors/validate/consistency.py) and validates each independently.
| Check | Rule | Failure type |
|---|---|---|
| | Every endpoint’s chunk-coord tuple has length | Error |
| | For every | Error |
| | For | Error |
| | For | Error |
| | For every | Error |
| | For polyline/streamline stores at | Error |
| | For undirected graph stores at | Error |
Edge index checks
For polyline, streamline, graph, and skeleton types:
| Check | Rule | Failure type |
|---|---|---|
| | All local vertex indices in | Error |
| | No edge has | Error |
| | For polyline/streamline: every non-terminal vertex has exactly one outgoing intra-chunk edge (or a cross-chunk link as continuation) | Error |
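The bounds rule and the polyline continuity rule can be sketched on a decoded edge list. This is one possible reading of the continuity rule: a non-terminal vertex passes if it has exactly one outgoing intra-chunk edge, or no outgoing edge but a cross-chunk link that continues it. The function name, the (src, dst) tuple layout, and the terminal and continued sets are all hypothetical.

```python
from collections import Counter
from typing import AbstractSet, List, Sequence, Tuple


def check_edges(
    edges: Sequence[Tuple[int, int]],
    vertex_count: int,
    terminal: AbstractSet[int] = frozenset(),
    continued: AbstractSet[int] = frozenset(),
) -> List[str]:
    """Check local edge indices and polyline out-degree for one chunk."""
    errors: List[str] = []

    # Every local index must lie in [0, vertex_count).
    for row, (src, dst) in enumerate(edges):
        if not (0 <= src < vertex_count and 0 <= dst < vertex_count):
            errors.append(f"edge {row}: index outside [0, {vertex_count})")

    # Count outgoing intra-chunk edges per source vertex.
    out_degree = Counter(src for src, _ in edges)
    for vertex in range(vertex_count):
        if vertex in terminal:
            continue
        degree = out_degree.get(vertex, 0)
        if not (degree == 1 or (degree == 0 and vertex in continued)):
            errors.append(f"vertex {vertex}: {degree} outgoing intra-chunk edges")

    return errors
```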
Mesh face checks
| Check | Rule | Failure type |
|---|---|---|
| | All positive face indices in | Error |
| | All negative face indices decode to valid global vertex IDs | Error |
| | No face has two or more identical vertex indices | Error |
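The last rule in the table (no face has two or more identical vertex indices) is a degenerate-face test. A minimal sketch follows, assuming faces are available as tuples of raw indices; the function name is hypothetical, and decoding negative indices to global vertex IDs is not attempted because that encoding is not described here.

```python
from typing import List, Sequence


def degenerate_face_rows(faces: Sequence[Sequence[int]]) -> List[int]:
    """Return the row numbers of faces that repeat a vertex index."""
    # A face is degenerate when its set of indices is smaller than the face itself.
    return [row for row, face in enumerate(faces) if len(set(face)) < len(face)]
```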
Pyramid consistency checks (L3 component)
| Check | Rule | Failure type |
|---|---|---|
| | Total vertex count at level N ≤ total vertex count at level N-1 | Error |
| | Attribute name sets are identical across all levels | Warning |
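Both pyramid rules can be checked from per-level summaries alone. The sketch below assumes the total vertex count and the attribute names of each level have already been collected into lists indexed by level. The function name and input shapes are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def check_pyramid(
    vertex_counts: Sequence[int],
    attribute_names: Sequence[Iterable[str]],
) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the two pyramid consistency rules."""
    errors: List[str] = []
    warnings: List[str] = []

    # Level N must not contain more vertices than level N-1.
    for level in range(1, len(vertex_counts)):
        if vertex_counts[level] > vertex_counts[level - 1]:
            errors.append(
                f"level {level} ({vertex_counts[level]}) > "
                f"level {level - 1} ({vertex_counts[level - 1]})"
            )

    # Attribute name sets should match level 0 at every level.
    name_sets = [set(names) for names in attribute_names]
    for level, names in enumerate(name_sets[1:], start=1):
        if names != name_sets[0]:
            warnings.append(f"level {level} attribute names differ from level 0")

    return errors, warnings
```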
Example L3 report (abbreviated)
Level 3 validation of scan.zarrvectors
========================================
Checking 0 (125 chunks)…
PASS frag_magic [0] all chunks: magic 0x5A56_4647 ✓
PASS frag_popcount [0] all chunks: R == popcount(bitmap) ✓
PASS frag_range_in_bounds [0] all range fragments in bounds
PASS frag_vg_order [0] all vertices in correct fragment order
PASS attr_length_matches [0] intensity/: 125/125 chunks aligned
ERROR ccl_different_chunks [0] 2 links found where src chunk == dst chunk
(links at rows 14502, 87331)
PASS edge_indices_in_bounds [0] all intra-chunk edges valid
PASS vertex_count_non_increasing level 1 (82453) ≤ level 0 (100000) ✓
Level 3 validation: FAIL — 47 passed, 0 warnings, 1 error
The error above (ccl_different_chunks) indicates that the writer
erroneously generated cross-chunk links for same-chunk vertex pairs — the
most common correctness bug in cross-chunk link generation. See
Cross-chunk links for the
correct generation algorithm.
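The condition behind ccl_different_chunks can be sketched as a row-wise comparison of source and destination chunk coordinates. The sketch assumes the links have been decoded into two parallel sequences of chunk-coordinate tuples. That layout and the function name are hypothetical and are not taken from zarr_vectors.

```python
from typing import List, Sequence


def same_chunk_link_rows(
    src_chunks: Sequence[Sequence[int]],
    dst_chunks: Sequence[Sequence[int]],
) -> List[int]:
    """Return the rows where a cross-chunk link starts and ends in the same chunk."""
    return [
        row
        for row, (src, dst) in enumerate(zip(src_chunks, dst_chunks))
        if tuple(src) == tuple(dst)
    ]
```

Any row this function returns is a link that should have been stored as an intra-chunk edge rather than a cross-chunk link.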