Quickstart
This page covers the most common operations in zarr-vectors: writing a
point cloud, reading it back, performing a spatial bounding-box query, and
writing a set of streamlines with group labels. All examples use synthetic
data and require only the base install.
Point clouds
Writing
import numpy as np
from zarr_vectors.types.points import write_points
rng = np.random.default_rng(42)
# 100 000 points uniformly distributed in a cube of side 1 000 μm
positions = rng.uniform(0, 1000, size=(100_000, 3)).astype(np.float32)
intensity = rng.uniform(0, 1, size=100_000).astype(np.float32)
label = rng.integers(0, 5, size=100_000).astype(np.int32)
write_points(
    "scan.zarrvectors",
    positions,
    chunk_shape=(200.0, 200.0, 200.0),  # I/O unit: one file per chunk
    bin_shape=(50.0, 50.0, 50.0),       # spatial index unit: 4 x 4 x 4 = 64 bins per chunk
    attributes={"intensity": intensity, "label": label},
)
chunk_shape and bin_shape are in the same physical units as positions
(here, micrometres). The store is written as a directory tree called
scan.zarrvectors/. See Concepts for an explanation of the
chunk/bin distinction.
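The arithmetic behind the two shapes can be checked by hand. The sketch below uses only NumPy and does not call zarr-vectors. It assumes that chunks and bins form regular grids anchored at the origin, which this page does not state explicitly, and the helper `grid_index` is a hypothetical name introduced for illustration.

```python
import numpy as np

chunk_shape = np.array([200.0, 200.0, 200.0])
bin_shape = np.array([50.0, 50.0, 50.0])
extent = np.array([1000.0, 1000.0, 1000.0])  # volume used in the example above

# Bins per chunk along each axis, and in total
bins_per_axis = np.round(chunk_shape / bin_shape).astype(int)
bins_per_chunk = int(bins_per_axis.prod())

# Chunks needed to cover the whole volume
chunks_per_axis = np.ceil(extent / chunk_shape).astype(int)
n_chunks = int(chunks_per_axis.prod())


def grid_index(point, cell_shape):
    """Hypothetical helper: grid cell containing `point`.

    Assumes a regular grid anchored at the origin.
    """
    return tuple(int(i) for i in np.floor(np.asarray(point) / cell_shape))


point = (430.0, 10.0, 999.0)
chunk_idx = grid_index(point, chunk_shape)
bin_idx = grid_index(point, bin_shape)

print(bins_per_chunk)  # 64
print(n_chunks)        # 125
print(chunk_idx)       # (2, 0, 4)
print(bin_idx)         # (8, 0, 19)
```

With these shapes, a query that touches a single bin needs 1/64 of the data held in its chunk, which is the motivation for keeping the two sizes separate.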
Reading all data
from zarr_vectors.types.points import read_points
result = read_points("scan.zarrvectors")
print(result["vertex_count"]) # 100000
print(result["positions"].shape) # (100000, 3)
print(result["attributes"]["intensity"].shape) # (100000,)
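Because each attribute array has one entry per vertex, a common next step is to filter positions by attribute value. The sketch below builds a small stand-in dictionary with the same keys as the result shown above, so it runs without a store on disk. It assumes that attribute arrays are aligned row for row with `positions`, which the shapes above suggest but this page does not state.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000

# Stand-in for the dictionary returned by read_points (same keys as above)
result = {
    "vertex_count": n,
    "positions": rng.uniform(0, 1000, size=(n, 3)).astype(np.float32),
    "attributes": {
        "intensity": rng.uniform(0, 1, size=n).astype(np.float32),
        "label": rng.integers(0, 5, size=n).astype(np.int32),
    },
}

attrs = result["attributes"]

# Boolean mask: label 2 and intensity above 0.5
mask = (attrs["label"] == 2) & (attrs["intensity"] > 0.5)
selected = result["positions"][mask]

print(selected.shape)  # (number of matching points, 3)
```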
Spatial bounding-box query
Spatial queries target individual bins rather than entire chunks, so only the bins that overlap the requested region are read from disk.
result = read_points(
    "scan.zarrvectors",
    bbox=(np.array([100.0, 100.0, 100.0]),
          np.array([200.0, 200.0, 200.0])),
)
print(result["vertex_count"]) # number of points inside the 100 μm cube
print(result["positions"].shape)
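One way to sanity-check a query like the one above is to compare its count against a brute-force mask over the original array. The sketch below uses only NumPy and regenerates the synthetic positions from the writing example. Whether the library treats the upper bound as inclusive or exclusive is not stated on this page, so the half-open interval used here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(42)
positions = rng.uniform(0, 1000, size=(100_000, 3)).astype(np.float32)

lo = np.array([100.0, 100.0, 100.0])
hi = np.array([200.0, 200.0, 200.0])

# Assumption: lower bound inclusive, upper bound exclusive
inside = np.all((positions >= lo) & (positions < hi), axis=1)
expected_count = int(inside.sum())

print(expected_count)
```

The box covers 1/1000 of the volume, so for 100 000 uniformly distributed points the count should be close to 100.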
Reading a coarser resolution level
If you have built a multi-resolution pyramid (see Building pyramids), pass
level= to read from a coarser level:
result = read_points("scan.zarrvectors", level=1)
print(result["vertex_count"]) # fewer points — coarser bin resolution
Streamlines
Writing
import numpy as np
from zarr_vectors.types.polylines import write_polylines
rng = np.random.default_rng(0)
# 500 streamlines, each with 40 vertices, walking through 3-D space
streamlines = [
    rng.normal(0, 50, size=(40, 3)).cumsum(axis=0).astype(np.float32)
    for _ in range(500)
]
write_polylines(
    "tracts.zarrvectors",
    streamlines,
    chunk_shape=(200.0, 200.0, 200.0),
    bin_shape=(50.0, 50.0, 50.0),
    # Optional: assign streamlines to groups (group ID -> list of streamline indices)
    groups={0: list(range(250)), 1: list(range(250, 500))},
)
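In practice, group membership often starts as one label per streamline rather than as explicit index lists. The sketch below shows one way to turn such a label array into a dictionary of the same shape as the `groups` argument above (integer group ID mapped to a list of streamline indices). It uses only NumPy, and the label array `bundle_label` is a hypothetical input invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_streamlines = 500

# Hypothetical input: one integer bundle label per streamline
bundle_label = rng.integers(0, 3, size=n_streamlines)

# Map each group ID to the indices of the streamlines carrying that label
groups = {
    int(g): np.flatnonzero(bundle_label == g).tolist()
    for g in np.unique(bundle_label)
}

for group_id, members in groups.items():
    print(group_id, len(members))
```

Converting keys and indices to plain Python integers keeps the dictionary free of NumPy scalar types, which is a cautious choice when the consumer's accepted types are not documented.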
Reading by group
from zarr_vectors.types.polylines import read_polylines
result = read_polylines("tracts.zarrvectors", group_ids=[0])
print(result["polyline_count"]) # 250
Reading a single object by ID
result = read_polylines("tracts.zarrvectors", object_ids=[42])
print(result["polyline_count"]) # 1
print(result["polylines"][0].shape) # (40, 3) — the 40 vertices of streamline 42
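Each polyline comes back as an array of shape (n_vertices, 3), so per-streamline measurements reduce to ordinary NumPy operations. As a minimal sketch, the function below computes the arc length of one polyline. The name `polyline_length` is hypothetical and not part of zarr-vectors, and the input here is a hand-made array rather than data read from a store.

```python
import numpy as np


def polyline_length(vertices):
    """Sum of the Euclidean lengths of consecutive segments."""
    segments = np.diff(vertices, axis=0)  # shape (n_vertices - 1, 3)
    return float(np.linalg.norm(segments, axis=1).sum())


# Two segments of length 5 and 12
line = np.array([[0, 0, 0], [3, 4, 0], [3, 4, 12]], dtype=np.float32)
length = polyline_length(line)

print(length)  # 17.0
```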
Format converters
Converters for ingesting third-party formats (LAS, PLY, CSV, TRK, TCK, TRX,
SWC, GraphML, OBJ, STL) and exporting back to them live in the companion
package zarr-vectors-tools, alongside the zarr-vectors CLI.
Validation
from zarr_vectors.validate import validate
result = validate("scan.zarrvectors", level=5)
print(result.summary())
# Level 5 validation: PASS
# 42 passed, 0 warnings, 0 errors
Validation levels 1–5 check progressively deeper properties of the store. See Validation for details.
Next steps
Concepts — understand chunks, bins, vertex groups, and the multiscale pyramid before working with larger datasets.
Data type tutorials — deeper walkthroughs for each geometry type.
Building pyramids — add multi-resolution levels for level-of-detail rendering.
Neuroglancer integration — visualise your stores in Neuroglancer using
zv-ngtools.