Define versioning expectations for pilot conventions #139
@vincentsarago @kylebarron, it would be helpful if you could review and comment, if you have opinions here.
The GeoZarr specification is a document that **references** a set of conventions at pinned versions. It is versioned independently of those conventions, on its own editorial cadence: a new GeoZarr release may update prose or re-point a reference without any convention changing, and a convention may release a new version without forcing an immediate GeoZarr release.
Each GeoZarr release records the exact convention versions it references (in the release notes and the specification's normative references), so that a given GeoZarr version resolves to a specific, reproducible set of conventions.
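As an illustration of what "resolves to a specific, reproducible set of conventions" could look like in practice, here is a minimal Python sketch. The `GEOZARR_RELEASES` table, the `resolve_conventions` helper, and every version number in it are hypothetical and are not taken from the GeoZarr specification or any convention repository.

```python
# Hypothetical release table: maps a GeoZarr release to the convention
# versions it pins. None of these version numbers are real.
GEOZARR_RELEASES = {
    "1.0.0": {"proj": "1.0.0", "multiscales": "1.0.0"},
    # An editorial release that only re-points one reference.
    "1.0.1": {"proj": "1.0.1", "multiscales": "1.0.0"},
}


def resolve_conventions(geozarr_version: str) -> dict[str, str]:
    """Return the exact convention versions pinned by a GeoZarr release."""
    try:
        # Copy so callers cannot mutate the release table.
        return dict(GEOZARR_RELEASES[geozarr_version])
    except KeyError:
        raise ValueError(f"unknown GeoZarr release: {geozarr_version}") from None
```

Under this sketch, the same GeoZarr version always resolves to the same set of convention versions, which is the reproducibility property the paragraph describes.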
I am not sure that this is a good approach. Datasets may hang around for decades; should the conventions they use then fade out of GeoZarr? I would propose that once a convention is referenced in GeoZarr, it persists for all eternity (in the registry that is on the drawing board).
I have removed this portion as out of scope for this PR, so we can discuss it independently from the expectations for pilot conventions. Please expect a follow-up PR.
pvanlaake left a comment
Good addition overall, with a few suggestions inline. One global comment: how does any or all of this align with the Zarr convention work, i.e. those conventions not included in GeoZarr?
Why is the GeoZarr spec needed at all if it's just referencing other conventions? This is all feeling a lot like Python packaging:
In this metaphor, GeoZarr is a Python "metapackage", which itself doesn't have any code but simply references other things. There isn't a metapackage spec, because it's a packaging problem, not a spec problem. So why is a GeoZarr spec needed?
Well, I will not go as far as that! The spec is needed, but I don't think the spec should say which versions of the conventions are required. To me this will open a 🕳️. Take the STAC specification: an Item follows the STAC spec v1.0.0 but can have multiple extensions with specific versions, and it is still a valid STAC Item. If we say a GeoZarr dataset HAS TO use proj convention v1.0.0, what happens when proj v1.0.1 comes out? Will a provider that creates GeoZarr data have to wait for the GeoZarr spec to be updated? This will put a large burden on the GeoZarr maintainers to keep up to date with the conventions. Also, what happens if a provider wants to create a GeoZarr dataset with its own mix of versions (e.g. proj 1.1, multiscale 2.0)? Again, the GeoZarr spec won't cover this case. The
Yup, this is really the only option when vendoring other specifications, as GeoZarr does. If any of the conventions beneath GeoZarr makes a breaking change, then GeoZarr itself will be forced to adopt that breaking change, causing end users to constantly migrate their GeoZarr datasets and implementations whenever any of the underlying conventions change. I also prefer the STAC model for extension versioning. It's more lightweight and more flexible.
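To make the contrast in the two comments above concrete, here is a minimal Python sketch of the two validation models. It assumes a dataset declares a mapping of convention name to version; the `PINNED` table, both function names, and all version numbers are hypothetical and do not reflect how STAC or GeoZarr tooling is actually implemented.

```python
# Hypothetical pins a GeoZarr release might carry under the pinned model.
PINNED = {"proj": "1.0.0", "multiscales": "1.0.0"}


def valid_under_pinning(declared: dict[str, str]) -> bool:
    """Pinned model: each declared convention must match the pin exactly."""
    return all(PINNED.get(name) == version for name, version in declared.items())


def valid_under_stac_style(
    declared: dict[str, str], supported: dict[str, set[str]]
) -> bool:
    """STAC-style model: each convention is checked independently against
    the versions a given reader supports, with no central pin."""
    return all(
        version in supported.get(name, set()) for name, version in declared.items()
    )
```

In this sketch, a dataset declaring `{"proj": "1.0.1", "multiscales": "1.0.0"}` fails the pinned check until the pin is updated, but passes the STAC-style check for any reader that supports proj 1.0.1.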
I also like the STAC model for extension versioning because it's lighter and more flexible. My one hesitation is whether it actually resolves the maintenance-fatigue failure mode Patrick raised in #141, or just relocates it. From the outside, STAC's extension ecosystem looks like it leans heavily on a small number of maintainers, and I can't tell whether that's held together by structural mechanisms that would survive any individual stepping away, or by the current maintainers still having energy. If the latter, the risk is real but opaque to those of us benefiting from their work. Concretely for GeoZarr: pinning concentrates the burden at the spec layer (every convention patch risks forcing a re-release), while the conformance-class model pushes it out to each convention's own maintainers. The conformance-class model lowers central fatigue but multiplies the number of small maintainer pools that each carry their own bus-factor risk. Does STAC have an actual answer to that, or has it just not hit the wall yet?
Thanks for the engagement, folks! I've substantially narrowed the scope of this PR following this discussion. The PR now covers only convention-level versioning mechanics and permanence (Sections 7.1–7.3). This is the minimal set needed to unblock a proj v0.1 release (zarr-conventions/proj#19). Moved out, not dropped:
Deferred upstream: Per @pvanlaake's global comment about alignment with non-GeoZarr conventions: declaration mechanics (how versions are carried in
This PR documents the consensus on an initial convention version from zarr-conventions/proj#20, zarr-conventions/zarr-conventions-spec#29, zarr-conventions/zarr-conventions-spec#7, and #102. This consensus unlocks solving zarr-conventions/proj#19 via a v0.1 release.
cc participants in those discussions @emmanuelmathot @d-v-b @kylebarron @pvanlaake
This PR also intentionally defers items where a clear consensus has not been reached, so that solving zarr-conventions/proj#19 is not blocked by those discussions.