While talking with people about a wheel 2.0 design, it became very clear that before we could talk about what a wheel 2.0 could look like, we needed to talk about how to get there (beyond just incrementing the wheel major version number!).
This PEP defines a path to making wheel evolution easier, so that future PEPs can focus on the changes to the format and not get bogged down by details of how to deploy the update.
My hope is that once we have a compatibility story, we can move forward with discussions about what a wheel 2.0 should look like. If you're interested in discussing that, come join us at the wheel-next ideas repo or in the #wheel-next channel on Discord!
An immediate but very minor nit. If we're going to require a new wheel file extension, and it's not going to be 3 characters anyway, why not just use .wheel?
I'm still digesting the actually interesting parts of the PEP.
The x suffix in whlx is intended to evoke an advancement of the whl format, but I have no attachment to the naming. I figured that particular detail will get bikeshed here a fair bit. I'm fine with wheel, but for the purposes of the main content of the PEP I don't think it really matters.
One thing not covered in the PEP is why not store the major version in the file extension, e.g., whl2?
Actually, using .whlx threw me initially as I thought it was a placeholder for the major version digit, so I would either make that explicit or just make the switch to .wheel now in case anyone else gets confused.
I don't want to reveal any spoilers, but the plan is that there is an extension mechanism such that you won't need to rev the major version number of the wheel spec in a backward-incompatible way again.
Yeah, I definitely need to add this to rejected ideas. I have another draft PEP (that Barry alluded to) that I hope to polish soon that would introduce feature flags to wheels (with similar semantics as a major version bump, but allowing for clearer communication of intent). I think feature flags better encode the idea behind some changes, but others definitely seem like a real major version bump.
I think there are three issues with using whl2:
You need to encode the major version in the wheel name going forward, otherwise you'd have the confusing situation of a wheel of major version 3 named whl2
Part of the brittleness of the current wheel spec comes from encoding so much information into the filename. Filenames aren't well suited for storing complicated structured information. I hope with wheel 2, we can have a wheel format that encodes not much more than the name and version of the distribution. So putting the wheel version into the name goes against this goal.
It becomes a lot harder to define "what is a wheel?" and it requires tools to adapt to every new wheel major version. If I'm making a Windows file association for wheels, how many versions do I register? How forwards compatible is that?
I'll jump on the bike shed early. Please let's pick an extension that's not pronounced "wheel", which is how everyone I've talked with pronounces "whl". "Did you mean a '.wheel' file or a '.whl' file?" sounds like confusion waiting to happen.
It would stay .wheel or .whlx or whatever we bikeshed going forward. I will be explicit about this.
Thank you!
I chose this invariant because tools will need to read .dist-info/METADATA or .dist-info/WHEEL to be able to tell what the wheel major version is and if they can install a file on disk. Unless we go with .whl2, whl3, etc., this will need to continue to work for all future versions of the wheel specification. I should probably clarify the rationale for this in the PEP.
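To make the invariant concrete, here is a minimal sketch of the check an installer would keep doing: open the archive as a zip, find the `.dist-info/WHEEL` file, and compare the major part of `Wheel-Version` against what the tool understands. The helper names (`read_wheel_version`, `can_install`, `make_demo_wheel`) and the demo archive are made up for illustration; this is not how pip or uv actually structure it.

```python
import io
import zipfile
from email.parser import Parser


def read_wheel_version(wheel_file):
    """Return Wheel-Version from .dist-info/WHEEL as a (major, minor) tuple."""
    with zipfile.ZipFile(wheel_file) as zf:
        # Look for the single top-level <name>-<version>.dist-info/WHEEL entry.
        candidates = [
            name for name in zf.namelist()
            if name.count("/") == 1 and name.endswith(".dist-info/WHEEL")
        ]
        if len(candidates) != 1:
            raise ValueError("expected exactly one .dist-info/WHEEL file")
        text = zf.read(candidates[0]).decode("utf-8")
    # The WHEEL file uses the same "Key: value" header layout as METADATA.
    version = Parser().parsestr(text)["Wheel-Version"]
    if version is None:
        raise ValueError("WHEEL file has no Wheel-Version field")
    major, minor = version.strip().split(".")
    return int(major), int(minor)


def can_install(wheel_file, supported_major=1):
    """Refuse wheels whose major version is newer than the tool supports."""
    major, _minor = read_wheel_version(wheel_file)
    return major <= supported_major


def make_demo_wheel(wheel_version):
    """Hypothetical helper: build a tiny in-memory archive for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "demo-1.0.dist-info/WHEEL",
            f"Wheel-Version: {wheel_version}\nGenerator: demo\n",
        )
    buf.seek(0)
    return buf


print(read_wheel_version(make_demo_wheel("1.0")))
print(can_install(make_demo_wheel("2.0")))
```

The point is only that this lookup path (outer zip, then `.dist-info/WHEEL`) is the part that has to keep working for every future wheel version, whatever the extension ends up being.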
I can understand that this mechanism needs to be invariant moving forward. If there's any reason at all to switch to something else it would need to be now, while changing the extension.
That's not to say that it should change; the only other option I can think of is a tarball, and that doesn't seem obviously better.
Maybe a tar (with metadata files at the beginning of the archive if reading some files is desirable without having to read the whole archive) combined with a stream compression algorithm like zstd? I have no idea though how much reduction in file size this would actually give for real world packages compared to zip.
Wouldn't this be the perfect time to switch to .dist-info/METADATA.json? Since it has a different extension, an installer needs to know about the extension to read it, so might as well change now. Though a METADATA file could/would be required as well for a while for extraction into site-packages. Maybe that could be Python version specific?
Agreed the change would need to happen now. I don't think we should change it, however, for a few reasons:
A future wheel version could provide better compression by putting non-metadata files into a .tar.zstd or some other compressed tar file and require installers decompress that in some way. The metadata would be accessible the exact same as past versions, but large shared libraries or other content could be compressed significantly. The outer compression format does not need to change to take advantage of compression.
I don't think it's a good idea to boil the oceans on the format. We could make something completely different from a wheel, but that would require significantly more work for tools, and a much more involved migration. Unless there is some reason an outer zip file is a problem (see next point to the contrary), I don't think it makes sense to change things.
zip files have some nice features tar files don't, such as random access. pip and uv both use this to do HTTP range requests when supported if an index doesn't serve the metadata file, and this wouldn't be possible with an outer tar file.
I'll include these points in a rejected idea about changing the outer wheel format.
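For anyone who hasn't seen the random access point in practice, here is a rough, offline sketch. It does not do real HTTP range requests; instead it wraps an in-memory archive in a reader that counts bytes, to show that `zipfile` only touches the central directory at the end of the file plus the one member you ask for. `CountingBytesIO`, `build_demo_wheel`, and `read_metadata_only` are hypothetical names for this demo, and the 1 MB "extension module" is just zero bytes.

```python
import io
import zipfile


class CountingBytesIO(io.BytesIO):
    """In-memory file that counts how many bytes are actually read."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_demo_wheel():
    """Build a fake wheel: one large stored member plus a small METADATA."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:  # ZIP_STORED is the default
        zf.writestr("demo/big_extension.so", bytes(1_000_000))
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
        )
    return buf.getvalue()


def read_metadata_only(wheel_bytes):
    """Read just METADATA and report how many bytes had to be read."""
    source = CountingBytesIO(wheel_bytes)
    with zipfile.ZipFile(source) as zf:
        metadata = zf.read("demo-1.0.dist-info/METADATA").decode("utf-8")
    return metadata, source.bytes_read


wheel = build_demo_wheel()
metadata, bytes_read = read_metadata_only(wheel)
print(f"archive size: {len(wheel)} bytes, bytes read: {bytes_read}")
```

With a tar stream there is no index at a known location, so a reader would have to scan from the start until it finds the member it wants. That is the property the range-request trick depends on.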
I think that is a topic that would best be put in a wheel 2.0 PEP specifying changes to the file format, not this PEP that specifies how to change the file format in such a PEP. When I do write up the 2.0 format spec, I plan on including a metadata.json file.
Sorry for triggering a big bike shedding argument straight off, but I agree, the rest seems good.
One substantive question I have is around the other places core metadata is stored. Would metadata in sdists and on disk in installed distributions be expected to omit the wheel version, or will it be optional but meaningless in those places? This PEP will need to more formally define the new metadata item (in the same sort of format as the existing definitions - for reference, "Dynamic" is an example of an existing item that is only meaningful in one file format).
I was expecting the new extension to be bikeshed, so no worries. Glad you like the rest! Would you be content with .whlx if I added a section going over some of the mentioned alternatives in rejected ideas and clarified that x does not mean the major version when introducing .whlx?
My thinking on this is that it should only be allowed in wheels, served from an index via PEP 658 (when pulled from a wheel), or potentially on disk in the installed directory. I'm not as sure about the last one as the other two. It's not a big ask for installers to just strip it out at install time, but maybe someone will want to inspect the information? I don't think there's a reason not to let it be installed into .dist-info/METADATA, so I think I would err on the side of not making the installation process more complicated.
FWIW I would personally avoid saying that a field MUST NOT appear in another context, but only that it MUST NOT be used to change the interpretation of that format, if found.
If you say MUST NOT, then any tool that wants to validate will need to enforce that rule even if it makes no difference to the operation of that tool. Ignoring extraneous metadata is a simple, forward-compatible default.
My main dislike of the x is that it feels reminiscent of its use in .docx and .xlsx to mean "extended version", and in Windows SuchAndSuchEx APIs with the same meaning. Because it's common in Microsoft products, I have a vague feeling that it's some sort of "corporate over-engineering". It's also a dead end, in that if we ever need to do this again, .whlxx just feels silly.
I can certainly live with it, but my main complaint is why not use a readable extension like .wheel? @ericvsmith mentioned the potential for confusion when speaking because .whl and .wheel could be pronounced the same, and I guess that's a fair point, but I hope we don't all end up referring to "Wheel-X" files, so I think verbal distinction is just something we'll need to sort out as we go along ("New wheel" works just fine for me...)
It is bikeshedding, though, and if you say the PEP's going to choose .whlx, then that's your right as the author. I appreciate you taking the question seriously, but I'm not going to make a fuss about it.
My feeling is:
It should be prohibited in sdists.
It should be mandatory in (new) wheels.
PEP 658 metadata files have to match what's in the file itself - the PEP says:
The metadata must only be served for standards-compliant distributions such as wheels [wheel] and sdists [sdist], and must be identical to the distribution's canonical metadata file, such as a wheel's METADATA file in the .dist-info directory [dist-info].
The hard one is installed distributions. I really don't want to add complexity to the process of installing a wheel - at the moment, it's "unpack and copy a bunch of files". If we require modifying the metadata, that means that file needs to be rewritten, and the RECORD file needs modifying to correct the size and hash of the METADATA file. And I bet we'll end up with mistakes being made resulting in installations where RECORD wasn't corrected.
Overall, I think we should require that installing a distribution from a wheel must continue to copy METADATA and RECORD unchanged. So the wheel version metadata may be present in an installed distribution. However, while there's no standard saying how to install a package from anything other than a wheel, there's nothing prohibiting a user doing that manually. So I think we have to say that the wheel version metadata is optional when a package was not installed from a wheel.
I wonder how distributions will view this? I believe they create their distro packages by building and installing wheels into an isolated area, and then repackaging that into a distro-specific format. I could interpret that as being a case of not installing from a wheel, although I doubt anyone would actually care.
Long story short - IMO for installed packages the wheel version metadata should be optional, but the spec for installing from a new-style wheel should explicitly state that METADATA (and its RECORD entry) must be copied unchanged (so that the wheel version is always present for packages installed from a wheel).
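To illustrate why stripping the field at install time isn't free, here is a small sketch of how a RECORD line is derived: the path, then `sha256=` followed by the urlsafe base64 digest with the padding removed, then the size in bytes. The field name `Hypothetical-Wheel-Field` is a placeholder I made up (the real name is whatever the PEP ends up defining), and `record_entry` is just an illustrative helper.

```python
import base64
import hashlib


def record_entry(path, data):
    """Build one RECORD line: path, urlsafe-base64 sha256 (no padding), size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# METADATA as shipped in the wheel. "Hypothetical-Wheel-Field" is a
# placeholder, not a real core metadata field.
original = (
    b"Metadata-Version: 2.1\n"
    b"Name: demo\n"
    b"Version: 1.0\n"
    b"Hypothetical-Wheel-Field: 2.0\n"
)

# What an installer would write if it stripped that field at install time.
stripped = b"".join(
    line for line in original.splitlines(keepends=True)
    if not line.startswith(b"Hypothetical-Wheel-Field:")
)

path = "demo-1.0.dist-info/METADATA"
print(record_entry(path, original))
print(record_entry(path, stripped))
```

The two lines differ in both hash and size, so any installer that edits METADATA also has to rewrite its RECORD entry. Copying both files unchanged avoids that whole class of mistakes.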