I’d like to start a discussion around how Rocky will handle repository management, as it will have implications for both infrastructure and the end-user experience. A generic overview of existing setups:
CentOS
Two repository locations:
- Vault:
- Each minor release of CentOS is given its own directory in Vault.
- All source RPMs are found in their respective version directories (where the source repos point).
- When a new minor release of CentOS is released, binary RPMs for the prior release are migrated from Mirror into their respective Vault locations.
- Users can choose to pull from Vault repos by enabling them.
- Mirror:
- Each minor release has its own directory for binary RPMs.
- The simple major release (e.g. 6, 7, 8) is a symlink to the latest minor release.
- N-0.1 releases are archived to Vault after N is released and the major symlink changes.
- Only repos for N are accessible by package managers (as other releases have been evacuated).
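To make the Mirror/Vault hand-off concrete, here is a minimal Python sketch of how I understand it. The `release` helper, the `mirror`/`vault` directory names and the version strings are all hypothetical; this is not how the CentOS infrastructure is actually scripted, just a simulation of the layout described above.

```python
import os
import shutil


def release(root, major, new_minor):
    """Publish new_minor on the mirror and archive the previous minor to vault.

    Hypothetical layout: <root>/mirror/<minor>, <root>/vault/<minor>,
    and <root>/mirror/<major> as a symlink to the latest minor.
    Returns the minor release that was archived, or None.
    """
    mirror = os.path.join(root, "mirror")
    vault = os.path.join(root, "vault")
    os.makedirs(os.path.join(mirror, new_minor), exist_ok=True)
    os.makedirs(vault, exist_ok=True)

    link = os.path.join(mirror, major)
    previous = os.readlink(link) if os.path.islink(link) else None

    # Repoint the major symlink by creating a new link and renaming it
    # over the old one, so the major path never disappears.
    tmp = link + ".tmp"
    os.symlink(new_minor, tmp)
    os.replace(tmp, link)

    if previous and previous != new_minor:
        # Vault may already hold this minor's directory (source RPMs),
        # so merge into it rather than assuming it is absent.
        src = os.path.join(mirror, previous)
        shutil.copytree(src, os.path.join(vault, previous), dirs_exist_ok=True)
        shutil.rmtree(src)
    return previous
```

Calling `release(root, "8", "8.3")` and then `release(root, "8", "8.4")` leaves `mirror/8` pointing at `8.4` and moves `8.3` under `vault/`, which is why a package manager following the major symlink can no longer see the older release.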
Impacts of this structure:
- There is no package duplication, so this takes the least amount of disk space.
- From a consumer standpoint only “latest” is available; there is no backtracking once a new release is out without dipping into Vault.
- CentOS presently only ships N-0.x Vault repos by default, so it is not possible to lock oneself to a specific version without first upgrading the centos-release packages to N+0.1 or manually editing the Vault repo file.
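For anyone unfamiliar with the manual route, this is roughly what pinning to a minor release through Vault looks like, sketched in Python. The `vault_repo_stanza` helper is hypothetical, and the section naming and the `<version>/<repo>/$basearch/` path layout are my assumption based on CentOS 7 era Vault repo files (CentOS 8 lays its tree out differently), so check the real Vault tree before relying on it.

```python
import configparser
import io


def vault_repo_stanza(version, repo="os", base="http://vault.centos.org"):
    """Return a yum-style .repo stanza pinned to one minor release.

    The URL layout is an assumption, not taken from the post above.
    """
    # Interpolation is disabled so yum variables such as $basearch
    # pass through untouched.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[f"C{version}-{repo}"] = {
        "name": f"CentOS-{version} - {repo}",
        "baseurl": f"{base}/{version}/{repo}/$basearch/",
        "gpgcheck": "1",
        "enabled": "1",
    }
    out = io.StringIO()
    cfg.write(out, space_around_delimiters=False)
    return out.getvalue()


print(vault_repo_stanza("7.6.1810"))
```

The point is that the version has to be written into the repo definition by hand (or by a newer centos-release package), which is the friction described above.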
RHEL
Hard to know specifics without being a RH engineer, but from a user standpoint:
- By default a user is subscribed to the latest general major release for updates.
- The user can choose to lock themselves to a specific minor release stream through subscription-manager (Vault swapping analog).
- The entire history of RHEL packaging is accessible by default, unfiltered, in the main repos (no need for Vault).
Impacts of this structure:
- Quite possibly lots of package duplication (source+binary) on disk:
- Total glob of all packages
- Total glob of all minor packages for each release on their own
- A user has access to everything without needing to think about alternate repos.
As a consumer, I prefer the RHEL orientation as it is the simplest and most accessible. However, it is more complex on the infrastructure side of things. You are either storing every package twice (once for the standalone release and once for mainline), or you are involving some kind of per-package symlink structure that (I’m not 100% sure) wouldn’t carry over to mirrors, which would then have to bear the full weight of the duplication. The mirrors would also be particularly large. I’m not even sure if you can create a repo that has symlinks in it (I’ve never tried).
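To illustrate the trade-off, here is a small Python sketch of the link-based alternative: packages live once in a shared pool and each per-release tree only links to them. The `pool` directory, both helper functions and the release names are hypothetical, and this says nothing about how RHEL actually stores its content.

```python
import os
import stat


def link_into_release(pool_dir, release_dir, filenames, hard=False):
    """Expose pooled packages in a release tree without copying them."""
    os.makedirs(release_dir, exist_ok=True)
    for name in filenames:
        src = os.path.join(pool_dir, name)
        dst = os.path.join(release_dir, name)
        if hard:
            # Hard link: a second directory entry for the same inode.
            os.link(src, dst)
        else:
            # Relative symlink, so the tree can be moved as a whole.
            os.symlink(os.path.relpath(src, release_dir), dst)


def unique_bytes(root):
    """Sum regular-file sizes under root, counting each inode once.

    Symlinks are skipped, so this approximates the real disk cost.
    """
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                total += st.st_size
    return total
```

On the origin server either kind of link keeps the cost at one copy per package. Whether mirrors keep that saving depends on how they sync: rsync, for example, only preserves hard links when run with `-H`, and copies the link targets as full files when run with `-L`, so a mirror using the wrong options would end up storing the duplicates after all.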
Part of the reasoning for CentOS not doing this (providing the full package history by default) is that they only ever “support” the latest and greatest packages/version, as there is no actual CentOS support. I assume Rocky will inherit that playbook, but should the “cloning” of RHEL be extended to also include its repository experience?