Repository Management/Structure

I’d like to start a discussion around how Rocky will handle repository management, as it will have implications for both infrastructure and the end-user experience. A generic overview of existing setups:


CentOS

Two repository locations:

  • Vault:
    • Each minor release of CentOS is given its own directory in Vault.
    • All source RPMs are found in their respective version directories (where the source repos point).
    • When a new minor release of CentOS is released, binary RPMs for the prior release are migrated from Mirror into their respective Vault locations.
    • Users can choose to pull from Vault repos by enabling them.
  • Mirror:
    • Each minor release has its own directory for binary RPMs.
    • The simple major release (e.g. 6, 7, 8) is a symlink to the latest minor release.
    • N-0.1 releases are archived to Vault after N is released and the major symlink changes.
    • Only repos for N are accessible by package managers (as other releases have been evacuated).
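
The Mirror-to-Vault rotation described above can be sketched on a local filesystem. This is a minimal illustration, not CentOS’s actual tooling: the `release_minor` function and the `mirror/` and `vault/` directory names are hypothetical, and only binary trees are modeled (source RPMs are left out).

```python
import os
import shutil
import tempfile
from pathlib import Path


def release_minor(root: Path, major: str, new_minor: str) -> None:
    """Publish new_minor on the mirror and archive the previous minor to the vault."""
    mirror = root / "mirror"
    vault = root / "vault"
    mirror.mkdir(parents=True, exist_ok=True)
    vault.mkdir(parents=True, exist_ok=True)

    link = mirror / major
    # Remember which minor release the major symlink pointed at, if any.
    previous = os.readlink(link) if link.is_symlink() else None

    # Publish the new minor release directory and repoint the major symlink.
    (mirror / new_minor).mkdir()
    if link.is_symlink():
        link.unlink()
    link.symlink_to(new_minor, target_is_directory=True)

    # Move the previous minor release off the mirror and into the vault.
    if previous is not None:
        shutil.move(str(mirror / previous), str(vault / previous))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        release_minor(root, "8", "8.3")
        release_minor(root, "8", "8.4")
        print(sorted(p.name for p in (root / "mirror").iterdir()))  # ['8', '8.4']
        print(sorted(p.name for p in (root / "vault").iterdir()))   # ['8.3']
```

After the second call, only the latest minor release and the major symlink remain on the mirror, which is why package managers pointed at the mirror cannot see 8.3 any more.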

Impacts of this structure:

  • There is no package duplication, so this takes the least amount of disk space.
  • From a consumer standpoint, only “latest” is available; there is no backtracking once a new release is out without dipping into Vault.
  • CentOS presently only ships N-0.x Vault repos by default, so it is not possible to lock oneself to a specific version without first upgrading the centos-release packages to N+0.1 or manually editing the Vault repo file.

RHEL

Hard to know specifics without being a RH engineer, but from a user standpoint:

  • By default a user is subscribed to the latest general major release for updates.
  • The user can choose to lock themselves to a specific minor release stream through subscription-manager (Vault swapping analog).
  • The entire history of RHEL packaging is accessible by default, unfiltered, in the main repos (no need for Vault).

Impacts of this structure:

  • Quite possibly lots of package duplication (source+binary) on disk:
    • Total glob of all packages
    • Total glob of all minor packages for each release on their own
  • A user has access to everything without needing to think about alternate repos.

As a consumer, I prefer the RHEL orientation, as it is the simplest and most accessible. However, it is more complex on the infrastructure side of things. You are either storing every package twice (once for the standalone release and once for mainline), or you are relying on some kind of per-package symlink structure that (I’m not 100% sure) wouldn’t carry over to mirrors, leaving them to bear the weight of the load. And the mirrors would be particularly large. I’m not even sure if you can create a repo that has symlinks in it (I’ve never tried).
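
To make the copy-versus-symlink trade-off concrete, here is a minimal filesystem sketch. It says nothing about how RHEL actually stores packages or about whether repository tooling accepts symlinks; the `pool/`, `8.3-copied/` and `8.3-linked/` directory names, the placeholder package file, and both helper functions are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stored_bytes(tree: Path) -> int:
    """Sum the sizes of regular files under tree, counting symlinks as zero."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def build_demo(root: Path) -> tuple:
    """Create one pooled package plus a copied and a symlinked release tree."""
    pool = root / "pool"
    copied = root / "8.3-copied"
    linked = root / "8.3-linked"
    for directory in (pool, copied, linked):
        directory.mkdir()
    pkg = pool / "example-1.0.rpm"  # placeholder bytes, not a real RPM
    pkg.write_bytes(b"\0" * 4096)
    # Option 1: a full second copy of the package in the release tree.
    shutil.copy2(pkg, copied / pkg.name)
    # Option 2: a relative symlink back into the shared pool.
    (linked / pkg.name).symlink_to(Path("..") / "pool" / pkg.name)
    return copied, linked


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        copied, linked = build_demo(Path(tmp))
        print(stored_bytes(copied))  # 4096
        print(stored_bytes(linked))  # 0
```

The symlinked tree stores essentially nothing extra locally, but that saving only reaches a mirror if its sync tool preserves links rather than copying their targets (rsync, for example, distinguishes `--links` from `--copy-links`).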

Part of the reasoning for CentOS not doing this (providing the full package history by default) is that they only ever “support” the latest and greatest packages/version, as there is no actual CentOS support. I assume Rocky will inherit that playbook, but should the “cloning” of RHEL be extended to also include its repository experience?


I personally don’t believe the repository experience of RHEL should be honored. It produces problems when someone comes asking for help on a forum or on IRC but is on a very, very old release that is riddled with security bugs (I cannot tell you how many times people came in with 6.1-7 boxes last year on the CentOS IRC). Among those supporting CentOS from a community standpoint, on forums and IRC, it is widely known that only the latest release is supported, because of the bugs and security issues it addresses, and that users who want something older must go to the Vault and/or get EUS from Red Hat.

However, I personally believe that we should still have a vault structure: for the lifetime of a minor release, all packages within that release stay in place until the next minor release arrives, and are then moved to the vault. For example, all packages produced during the 8.3 lifecycle will be there. When 8.4 drops, 8.3’s vault is updated and 8.4 starts from a fresh state. Yes, this means users may have to go back and find “older” stuff, but the point is to keep things secure and as bug free as possible for everyone.

This is something that we’ll talk about when we get there, but this is a good topic to note for the future!

For the purposes of what this distribution aims to be, I would tend to agree. I think one difference that could be made compared to the way CentOS handles things is to have the current minor release already in the vault repo, rather than only previous minor releases. That way, users who (for whatever reason) need to stay on a specific version, regardless of upstream/community sentiment, don’t have to do any self-modifying or selective package updates; they can just disable the normal repos and enable the Vault ones.

But, as you said, this is a bridge we’ll cross once we get there.

Actually CentOS has always done this. The current minor release is dropped into the vault at the same time as its release, so there is already an opportunity for users to switch to those vaults.


The point I was making was that if you browse the CentOS-Vault.repo, it doesn’t contain the Vault repo for the release you are presently on when a new release drops. I think it would be worth having it there from the get-go.

So the 8.3 release should have the 8.3 Vault repo available in the repo file (so it’s there when 8.4 drops), which is currently not the case.
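
One way to picture this is a generated repo file that already lists the current minor release next to the older ones, with every Vault repo disabled by default. This is only a sketch: the `vault_repo_file` function, the section names, and the `vault.example.org` base URL are all hypothetical, and a real file would also carry GPG settings.

```python
import configparser
import io


def vault_repo_file(minors, base_url="https://vault.example.org/rocky"):
    """Render a yum/dnf-style repo file with one disabled Vault repo per minor release."""
    parser = configparser.ConfigParser(interpolation=None)
    for minor in minors:
        parser[f"vault-{minor}-baseos"] = {
            "name": f"Rocky {minor} - BaseOS (Vault)",
            # $basearch is left literal for the package manager to expand.
            "baseurl": f"{base_url}/{minor}/BaseOS/$basearch/os/",
            # Disabled by default; users opt in to pin themselves to a release.
            "enabled": "0",
        }
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # The file shipped with 8.3 lists 8.3 itself, not only 8.0 through 8.2.
    print(vault_repo_file(["8.0", "8.1", "8.2", "8.3"]))
```

Because the 8.3 stanza ships with 8.3 itself, it is already on disk when 8.4 drops, so pinning only requires enabling it and disabling the normal repos.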

Side note: I’m doing some investigating and the Vault repos for 8-8.2 are completely empty (the file, not the mirror) and 8.3 outright removes the repo file. The behavior I’m describing above is true for CentOS 7 and its minor releases.

I understand now. I think it would be beneficial to have a repo file like that. That’s a good idea. They have clearly changed their approach on minor releases; you used to be able to get all of it pre-7.
